A tabu search algorithm for cluster building in wireless sensor networks



IEEE TRANSACTIONS ON MOBILE COMPUTING, VOL. 8, NO. 4, APRIL 2009

Abdelmorhit El Rhazi and Samuel Pierre, Senior Member, IEEE

Abstract: The main challenge in wireless sensor network deployment pertains to optimizing energy consumption when collecting data from sensor nodes. This paper proposes a new centralized clustering method for a data collection mechanism in wireless sensor networks, which is based on network energy maps and Quality-of-Service (QoS) requirements. The clustering problem is modeled as a hypergraph partitioning and its resolution is based on a tabu search heuristic. Our approach defines moves using largest size cliques in a feasibility cluster graph. Compared to other methods (CPLEX-based method, distributed method, simulated annealing-based method), the results show that our tabu search-based approach returns high-quality solutions in terms of cluster cost and execution time. As a result, this approach is suitable for handling network extensibility in a satisfactory manner.

Index Terms: Wireless sensor network, energy map, data collect, clustering methods, tabu search.

The authors are with the Mobile Computing and Networking Research Laboratory (LARIM), Department of Computer Engineering, Ecole Polytechnique de Montréal, C.P. 6079, Station succ. Centre-ville, Montréal, QC H3C 3A7, Canada. E-mail: {abdelmorhit.el-rhazi, samuel.pierre}@polymtl.ca.

Manuscript received 13 Aug. 2007; revised 9 Apr. 2008; accepted 20 Aug. 2008; published online 4 Sept. 2008. For information on obtaining reprints of this article, please send e-mail to: tmc@computer.org, and reference IEEECS Log Number TMC-2007-08-0243. Digital Object Identifier no. 10.1109/TMC.2008.125. 1536-1233/09/$25.00 © 2009 IEEE. Published by the IEEE CS, CASS, ComSoc, IES, SPS.

1 INTRODUCTION

Increasingly, several applications require the acquisition of data from the physical world in a reliable and automatic manner. This necessity implies the emergence of new kinds of networks, which are typically composed of low-capacity devices. Such devices, called sensors, make it possible to capture and measure specific elements from the physical world (e.g., temperature, pressure, humidity). Moreover, they run on small batteries with low energetic capacities. Consequently, their power consumption must be optimized in order to ensure increased lifetime for those devices. During data collection, two mechanisms are used to reduce energy consumption: message aggregation and filtering of redundant data. These mechanisms generally use clustering methods in order to coordinate aggregation and filtering.

Clustering methods belong to one of two categories: distributed and centralized. The centralized approach assumes the existence of a particular node that is cognizant of the information pertaining to the other network nodes. Then, the problem is modeled as a graph partitioning problem with particular constraints that render this problem NP-hard. The central node determines clusters by solving this partitioning problem. However, the major drawbacks of this category are linked to the additional costs engendered by communicating the network node information and the time required to solve an optimization problem. In the second category, the distributed method, each node executes a distributed clustering algorithm [7], [14], [15], [16]. The major drawback of this category is that nodes have limited knowledge pertaining to their neighborhood. Hence, clusters are not built in an optimal manner.

In [4], Ghiasi et al. propose centralized clustering for sensor networks. They model this problem as a k-means clustering problem, which is defined as follows [11]: given a set $P$ of $n$ data points in $d$-dimensional space $\mathbb{R}^d$ and an integer $k$, the problem consists of determining a set of $k$ points in $\mathbb{R}^d$, called centers, that minimizes the mean squared distance from each data point to its nearest center. Heinzelman et al. [7] propose a centralized version of Low Energy Adaptive Clustering Hierarchy (LEACH), their data collection protocol, in order to produce better clusters by dispersing cluster head nodes throughout the network. In this protocol, each node sends information regarding its current location and energy level to the sink node, which computes the nodes' mean energy level; nodes whose energy level is below this average cannot become cluster heads for the current round. Considering the remaining nodes as possible cluster heads, the sink node finds clusters using the simulated annealing algorithm [1]. This algorithm attempts to minimize the amount of energy required for noncluster head nodes to transmit their data to the cluster head, by minimizing the sum of squared distances between all noncluster head nodes and the closest cluster head.

The energy map, the component that holds information concerning the remaining energy available in all network areas, can be used to prolong the network's lifetime [7]. In their probabilistic model for energy consumption, Heinzelman et al. [7] claim that each sensor node can be modeled by a Markov chain. They provide an equation that can be used by each node to calculate its energy dissipation rate, $E^T$, for the next $T$ time steps. Along with the remaining energy, the value $E^T$ can be sent to the sink node for energy map building purposes.

This paper proposes a new centralized clustering mechanism equipped with energy maps and constrained by Quality-of-Service (QoS) requirements. Such a clustering
mechanism is used to collect data in sensor networks. The first original aspect of this investigation consists of adding these constraints to the clustering mechanism that helps the data collection algorithm in order to reduce energy consumption and provide applications with the information required without burdening them with unnecessary data. Centralized clustering is modeled as hypergraph partitioning. The novel method proposes the use of a tabu search heuristic to solve this problem. The existing centralized clustering methods cannot be used to solve this issue because our approach to modeling the problem assumes that the numbers of clusters and cluster heads are unknown before clusters are created, which constitutes another major original facet of this paper.

The remainder of this paper is organized as follows: Section 2 summarizes the data collection mechanism. Section 3 outlines the problem formulation. Section 4 describes the tabu search adaptation. Computational experiments and results are reported in Section 5. Section 6 concludes this paper and delineates some of the remaining challenges.

2 DATA COLLECTION MECHANISM

Generally, sensor networks contain a large quantity of nodes that collect measurements before sending them to the applications. If all nodes forwarded their measurements, the volume of data received by the applications would increase exponentially, rendering data processing a tedious task. A sensor system should thus contain mechanisms that allow the applications to express their requirements in terms of the required quality of data. Data aggregation and data filtering are two methods that reduce the quantity of data received by applications. The aim of those two methods is not only to minimize the energy consumption by decreasing the number of messages exchanged in the network but also to provide the applications with the needed data without needlessly overloading them with exorbitant quantities of messages.

The data aggregation mechanism allows for the gathering of several measures into one record whose size is less than that of the initial records. However, the result semantics must not contradict the initial record semantics. Moreover, it must not lose the meanings of the initial records. The data filtering mechanism makes it possible to ignore measurements considered redundant or irrelevant to the application needs. A sensor system provides the applications with the means to express the criteria used to determine measurement relevancy, e.g., an application could be concerned with temperatures that are 1) lower than a given value and 2) recorded within a delimited zone. The sensor system filters the network messages and forwards only those that respect the filter conditions.

Applications that use sensor networks are generally concerned with the node measurements within a certain period of time. Hence, the most important key indicators in sensor networks are the quality of the measurements and the network lifetime. An application designed to record the mean temperature in zones where the sensors are deployed could be associated with a set of requirements in terms of measurement frequencies (e.g., the sensor system must record a measurement every 15 minutes), in terms of measurement discrepancy thresholds (e.g., the sensor system must ignore data whose result is less than 10 percent of the previous value), and in terms of the sensor lifetime (e.g., measurements must be provided for one year).

In [3], we propose a novel data collection approach for sensor networks that uses energy maps and QoS requirements to reduce power consumption while increasing network coverage. The mechanism comprises two phases. During the first phase, the applications specify their QoS requirements regarding the data they require. They send their requests to a particular node S, called the collector node, which receives the application query and obtains results from other nodes before returning them to the applications. The collector node builds the clusters, optimally using the QoS requirements and the energy map information. During the second phase, the cluster heads must provide the collector node with combined measurements for each period. The cluster head is in charge of various activities: coordinating the data collection within its cluster, filtering redundant measurements, computing aggregate functions, and sending results to the collector node.

3 PROBLEM FORMULATION

The considered network contains a set $V$ of $m$ stationary nodes whose localizations are known. The communication model can be described as multihop, which means that certain nodes cannot send measurements directly to the collector node: they must rely on their neighbors' service. An application can specify the following QoS requirements:

  1. Data collection frequency, fq. The network provides results to the application every time the duration fq expires.
  2. A measurement uncertainty threshold, mut. If the difference between two simultaneous measurements from two different nodes in the same zone (fourth requirement) is inferior to mut, then one of them is considered redundant.
  3. A query duration, $T$. The network must run the query for a total time equal to $T$.
  4. A zone size step. The step value determines the zone length. Within a single zone, measurements are considered redundant. If an application requires more precision, it could decrease the step value or even omit it.

The goal of the clustering algorithm is to 1) split the network nodes into a set of clusters $G_i$ that satisfies the application requirements, 2) reduce energy consumption, and 3) prolong the network lifetime. Clusters are built according to the following criteria:

  • Maximize network coverage using the energy map;
  • Gather nodes likely to hold redundant measurements;
  • Gather nodes located within the same zone delimited by the application.

Based on those criteria, a cluster building problem (CBP) in the remainder of this paper consists of determining the set $G_i$ that fulfills the following conditions:
\[ \Big(\bigcup_i G_i = V\Big) \wedge \Big(\bigcap_i G_i = \emptyset\Big), \tag{1} \]
\[ \forall N_j, N_l \in G_i: \; |M_j(t) - M_l(t)| \le mut, \tag{2} \]
\[ \forall N_j, N_l \in G_i: \; d_{j,l} \le step, \tag{3} \]
\[ \exists N_j \in G_i: \; E_j^{T} \le E_j^{remaining}. \tag{4} \]

Here, $G_i$ consists of a cluster that contains a set of nodes and a particular node that represents the cluster head. $M_j(t)$ represents the reading of node $N_j$ during the time slot $t$; $d_{j,l}$ corresponds to the distance between nodes $N_j$ and $N_l$; $E_j^{T}$ is equal to the estimated energy dissipation of node $N_j$ during period $T$; $E_j^{remaining}$ is the energy remaining in node $N_j$ when the cluster building algorithm starts running.

The first condition makes it possible to structure nodes into disjoint sets. The second condition permits the gathering of nodes that are likely to record redundant measurements that will be filtered by the network nodes. This condition can be verified using a current node measurement or previously taken mean measurements. In the latter case, the sensor node must be able to store measurement means. The third condition compels the gathering of nodes located in zones whose lengths are determined by the step value supplied by the application. The fourth condition ensures that each cluster contains at least one node that guarantees the coverage of the entire zone during the query run time.

CBP is considered as a hypergraph partitioning problem. The network nodes are modeled on a hypergraph with a vertex set $V = \{1, \ldots, m\}$. The arcs belong to a set of clusters $G = \{G_1, \ldots, G_n\}$. Additionally, let us define

  • $c_i$: the cost of cluster $G_i$;
  • $d_{j,s}$: the distance between node $j$ and the collector node $s$;
  • $a_{i,j} = 1$ if cluster $G_j$ contains node $i$, 0 otherwise;
  • $e_j = 1$ if the energy level of node $j$ satisfies $E_j^{T} \le E_j^{remaining}$, 0 otherwise.

The binary decision variables are given as follows:

  • $x_j = 1$ if cluster $j$ is used for partitioning, 0 otherwise.

Hence, the CBP formulation can be expressed as

\[ \text{minimize} \; \sum_{j=1}^{n} c_j x_j \tag{5} \]

subject to

\[ \sum_{j=1}^{n} a_{i,j} x_j = 1, \quad \text{for } i = 1, \ldots, m, \tag{6} \]
\[ x_j \in \{0, 1\}, \quad \text{for } j = 1, \ldots, n, \tag{7} \]
\[ |M_b(t) - M_c(t)| \le mut, \quad \forall b, c \in G_j, \; \text{for } j = 1, \ldots, n, \tag{8} \]
\[ d_{b,c} \le step, \quad \forall b, c \in G_j, \; \text{for } j = 1, \ldots, n, \tag{9} \]
\[ \sum_{i=1}^{m} a_{i,j} e_i \ge 1, \quad \text{for } j = 1, \ldots, n. \tag{10} \]

Equation (6) ensures that each node is included in a single cluster; this is called a partitioning constraint. Equations (8), (9), and (10) represent the cluster building criteria (2), (3), and (4), respectively.

The objective of a cluster building phase is to minimize energy dissipation when collecting node data. The cost $c_j$ of a cluster $G_j$ should reflect this objective. The cost should be composed of two major terms. The first one represents the energy consumption due to the cluster head duties. Indeed, the cluster head is responsible for data collection and aggregation, as well as the transmission of the measures of its cluster. The second term represents the energy gained due to the fact that messages of other cluster nodes will be filtered by the data collection mechanism.

Generally, the energy consumed by a node for a single cycle is expressed as follows [8]:

\[ E_{cycle} = E_D + E_S + E_T + E_R, \tag{11} \]

where $E_D$, $E_S$, $E_T$, and $E_R$ represent the energy required for data processing, sensing, transmitting, and receiving per cycle time, respectively. The quantity of energy spent for each operation depends on the network and the event model. Roughly, (11) is approximated by its preponderant term, $E_T$ [9]. When sending 1 bit from node $u$ to $v$, the energy consumed is expressed as follows [2]:

\[ E_{uv} = E_{txelec} + \varepsilon_{amp}\, d_{uv}^{\gamma}, \quad \gamma \ge 2.0. \tag{12} \]

Here, the factor $\gamma$ indicates the path loss exponent and relies on the communication channel as well as environmental conditions. $E_{txelec}$ is the energy dissipated by the transmitter electronics, and $\varepsilon_{amp}$ denotes a constant parameter that characterizes the transmitter amplification.

The communication model considered is a multihop model. Consequently, to express the energy consumed by communicating the measurements of node $i$ to sink node $s$, the equation must consider the communication links between each pair of nodes found on the path between $i$ and $s$. However, such a complex equation would be difficult to compute as it depends on the routing protocol. For this reason, only the distance that separates nodes $i$ and $s$ is considered. On the one hand, this simplification is justified by the fact that, in a dense network, routing protocols can find a path between $i$ and $s$ that is close to the line joining $i$ and $s$. On the other hand, this simplification is also valid in single-hop communication models.

Consequently, the cost $c_j$ of cluster $G_j$ is expressed as follows:

\[ c_j = \alpha\, d_{j,s} - \beta \sum_{i=1}^{n_j} d_{i,s}, \tag{13} \]

where $n_j$ indicates the number of nodes in cluster $G_j$ that are not currently the cluster head, $d_{j,s}$ represents the distance between the head of cluster $G_j$ and the sink node, and $\alpha$ and $\beta$ are two positive coefficients. The first term of (13) represents the estimated energy consumed by the cluster head to communicate the collective measurements. The second term represents the total energy saved by the nodes of cluster $G_j$ by filtering their messages.

The problem thus consists of minimizing $\sum_{j=1}^{n} \big(\alpha\, d_{j,s} - \beta \sum_{i=1}^{n_j} d_{i,s}\big) x_j$ given (6)-(10). When formulated that way, the problem is considered NP-hard since it is modeled as a partitioning problem with additional constraints and it is known that the partitioning problem is NP-hard [6]. Consequently, CBP cannot be resolved by a polynomial
algorithm. This explains why a tabu search heuristic was adopted in order to find the best solution.

4 A TABU SEARCH APPROACH

In order to facilitate the usage of tabu search for CBP, a new graph called $G_r$ is defined. It is capable of determining feasible clusters. A feasible cluster consists of a set of nodes that fulfill the cluster building constraints (8), (9), and (10). Nodes that satisfy Constraint (10), i.e., ensure zone coverage, are called active nodes. The vertices of $G_r$ represent the network nodes. An edge $(i, j)$ is defined in graph $G_r$ between nodes $i$ and $j$ if they satisfy Constraints (8) and (9). Consequently, it is clear that a clique in $G_r$ embodies a feasible cluster. A clique consists of a set of nodes that are adjacent to one another.

Five steps should be conducted in order to adapt tabu search heuristics to solve a particular problem:

  1. design an algorithm that returns an initial solution,
  2. define moves $m(\cdot)$ that determine the neighborhood $N(s)$ of a solution $s$,
  3. determine the content and size of tabu lists,
  4. define the aspiration criteria,
  5. design intensification and diversification mechanisms.

The algorithm ends when one of the following three conditions occurs:

  1. All possible moves are prohibited by the tabu lists;
  2. The maximal number of iterations allowed has been reached;
  3. The maximal number of successive iterations in which the best solution is not enhanced has been reached.

4.1 Initial Solution

The goal is to find an appropriate initial solution for the problem, in order to get the best solution from tabu search iterations within a reasonable delay. The algorithm depicted in Fig. 1 is proposed. It starts by sorting active nodes in decreasing order of their degree in graph $G_r$. For each iteration, the first active node $i$ not yet covered by the initial solution $F_0$ is selected. The algorithm determines the largest size clique that contains the selected active node $i$ with its adjacent nodes in graph $G_r$ that have yet to be covered by $F_0$. This clique is considered a new cluster and node $i$ becomes the cluster head. The algorithm does not ensure that all nonactive nodes are assigned to a cluster. Consequently, if node $i$ is not covered by any cluster when the algorithm ends, it is assigned to a cluster whose head is adjacent to node $i$. However, this means that an initial solution may be infeasible, i.e., the nodes of at least one cluster may not form a clique in the graph $G_r$. A penalty equation to evaluate a solution is proposed in the following sections.

Fig. 1. Initial solution algorithm.

4.2 The Neighborhood $N(s)$ Definition

The definition of the neighborhood $N(s)$ of a solution $s$ is a crucial step as it determines the final quality of the solution and has a direct impact on the execution time. Two types of moves are distinguished: the first move involves an ordinary node, i.e., a nonactive node, and the second move involves an active node. This is due to the fact that an active node could be a cluster head and thus build a new cluster. Furthermore, a third move, which involves a cluster head and allows removing an existing cluster from a solution, is also considered.

  1. A Move Involving a Regular Node. Let $s$ represent the solution analyzed for each iteration. Solution $s$ consists of a set of clusters. Let $a$ be a regular node. The first type of move $m(s, a)$ is defined as follows: assume that node $a$ is assigned to cluster $G_i$ in $s$. Move $m(s, a)$ assigns node $a$ to another cluster $G_j$ ($G_i \ne G_j$) whose head is adjacent to node $a$ and removes it from cluster $G_i$.
  2. A Move Involving an Active Node. This second type of move relies on the fact that an active node could be a cluster head. Let $a$ represent an active node. Move $m(s, a)$ consists of
     a. Reassigning node $a$ to a cluster whose head is adjacent to $a$. This move is similar to those of the first type, since an active node could be included in a cluster without becoming its head.
     b. Selecting node $a$ to become the head of the cluster to which it is assigned. The previous cluster head will still be assigned to this cluster, although it is no longer the cluster head. Consequently, the cost of this cluster will be affected as it relies on the head coordinates.
  3. A Move Involving a Cluster Head. The third type of move involves the head of an existing cluster. Let $a$ be the head of a cluster $G_i$ that is empty, i.e., $G_i$ contains only its head $a$. Move $m(s, a)$ consists of removing cluster $G_i$ from solution $s$ and assigning node $a$ to a cluster whose head is adjacent to $a$.

These three types of moves engender a variety of solutions by producing several combinations of clusters and cluster heads. Also, they create solutions whose sizes vary. However, they can produce clusters that do not necessarily consist of a clique since node $a$ is reassigned without verifying whether the resulting cluster is a clique or not. A cluster penalty is defined and added to the cluster cost in order to compare $N(s)$ solutions. Node $a$
penalty $P_a$ assigned to cluster $G_i$ represents the number of nodes in $G_i$ that are not adjacent to node $a$. Consequently, function $f'$ makes it possible to compare the elements of neighborhood $N(s)$. It is expressed as follows:

\[ f'(m(s, a)) = \sum_{j=1}^{|s|} c_j x_j + \lambda P_a. \tag{14} \]

In this formula, $\lambda$, called the penalty coefficient, is positive. More details will be provided further in this paper. $|s|$ represents the number of clusters in solution $s$. Function $f'(m(s, a))$ contains two parts: the first is equal to the total of cluster costs and the second term represents the penalty caused by the move. For each iteration, this function is used to compare $N(s)$ solutions. To perform this evaluation quickly, it suffices to calculate the gain of a solution in $N(s)$ compared to solution $s$ without recalculating the value of $f'$ for each iteration. This is possible since move $m(s, a)$ affects only a maximum of two clusters. Hence, we define the gain $Gain(m(s, a))$ associated with move $m(s, a)$ as follows:

  1. If move $m(s, a)$ consists of reassigning node $a$ from cluster $G_i$ to cluster $G_j$, the gain is

     \[ Gain(m(s, a)) = \lambda\, (P_{aj} - P_{ai}). \tag{15} \]

     $P_{aj}$ represents the penalty for assigning node $a$ to cluster $G_j$. The cluster costs disappear from (15) as they cancel each other out.
  2. If move $m(s, i)$ consists of selecting node $i$ as the head of the cluster $G_j$ to which it is assigned, the gain is expressed as follows:

     \[ Gain(m(s, i)) = (\alpha\, d_{i,s} - \beta\, d_{t,s}) - (\alpha\, d_{t,s} - \beta\, d_{i,s}). \tag{16} \]

     Node $t$ denotes the current head of cluster $G_j$. The result follows from (13) and (14). The penalty values are identical as node $i$ does not change clusters.

4.3 Tabu List and Aspiration Criteria

Occasionally, tabu search methods accept solutions that do not improve the objective function, in the hope of reaching improved solutions in the future. However, accepting solutions that are not necessarily optimal introduces a cycle risk, i.e., a return to previously considered solutions, hence the idea of keeping a tabu list to keep track of the solutions that have been considered in the past. Thus, when generating the neighborhood candidates, the solutions that appear in the tabu list are removed. Our adaptation proposes two tabu lists: a reassignment list and a reelection list. The first tabu list prevents cycles that can be generated by the reassigning of a node to the same cluster. After each move $m(s, a)$ that consists of reassigning node $a$ to cluster $G_i$, the pair ($a$, head of $G_i$) is added to this tabu list. The second tabu list prevents the reelection of an active node in the same cluster. After a move $m(s, a)$ consisting of electing node $a$ in cluster $G_i$, two pairs of nodes are added to the reelection list: the first pair ($a$, head of $G_i$) prohibits the move $m(s, a)$ and the second pair (head of $G_i$, $a$) prevents the reverse move. The reassignment tabu list is initialized with the pairs that represent the initial solution before starting the iterations. Such a strategy prevents the return to the initial solution.

Using a tabu list could drastically restrict the neighborhood $N(s)$. Moreover, it could miss certain attractive solutions. Consequently, tabu search methods allow the violation of tabu list rules through the definition of an aspiration criterion. Our proposal opts for the most used aspiration criterion, which consists of accepting a move recorded in the tabu list if it engenders a solution that is superior to the best solution found so far.

4.4 Diversification and Intensification

Diversification and intensification are two mechanisms that make it possible to improve tabu search methods. They start by analyzing the appropriate solutions visited and obtaining their common properties in order to be able to intensify the search in another neighborhood or to diversify the searches. In tabu searches, this mechanism is called long-term memory. Our proposal uses a technique called the shifting penalty tactic, which is an instance of a procedure called strategic oscillation, representing one of the basic diversification approaches for tabu searches [5]. The approach consists of directing the search toward and away from selected boundaries of feasibility, either by manipulating the objective function (e.g., with penalties or incentives, as the case may be) or simply by compelling the choice of moves that lead the search in specified directions. In this particular case, the penalty coefficient $\lambda$ is determined dynamically, according to the violation of the clique constraint. Thus, if the solution selected at the end of an iteration contains a cluster with nonnull penalties, i.e., the cluster does not consist of a clique, the penalty coefficient value is increased; otherwise, it is decreased.

5 COMPUTATIONAL EXPERIENCE AND RESULTS

In order to evaluate the performance of these novel algorithms, they were implemented with the use of C++ and the Boost Graph Library (BGL) [10] and tested with sensor networks of different sizes and topologies. On the one hand, several experiments were conducted to evaluate the impact of the tabu search method parameters. On the other hand, another approach was devised to solve the partitioning problem based on CPLEX [18] in order to compare the quality of the solutions found by tabu search. This new method had to be designed and implemented as the existing approaches for clustering problems cannot be used in this case, as explained at the beginning of this paper. The algorithms ran on a Linux server (Red Hat 3.4.4-2) with a 1.80-GHz Intel Pentium 4 processor and 1 Gbyte of memory.

5.1 Analysis of the Impact of Tabu Search Parameters

The size of the tabu list has a direct impact on the quality of the solution. Hence, it is important to analyze its impact, in order to adjust its value accordingly. Results reported in [5] show that determining the tabu list size dynamically is more efficient than fixing its value during the iterations. This experiment involves a sensor network composed of
100 nodes. A square topology is used, i.e., nodes are placed on the vertices of squares that cover the entire network area. To facilitate the analysis, it is assumed that all nodes are active. Only the size of the tabu list varies, and the values of all other parameters are set, e.g., the maximum number of iterations allowed is set to 1,000. Fig. 2 illustrates the results of the investigation.

Fig. 2. Impact of the tabu list size on the solution cost.

Results corroborate those described in [5]. Indeed, the best results are obtained using tabu lists whose size is similar to the number of nodes, i.e., between 75 and 120. Large tabu list sizes hinder the quality of the solution as they reduce the number of elements visited by prohibiting additional moves. Moreover, low values also hinder quality because they allow cycles. Thus, we determine that it is best to set the size of the tabu list dynamically within the interval $[0.75N, 1.1N]$ ($N$ indicates the number of nodes).

A second experiment was conducted in order to quantify the impact of the parameter "maximum number of allowed iterations" on the quality of the solution found and determine the limits of the solution enhancement. A sensor network that comprises 1,000 nodes localized in a square topology is considered. The maximum number of iterations is modified and the other parameters are set, except for the "maximum number of iterations where the best solution is not enhanced" parameter, which takes a value that allows the algorithm to reach the maximum number of iterations. Fig. 3 shows the impact of this parameter on the percentage of the improvement of the initial solution cost.

Fig. 3. Impact of the maximum number of iterations on the solution cost.
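
The two parameters studied in these experiments, the dynamic tabu list size and the iteration limits, can be sketched in a few lines. The following is a minimal Python illustration under stated assumptions, not the authors' C++/BGL implementation: the names `tenure_bounds`, `pick_tenure`, `TabuList`, and `should_stop` are hypothetical, and drawing the list size uniformly at random from the interval is an assumption, since the text does not say how the size is varied.

```python
import random
from collections import deque


def tenure_bounds(n_nodes):
    """Integer tabu-list sizes inside [0.75 N, 1.1 N], using exact integer math."""
    lower = -(-3 * n_nodes // 4)   # ceil(0.75 * N)
    upper = 11 * n_nodes // 10     # floor(1.1 * N)
    return lower, upper


def pick_tenure(n_nodes, rng):
    """Assumed policy: draw the list size uniformly from the interval."""
    lower, upper = tenure_bounds(n_nodes)
    return rng.randint(lower, upper)


class TabuList:
    """FIFO list of (node, cluster head) pairs whose capacity can change."""

    def __init__(self, size):
        self.size = size
        self.pairs = deque()

    def add(self, pair):
        self.pairs.append(pair)
        self._trim()

    def resize(self, size):
        self.size = size
        self._trim()

    def _trim(self):
        # Forget the oldest pairs once the list exceeds its capacity.
        while len(self.pairs) > self.size:
            self.pairs.popleft()

    def __contains__(self, pair):
        return pair in self.pairs


def should_stop(iteration, stalled, max_iterations, max_stalled, has_allowed_move):
    """Stop when no move is allowed or either iteration limit is reached."""
    return (not has_allowed_move
            or iteration >= max_iterations
            or stalled >= max_stalled)


if __name__ == "__main__":
    rng = random.Random(0)
    tabu = TabuList(pick_tenure(100, rng))
    tabu.add((4, 17))  # node 4 was just reassigned to the cluster headed by node 17
    print((4, 17) in tabu, tenure_bounds(100))
```

Setting `max_stalled` equal to `max_iterations` mimics the second experiment, in which the no-improvement limit is chosen so that the run always reaches the maximum number of iterations.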
Results show that increasing the maximum number of iterations improves the solution cost; hence, higher quality solutions are obtained. This holds until the parameter reaches a value beyond which solutions have almost identical cost in spite of the increasing number of iterations. Indeed, between 0 and 5,000 iterations, the solution quality is enhanced by a factor of 2.3. However, between 5,000 and 10,000 iterations, the enhancement factor is only 1.35. This is mainly due to the influence of the other parameters and of the initial solution algorithm, which already yields a satisfactory solution.

Fig. 4. Impact of the maximum number of iterations on the execution time.

The analyses regarding the impact of the maximum number of iterations on the execution time and the quantity of moves yield the same results. Fig. 4 shows such an impact. Within the range of 800 to 5,000 iterations, the execution time increases 10-fold and the quality of the solution is enhanced efficiently, i.e., solution costs are reduced by 40 percent. Thus, we conclude that the quality of the solution is enhanced by increasing the maximum number of iterations until a value is reached where the execution time increases without enhancing the solution efficiently. This value, which we will be using in the following experiments, is equal to approximately eight times N.

5.2 Comparing Tabu Search Approach with CPLEX-Based Method

As stated previously, none of the analyzed clustering mechanisms can be compared with our approach. Consequently, we devised and implemented a second method to solve the CBP clustering problem. The objective is to compare the quality of its solutions with that of the first approach. This new method is based on CPLEX. CPLEX could not be used to directly solve the partitioning problem, since the number of feasible clusters is quite significant and CBP contains saturated constraints. For these reasons, this new method finds an optimal covering of the hypergraph and uses this covering to find the best partitioning of the hypergraph. Indeed, the method contains three steps. In the first step, a set of feasible clusters is defined, using the graph Gr. The second step, called the optimization phase, returns an optimal covering of the hypergraph using CPLEX. The last step, called the postoptimization phase, uses the results of the previous step to find a hypergraph partitioning. The experiments run on the server described in the previous section, using CPLEX 10.1, and consider square topology sensor networks in which all nodes are active and QoS parameters are fixed (i.e., the values of step and mut). The number of nodes N varies and the maximum number of iterations is set at 8 × N.

Fig. 5 illustrates the solution costs when varying the number of nodes for both methods (tabu search and CPLEX). It reveals that the solutions found by these two methods are rather similar except for cases where the number of nodes exceeds 700, at which point the tabu search generates better solutions. This is due to the use of the postoptimization phase in the second method, which does not necessarily return an optimal solution. Also, the first step limits the considered solution sets.

Fig. 6 illustrates the execution time as the number of nodes varies for both methods. It also reveals that the execution times of both methods are highly similar for networks of fewer than 700 nodes. However, from this size onward, the execution time of the method based on CPLEX increases significantly, whereas the execution time of the tabu search increases reasonably. This is mainly due to the fact that the simplex method, used by CPLEX in this case, has exponential worst-case complexity.

Consequently, the results of this experiment show that, on the one hand, the solutions returned by our method based on tabu search are of high quality in terms of cluster costs and execution time. On the other hand, this approach behaves well with network extensibility, i.e., the execution time increases in a satisfactory manner as the network size grows.

The aforementioned experiments were conducted using a square topology. In order to validate our results with other topologies, the same study was conducted using a
random topology where nodes are randomly placed over a 1 km² surface. The same parameter values as in the square topology are used, e.g., the same values of step and mut, and the maximum number of iterations allowed is set at 8 × N. Node locations are distributed in a uniform manner. Random real numbers are generated using the Boost library [17]. Node density varies from 20 to 900 nodes/km².

Fig. 5. Comparing the solution costs (tabu search and CPLEX): Square topology.

Fig. 6. Comparing execution time (tabu search and CPLEX): Square topology.

Fig. 7 shows solution costs when varying the node density for the method based on tabu search and the method based on CPLEX. Results are similar to those obtained with the square topology. Indeed, the solution costs for both methods are almost identical. Analyzing the execution times of both methods leads to the same conclusion. Fig. 8 illustrates the execution time as the node density varies for both methods. Results show that the execution time increases slightly for the tabu search-based method, contrary to the CPLEX-based method, where the execution time increases drastically. This is explained by the same reasons as for the square topology.

Such investigations allow for the validation of the results obtained using square topologies. On the one hand, the solutions returned by the method based on tabu search provide high quality in terms of cluster cost and execution time. On the other hand, the method based on tabu search behaves well with network extensibility. This applies to networks with either square or random topologies.

5.3 Comparing Centralized Approach, Distributed Approach, and TAG

We devised the centralized approach for the CBP because we noticed from our previous experiments described in [3] that the distributed approach does not ensure an optimal solution. We conducted other experiments in order to evaluate the difference between the performances of the centralized and distributed approaches.

Fig. 9 depicts solution costs when varying the network size for the centralized and distributed approaches. Results
show that the costs of the clusters built using the centralized approach are indeed better than those of the clusters built using the distributed approach. Also, the difference between the two costs grows as the network size increases. For example, the centralized approach improves the quality of the clusters by 500 percent in a network with 100 nodes. Consequently, the clusters built by the centralized approach are better than those created by the distributed approach. The reason for this result is that in the distributed approach the nodes have information limited to their own neighborhood, whereas the centralized approach finds the best clusters among all possible combinations.

Fig. 7. Comparing solution costs (tabu search and CPLEX): Random topology.

Fig. 8. Comparing execution time (tabu search and CPLEX): Random topology.

To build the clusters, the two approaches need to exchange messages between the network nodes. Indeed, on the one hand, in the distributed approach the nodes have to send the messages that allow creating the clusters (e.g., informing the other nodes with regard to the QoS required by the applications); on the other hand, the nodes in the centralized approach have to send their information to a central node that collects all of this information and runs the algorithm to build the clusters. The energy consumed by sending and receiving these messages should not be neglected. Consequently, we conducted experiments to measure the energy consumed by the two approaches in order to build the clusters.

In the centralized approach, the central node needs the following information in order to run the cluster building algorithm: 1) the coordinates of each node in order to build the graph Gr and calculate the cluster costs, 2) the measurement of each node in order to build the graph Gr, and 3) the active flag of each node (i.e., the flag has to indicate whether the node is able to cover its zone or not).

We conducted several simulations using OMNET++ [19]. We use the multihop network communication model. Fig. 10 shows the energy consumed to build the clusters by the centralized and distributed approaches. Results show that the distributed approach needs less energy
consumption than the centralized approach and the gap between these energies grows as the network size increases. The reason behind this result is that the central node needs to generate a considerable number of messages in order to collect all the node information. We conclude that the central approach is less efficient than the distributed approach in the cluster building phase.

Fig. 9. Comparing solution cost (distributed and centralized approaches).

Fig. 10. Comparing the energy consumption to build the clusters (centralized and distributed approaches).

We conducted other simulations in order to compare the total consumed energy (i.e., the energy consumed by the building algorithms and the energy consumed during the data collection phase) for the central and distributed approaches. The results of our approaches are compared to those generated by an existing data collection algorithm, called TAG [13]. This algorithm was chosen due to its importance in sensor networks. It is actually used for data collection in TinyOS, which is the most common operating system in sensor networks. Fig. 11 illustrates the total energy consumption as network sizes vary. For the three algorithms, energy consumption increases proportionally to the number of nodes, a natural phenomenon. However, the slope of the curve that represents our approaches is much less steep than the one associated with the TAG algorithm, which means that the increase in the number of nodes has less impact on our approach as compared to TAG. This is due to the fact that our system can filter data sensed by nodes within the same zone much better than TAG. The second conclusion is that, when the data collection phase is considered, the performance of the central approach is better than that of the distributed approach due to the fact that cluster building is more efficient in the central approach.

5.4 Comparing Tabu Search-Based to Simulated Annealing-Based Approaches

Simulated annealing is a probabilistic algorithmic approach to solve optimization problems. Kirkpatrick et al. [12] use it to solve combinatorial optimization problems. For a given optimization problem, simulated annealing allows accepting solutions that degrade cost, even if such accepted solutions are ignored later when they fail to improve the best solution. Simulated annealing decides whether to reject or accept a solution that degrades costs
randomly. The same query and topology were used to compare the results obtained by tabu search and by simulated annealing. Fig. 12 presents a comparison with simulated annealing. Generally, the tabu search algorithm performs better than the simulated annealing algorithm.

Fig. 11. Comparing the energy consumption (centralized, distributed, and TAG approaches).

Fig. 12. Comparing the solution cost (tabu search and simulated annealing).

6 CONCLUSION

This paper has presented a heuristic approach based on a tabu search to solve clustering problems where the numbers of clusters and cluster heads are unknown beforehand. To our knowledge, this is the first time that the clustering problem is modeled and resolved with these constraints. The tabu search adaptation consists of defining three types of moves that allow reassigning nodes to clusters, selecting cluster heads, and removing existing clusters. Such moves use the largest size clique in a feasibility cluster graph, which facilitates the analysis of several solutions and makes it possible to compare them using a gain function.

The performance of this novel approach was evaluated with different network sizes and topologies. The performance is compared to that obtained by a second resolution method based on CPLEX, a third approach based on a simulated annealing heuristic, and an existing algorithm (TAG). Finally, results show that a tabu search-based resolution method provides quality solutions in terms of cluster cost and execution time. Furthermore, it behaves well with network extensibility. Nevertheless, compared to a distributed approach, this centralized approach suffers from a major drawback linked to the additional costs generated by communicating the network node information and the time required to solve an optimization problem.
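The drawback noted above is paid once, when the clusters are built, whereas Section 5.3 reports that data collection is cheaper over centrally built clusters. A break-even calculation makes this trade-off concrete. The sketch below rests on an assumption that is not in the paper: total energy is modeled as a one-time building cost plus a constant cost per data collection round. The function names and all numeric values are hypothetical placeholders, not measurements from the experiments.

```python
import math


def total_energy(build_cost, cost_per_round, rounds):
    """Assumed linear model: one-time building cost plus a constant per-round cost."""
    return build_cost + cost_per_round * rounds


def break_even_rounds(build_central, round_central, build_distributed, round_distributed):
    """Smallest number of collection rounds at which the centralized total
    is no larger than the distributed total, or None if that never happens."""
    extra_build = build_central - build_distributed
    saving_per_round = round_distributed - round_central
    if extra_build <= 0:
        return 0      # centralized is never behind
    if saving_per_round <= 0:
        return None   # centralized never recovers its extra building cost
    return math.ceil(extra_build / saving_per_round)


if __name__ == "__main__":
    # Hypothetical placeholder values in arbitrary energy units.
    rounds = break_even_rounds(50.0, 1.0, 20.0, 1.5)
    print(rounds, total_energy(50.0, 1.0, rounds), total_energy(20.0, 1.5, rounds))
```

Under this simplified model, the centralized approach pays off only when the data collection phase lasts at least as many rounds as the break-even value.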
We conducted several experiments to compare the performance of our central approach with that of a distributed approach, and we conclude that the central approach is less efficient than the distributed approach in the cluster building phase. Nevertheless, the central approach is more efficient in the data collection phase. Consequently, the central approach is more efficient in a case where the data collection phase is long. Otherwise, the distributed approach should be used to run the queries with a short execution time.

REFERENCES

[1] P.K. Agarwal and C.M. Procopiuc, “Exact and Approximation Algorithms for Clustering,” Algorithmica, vol. 33, no. 2, pp. 201-226, June 2002.
[2] P. Basu and J. Redi, “Effect of Overhearing Transmissions on Energy Efficiency in Dense Sensor Networks,” Proc. Third Int’l Symp. Information Processing in Sensor Networks (IPSN ’04), pp. 196-204, Apr. 2004.
[3] A. El Rhazi and S. Pierre, “A Data Collection Algorithm Using Energy Maps in Sensor Networks,” Proc. Third IEEE Int’l Conf. Wireless and Mobile Computing, Networking, and Comm. (WiMob ’07), 2007.
[4] S. Ghiasi, A. Srivastava, X. Yang, and M. Sarrafzadeh, “Optimal Energy Aware Clustering in Sensor Networks,” Sensors, pp. 258-269, 2002.
[5] F. Glover, E. Taillard, and D. Werra, “A User’s Guide to Tabu Search,” Annals of Operations Research, vol. 41, no. 14, pp. 3-28, May 1993.
[6] M. Gondran and M. Minoux, Graphes et Algorithmes, second ed. Editions Eyrolles, 1985.
[7] W. Heinzelman, A. Chandrakasan, and H. Balakrishnan, “An Application Specific Protocol Architecture for Wireless Microsensor Networks,” IEEE Trans. Wireless Comm., vol. 1, no. 4, pp. 660-670, Oct. 2002.
[8] J.J. Lee, B. Krishnamachari, and C.C.J. Kuo, “Impact of Heterogeneous Deployment on Lifetime Sensing Coverage in Sensor Networks,” Proc. IEEE Sensor and Ad Hoc Comm. and Networks Conf. (SECON ’04), pp. 367-376, 2004.
[9] W. Liang and Y. Liu, “Online Data Gathering for Maximizing Network Lifetime in Sensor Networks,” IEEE Trans. Mobile Computing, vol. 6, no. 1, pp. 2-11, Jan. 2007.
[10] S. Jeremy, The Boost Graph Library: User Guide and Reference Manual. Addison-Wesley, 2002.
[11] T. Kanugo, D.M. Mount, N.S. Netanyahu, C.D. Piatko, R. Silverman, and A.Y. Wu, “A Local Search Approximation Algorithm for k-Means Clustering,” Proc. 18th Ann. ACM Symp. Computational Geometry (SoCG ’02), pp. 10-18, 2002.
[12] S. Kirkpatrick, C.C. Gelatt Jr., and M.P. Vecchi, “Optimization by Simulated Annealing,” Science, vol. 220, pp. 671-680, 1983.
[13] S.R. Madden, M.J. Franklin, J.M. Hellerstein, and W. Hong, “TAG: Tiny Aggregation Service for Ad-Hoc Sensor Networks,” Proc. Fifth Symp. Operating Systems Design and Implementation (OSDI ’02), pp. 131-146, 2002.
[14] O. Moussaoui, A. Ksentini, M. Naimi, and M. Gueroui, “A Novel Clustering Algorithm for Efficient Energy Saving in Wireless Sensor Networks,” Proc. Seventh Int’l Symp. Computer Networks (ISCN ’06), pp. 66-72, 2006.
[15] S. Raghuwanshi and A. Mishra, “A Self-Adaptive Clustering Based Algorithm for Increased Energy-Efficiency and Scalability in Wireless Sensor Networks,” Proc. IEEE 58th Vehicular Technology Conf. (VTC ’03), vol. 5, pp. 2921-2925, 2003.
[16] O. Younis and S. Fahmy, “Distributed Clustering in Ad-Hoc Sensor Networks: A Hybrid, Energy-Efficient Approach,” Proc. IEEE INFOCOM, pp. 629-640, 2004.
[17] http://boost.org/libs/random/index.html, Feb. 2007.
[18] http://www.ilog.com/products/cplex/, Feb. 2008.
[19] http://www.omnetpp.org/, Feb. 2008.

Abdelmorhit El Rhazi received the bachelor’s degree in software engineering from the Ecole Mohammadia d’Ingénieur, Rabat, Morocco, in 1995 and the master’s degree in computer engineering from the Ecole Polytechnique de Montréal in 2003. He is currently with the Mobile Computing and Networking Research Laboratory (LARIM), Department of Computer Engineering, Ecole Polytechnique de Montréal. His work revolved around data collection and the energy consumption in wireless sensor networks.

Samuel Pierre is currently a professor of computer engineering at Ecole Polytechnique de Montréal, where he is the director of the Mobile Computing and Networking Research Laboratory (LARIM) and an NSERC/Ericsson industrial research chair in next-generation mobile networking systems. He is the author or coauthor of six books, 15 book chapters, 16 edited books, and more than 350 other technical publications including journal and proceedings papers. He received the Best Paper Award from the Ninth International Workshop in Expert Systems and Their Applications (France, 1989), the Distinguished Paper Award from OPNETWORK 2003 (Washington), a special mention from Telecoms Magazine (France, 1994) for one of his coauthored books, Télécommunications et Transmission de Données (Eyrolles, 1992), among others. His research interests include wireline and wireless networks, mobile computing, performance evaluation, artificial intelligence, and electronic learning. He is an associate editor of the IEEE Communications Letters, the IEEE Canadian Journal of Electrical and Computer Engineering, and the IEEE Canadian Review. He is also a regional editor of the Journal of Computer Science. He also serves on the editorial board of Telematics and Informatics (Elsevier Science) and the International Journal of Technologies in Higher Education (IJTHE). He is a fellow of the Engineering Institute of Canada, a member of the Canadian Academy of Engineering, and a senior member of the IEEE.