Transcript of "Localised multicast efficient and distributed replica detection in large scale sensor networks"
1.
IEEE TRANSACTIONS ON MOBILE COMPUTING, VOL. 9, NO. 7, JULY 2010 913

Localized Multicast: Efficient and Distributed Replica Detection in Large-Scale Sensor Networks

Bo Zhu, Member, IEEE, Sanjeev Setia, Sushil Jajodia, Senior Member, IEEE, Sankardas Roy, Member, IEEE, and Lingyu Wang, Member, IEEE

Abstract: Due to the poor physical protection of sensor nodes, it is generally assumed that an adversary can capture and compromise a small number of sensors in the network. In a node replication attack, an adversary can take advantage of the credentials of a compromised node to surreptitiously introduce replicas of that node into the network. Without an effective and efficient detection mechanism, these replicas can be used to launch a variety of attacks that undermine many sensor applications and protocols. In this paper, we present a novel distributed approach called Localized Multicast for detecting node replication attacks. The efficiency and security of our approach are evaluated both theoretically and via simulation. Our results show that, compared to previous distributed approaches proposed by Parno et al., Localized Multicast is more efficient in terms of communication and memory costs in large-scale sensor networks, and at the same time achieves a higher probability of detecting node replicas.

Index Terms: Wireless sensor networks security, node replication attack detection, distributed protocol, efficiency.

B. Zhu and L. Wang are with the Concordia Institute for Information Systems Engineering, Concordia University, 1515 Ste-Catherine Street West, Suite: EV007.639, Montreal, QC H3G 2W1, Canada. E-mail: {zhubo, wang}@ciise.concordia.ca.
S. Setia, S. Jajodia, and S. Roy are with the Department of Computer Science, George Mason University, 4400 University Drive, Fairfax, VA 22030. E-mail: {setia, jajodia}@gmu.edu, sankar.roy@gmail.com.
Manuscript received 31 July 2008; revised 30 July 2009; accepted 17 Oct. 2009; published online 23 Feb. 2010. For information on obtaining reprints of this article, please send e-mail to: tmc@computer.org, and reference IEEECS Log Number TMC-2008-07-0301. Digital Object Identifier no. 10.1109/TMC.2010.40.
1536-1233/10/$26.00 © 2010 IEEE. Published by the IEEE CS, CASS, ComSoc, IES, & SPS. Authorized licensed use limited to: Asha Das. Downloaded on July 29,2010 at 11:55:49 UTC from IEEE Xplore. Restrictions apply.

1 INTRODUCTION

A new set of security challenges arises in sensor networks because current sensor nodes lack hardware support for tamper-resistance and are often deployed in unattended environments where they are vulnerable to capture and compromise by an adversary. A serious consequence of node compromise is that once an adversary has obtained the credentials of a sensor node, it can surreptitiously insert replicas of that node at strategic locations within the network. These replicas can be used to launch a variety of insidious and hard-to-detect attacks on the sensor application and the underlying networking protocols. This type of attack is called a node replication attack, which was first identified and studied by Parno et al. [14].

In a centralized approach for detecting node replication, when a new node joins the network, it broadcasts a signed message (referred to as a location claim) containing its location and identity to its neighbors. One or more of its neighbors then forward this location claim to a central trusted party [4] (e.g., the base station). With location information for all the nodes in the network, the central party can easily detect any pair of nodes with the same identity but at different locations.

Like all centralized approaches, however, this solution is vulnerable to a single point of failure. If the base station is compromised or the path to the base station is blocked, adversaries can add an arbitrary number of replicas into the network without being detected. Hence, a distributed solution is desirable.

Distributed approaches for detecting node replication are based on storing a node's location information at one or more witness nodes in the network. When a new node joins the network, its location claim is forwarded to the corresponding witness nodes. If any witness receives two different location claims for the same node identity (ID), it will have detected the existence of a replica and can take appropriate actions to revoke the node's credentials.

The basic challenge of any distributed protocol in detecting node replicas is to minimize communication and per-node memory costs while ensuring that the adversary cannot defeat the protocol. A protocol that deterministically maps a node's ID to a unique witness node would minimize both communication costs and memory requirements per node, but would not offer enough security because the adversary would need to compromise just a single witness node in order to introduce a replica without being detected.

Previously, Parno et al. [14] presented two distributed algorithms for detecting node replication in which the witness nodes for a node's location information are randomly selected among all the nodes in the network. In the Randomized Multicast algorithm, each location has $\sqrt{n}$ witness nodes. Thus, in a network of $n$ nodes, according to the Birthday Paradox, in the event of a node replication attack, at least one witness node is likely to receive conflicting location claims about a particular node. The communication costs of this protocol are $O(n^2)$ (for the entire network) and the memory requirements per node are $O(\sqrt{n})$. The Line-Selected Multicast exploits the routing topology of the network to select witnesses for a node's location and uses
2.
geometric probabilities to detect replicated nodes. It has a communication cost of $O(n\sqrt{n})$ and memory requirements per node of $O(\sqrt{n})$.

Recently, Conti et al. proposed another replica detection protocol, i.e., RED [2]. Compared to Parno et al.'s work [14], in RED each location has a smaller number of witnesses. The set of witnesses is uniformly chosen from the whole network through a pseudorandom function, the inputs of which include the identity of the node, the number of locations (of witnesses) that have to be generated by any neighbor of this node that decides to forward the location claim, and a random number rand which is changed per iteration. In other words, within each iteration, the set of witnesses for any node is fixed and is known to anyone who learns rand, either through node compromise or by sniffing the broadcast message containing the value of rand at the beginning of each iteration. Therefore, there is a dilemma in selecting an appropriate number of locations (of witnesses) to generate so as to balance efficiency against robustness to node compromise.

In this paper, we present a novel distributed protocol for detecting node replication attacks that takes a different approach for selecting witnesses for a node. In our approach, which we call Localized Multicast, the witness nodes for a node identity are randomly selected from the nodes that are located within a geographically limited region (referred to as a cell). Our approach first deterministically maps a node's ID to one or more cells, and then uses randomization within the cell(s) to increase the resilience and security of the scheme. One major advantage of our approach is that the probability of detecting node replicas is much higher than that achieved in Parno et al.'s protocols [14].

We describe and analyze two variants of the Localized Multicast approach: Single Deterministic Cell (SDC) and Parallel Multiple Probabilistic Cells (P-MPC), which, as their names suggest, differ in the number of cells to which a location claim is mapped and the manner in which the cells are selected. We evaluate the performance and security of these approaches both theoretically and via simulation. Our results show that the Localized Multicast approach is more efficient than Parno et al.'s algorithms in terms of communication and memory costs, while providing a high level of compromise-resilience. Further, our approach also achieves a higher level of security in terms of the capability of detecting node replicas.

The rest of the paper is organized as follows: In Section 2, we review previous research work related to detecting node replication in sensor networks. In Section 3, the system, network, and adversary models of our work are presented. Then, we propose two variants of the Localized Multicast approach in Section 4. Afterwards, the theoretical analyses of the security and efficiency of the Single Deterministic Cell scheme and the Parallel Multiple Probabilistic Cells scheme are presented in Section 5 and Section 6, respectively. The simulation results are shown in Section 7. Finally, we draw our conclusion in Section 8.

2 RELATED WORK

The methods of detecting node replication can be divided into two categories: centralized and distributed.

The general idea of centralized solutions was first described in [14]. More specifically, each sensor's location information is forwarded toward a centralized trusted party, usually the base station, which takes responsibility for identifying repeated identities at distinct locations. A more concrete protocol (i.e., SET [1]) was later proposed by Choi et al.; it is based on the idea of computing set operations (intersection and union) of exclusive subsets in the network. In SET, a distributed algorithm is performed to divide the network into exclusive subsets and select subset leaders (SLDRs). Each exclusive set is securely formed among one-hop neighbors. Afterwards, in the basic scheme, each SLDR forwards a summarized report to the base station directly. In the subset-tree scheme, multiple subset trees, nodes of which are SLDRs, are constructed. For each subset tree, a root SLDR aggregates reports from other leaf SLDRs, and then forwards the final report to the base station. Upon receiving all the reports, the base station verifies the validity of the reports and detects node replicas.

Parno et al. [14] were the first to propose distributed algorithms for detecting node replication attacks in sensor networks. The authors first described two preliminary approaches, i.e., Node-to-Network Broadcasting and Deterministic Multicast, and discussed their weaknesses. Then, the Randomized Multicast and the Line-Selected Multicast were proposed. In Sections 7 and 6.3, we compare the performance and effectiveness of our approaches to their schemes.

Recently, Conti et al. proposed a new distributed protocol, called RED [2], for detecting node replication attacks. Compared to Parno et al.'s work [14], RED has a smaller memory overhead. In addition, since the set of witnesses is chosen uniformly within the network, RED is more robust against selective node compromise, although it has a slightly lower detection rate under random node compromise. Their scheme can be viewed as a variant of deterministic multicast, which has a weakness in determining an appropriate number of deterministic witness nodes that satisfies both security and efficiency requirements [14]. In RED, this weakness is mitigated by changing the witness nodes for any given identity after each time interval, although they are deterministic within any time interval.

An attack that is superficially similar to node replication is the Sybil attack [3]. In this attack, a single physical adversary can generate a number of virtual identities and falsely claim to be a set of nonexistent nodes. Douceur [3] proposed the use of a few schemes in which the potential Sybil users are challenged to solve some resource-intensive task that can only be accomplished by multiple real-world users but will be impractical for a Sybil source. In contrast, in node replication attacks, a single adversary can generate a number of physical nodes with the same identity and put them at different locations in the network. In other words, each replica is a real physical node, instead of a virtual one. As a result, the detection mechanism proposed in [3] fails to detect node replication. In [13], Newsome et al. proposed a few mechanisms for detecting Sybil attacks in sensor networks, among which only the centralized node registration mechanism can be used to detect node replication.

This paper extends an earlier version of the work [18] in important new ways. First, we add the discussion about potential attacks against SDC, e.g., blocking attacks. Second,
3.
ZHU ET AL.: LOCALIZED MULTICAST: EFFICIENT AND DISTRIBUTED REPLICA DETECTION IN LARGE-SCALE SENSOR NETWORKS 915

security analysis of the resilience against node compromise is revisited to provide a more accurate analysis. Last but not least, the security and efficiency conditions of our approach are evaluated under different settings.

3 PROTOCOL FRAMEWORK

In this section, we present the system, network, and adversary models assumed in our work, as well as the notation and symbols used in the paper.

3.1 System and Network Model

We consider a sensor network with a large number of low-cost nodes distributed over a wide area. In our approach, we assume the existence of a trusted base station, and the sensor network is considered to be a geographic grid, each unit of which is called a cell. Sensors are distributed uniformly in the network. New sensors may be added into the network regularly to replace old ones.

Each node is assigned a unique identity and a pair of identity-based public and private keys (see footnote 1) by an offline Trust Authority (TA). In identity-based signature schemes like [6], the private key is generated by signing its public key (usually a hash of its unique identity) with a master secret held only by the TA. In other words, generating a new identity-based key pair requires cooperation from the TA. Therefore, we assume that adversaries cannot easily create sensors with new identities, in the sense that they cannot generate the private keys corresponding to the identities claimed and thus fail to prove themselves to the neighbors during the authentication of the location claims.

Footnote 1: Recent work [11], [5] shows that public key algorithms are practical on new sensor hardware. In addition, similar to [14], we can use symmetric key cryptography instead to lower the computation cost, at the cost of a large communication overhead.

Similar to [14], we require that, when a node is added into the network, it generates a location claim and broadcasts the claim to its neighbors. Each neighbor independently decides whether to forward the claim with a given probability. The neighbors that plan to forward the claim determine the destination cell(s) according to the output of a geographic hash function [15], which uniquely maps the identity of the sender of the location claim to one or a few of the cells in the grid. Then, the claim is forwarded to the destination cell(s) using a geographic routing protocol such as GPSR [7].

3.2 Adversary Model

In this paper, we assume that the major goal of adversaries is to launch node replication attacks. To achieve this goal, we assume that adversaries may launch both passive attacks (e.g., eavesdropping on network traffic) and active attacks (e.g., modifying and replaying messages or compromising sensors), and the information obtained from the former can be used to enhance the effectiveness of the latter. For example, by sniffing the traffic, adversaries may deduce certain information about the witness nodes, which could help them evaluate the potential benefit of compromising a given node and the risk of being detected while launching the node replication attack at a given location.

We assume the existence of some monitoring mechanism that can detect a node compromising operation with a certain probability. We also assume that adversaries are rational, and thus may try to avoid triggering any automated protocol (e.g., SWATT [17]) that sweeps the network to remove compromised nodes, or drawing human attention or intervention while launching the attacks.

3.3 Notation

In Table 1, we list the notation and symbols used in this paper.

TABLE 1. Notation and Symbols

4 THE LOCALIZED MULTICAST APPROACH FOR DETECTING NODE REPLICATIONS

We have designed two variants of the Localized Multicast approach, specifically Single Deterministic Cell (SDC) and Parallel Multiple Probabilistic Cells (P-MPC).

4.1 Single Deterministic Cell

In the Single Deterministic Cell scheme, a geographic hash function [15] is used to uniquely and randomly map node $L$'s identity to one of the cells in the grid. For example, given that the geographic grid consists of $a \times b$ cells, a cell at the $a'$th row and the $b'$th column (where $a' \in \{1, \ldots, a\}$, $b' \in \{1, \ldots, b\}$) is uniquely identified as $c$ (where $c = a' \cdot b + b'$). By using a one-way hash function $H()$, node $L$ is mapped to a cell $C$, where $c = [H(ID_L) \bmod (a \cdot b)] + 1$.

The format of the location claim is

$$[ID_L,\ l_L,\ SIG_{SK_L}(H(ID_L \| l_L))],$$

where $\|$ denotes the concatenation operation and $l_L$ is the location information of $L$, which can be expressed using either two-dimensional or three-dimensional coordinates.

When $L$ broadcasts its location claim, each neighbor first verifies the plausibility of $l_L$ (e.g., based on its location and the transmission range of the sensor) and the validity of the signature in the location claim. In identity-based signature schemes [6], only a signature generated with the private key
4.
corresponding to the identity claimed can pass the validation process. Thus, adversaries cannot generate valid signatures unless they compromise the node with that identity.

Each neighbor independently decides whether to forward the claim with a probability $p_f$. If a neighbor plans to forward the location claim, it first needs to execute a geographic hash function [15] to determine the destination cell, denoted as $C$. The location claim is then forwarded toward cell $C$.

Once the location claim arrives at cell $C$, the sensor receiving the claim first verifies the validity of the signature, and then checks whether cell $C$ is indeed the cell corresponding to the identity listed in the claim message based on the geographic hash function. If both verifications succeed, the location claim is flooded within cell $C$. Each node in the cell independently decides whether to store the claim with a probability $p_s$. Note that the flooding process is executed only when the first copy of the location claim arrives at cell $C$, and the following copies are ignored. As a result, the number of witnesses in the cell, $w$, is $s \cdot p_s$ on average, where $s$ is the number of sensors in a cell.

Whenever any witness receives a location claim with the same identity but a different location compared to a previously stored claim, it forwards both location claims to the base station. Then, the base station will broadcast a message within the network to revoke the replicas.

Compared to the Randomized Multicast and Line-Selected Multicast algorithms, a major advantage of SDC is that it ensures a 100 percent success rate for detecting any node replication, as long as the location claim is successfully forwarded toward cell $C$ and stored by at least one node in the cell.

An important limitation of the Randomized Multicast and Line-Selected Multicast algorithms is that both the communication/memory overhead and the security (in terms of the success rate of detecting node replication) of the two algorithms are tightly related to the number of witnesses ($w$). On the one hand, the larger $w$ is, the higher the communication and memory overhead. On the other hand, the smaller $w$ is, the lower the success rate of detecting node replication. To ensure a high success rate of detecting node replication, $w$ has to be $O(\sqrt{n})$, where $n$ is the number of sensors in the network.

In contrast, in the SDC scheme the communication cost and memory overhead are related to the number of neighbors that forward a location claim (i.e., $r = d \cdot p_f$) and the number of witnesses (i.e., $w = s \cdot p_s$), respectively. In addition, the success rate of detecting node replication is independent of $w$ when $w \geq 1$. Moreover, the randomization against potential node compromise and the low memory overhead are achieved by flooding the location claim within the destination cell while storing it on only a small number of randomly chosen nodes. Assuming that the capability of the adversary (in terms of the number of nodes that can be compromised without being detected) is limited, by appropriately choosing the cell size ($s$) and $p_s$, the probability that adversaries control all the witnesses for an identity is negligible. Consequently, SDC can achieve a low communication cost by setting $r$ to a small value, and at the same time ensure low memory overhead and good security (i.e., a high success rate of detecting node replication and a high level of resilience against potential node compromise), by choosing an appropriate value for $w$ (in fact, for $s$ and $p_s$). A detailed analysis of the security and efficiency achieved in SDC is presented in Section 5.

4.2 Parallel Multiple Probabilistic Cells

4.2.1 Motivation

In this paper, we assume the existence of a monitoring mechanism that can detect a node compromising operation with a certain probability. Therefore, the larger the number of nodes that an adversary attempts to compromise, the higher the probability that the node compromising attack is detected, thereby triggering an automated protocol or human intervention for removing compromised nodes. However, in certain cases (e.g., when the number of nodes in a cell is relatively small), a determined adversary may be willing to take the risk of being detected in return for a high probability of controlling all the witness nodes for one or more identities.

Another potential risk is that a smart adversary can take advantage of the knowledge that the destination cell for a given identity is deterministic and launch a blocking attack. Informally, after compromising a small set of sensors denoted as $V$, the adversary can generate replicas of members in $V$ and deploy them in such a way that all the location claims of these replicas are forwarded through members of $V$.

In the SDC approach, all the location claims are first forwarded from the neighbors of $L$ to a deterministic cell. Therefore, there is a high probability that these forwarding paths intersect with each other. In particular, when $L$ and the destination cell (i.e., cell $C$) are far from each other, there is a high probability that all the location claims will pass through one node or a small set of nodes of size $y$. Therefore, the adversary only needs to compromise one or $y$ nodes per replica so as to block the forwarding of a location claim. Hop-by-hop watchdog monitoring [12] may help mitigate this attack. However, it will fail if all or most of the neighbors of an intersection point are compromised.

Even worse, the adversary can insert a replica in such a way that its location claim will always be forwarded through a small set of compromised nodes. An example of a blocking attack against the SDC approach is shown in Fig. 1. Cells $C_1$ and $C_2$ are the deterministic cells for the identities $ID_{C_1}$ and $ID_{C_2}$, respectively, and $B$ is an area in which all the nodes have been compromised (referred to as a black hole). In this example, three replicas (i.e., $L^1_{C_1}$, $L^2_{C_1}$, and $L^3_{C_1}$) claiming the same identity that is mapped to cell $C_1$ are added to the network sequentially, with a certain time interval between any pair of consecutive joins. In the SDC approach, nodes en route between the replica and the deterministic cell do not store the location claim. As a result, as long as the location claims from different replicas do not arrive at the same time, forwarding nodes are not able to detect the conflicts. Finally, all the location claims are delivered to the black hole and blocked. In other words, adversaries can insert replicas without being detected. Note that the same black hole may be used to insert replicas for multiple identities. As shown in Fig. 1, two replicas (i.e., $L^1_{C_2}$ and $L^2_{C_2}$) claiming the same identity that is mapped to cell $C_2$ are inserted into the
5.
network and their location claims are also blocked by the black hole $B$.

Fig. 1. The blocking attacks.

Fig. 2. The parallel multiple probabilistic cells approach.

4.2.2 Description of the P-MPC Scheme

Like SDC, in the P-MPC scheme, a geographic hash function [15] is employed to map node $L$'s identity to the destination cells. However, instead of mapping to a single deterministic cell, in P-MPC, the location claim is mapped and forwarded to multiple deterministic cells with various probabilities.

Let $C = \{C_1, C_2, \ldots, C_i, \ldots, C_v\}$ denote the set of cells to which an identity (denoted as $ID_L$) is mapped. Let $p_{c_i}$ denote the probability that the location claim of $L$ is forwarded to cell $C_i$. Without loss of generality, in the rest of this paper, we assume that set $C$ is sorted by the $p_{c_i}$. The following two conditions should be satisfied while determining the $p_{c_i}$: 1) $\sum_{i=1}^{v} p_{c_i} = 1$ and 2) $p_{c_i} \geq p_{c_j}$ when $i < j$ for $i, j \in \{1, 2, \ldots, v\}$. The second condition is introduced to enhance the efficiency of the protocol as described later in Section 6.2. An example of P-MPC is shown in Fig. 2.

When $L$ broadcasts its location claim, each neighbor independently decides whether to forward the claim in the same way as in the SDC scheme. Afterwards, each neighbor helping forward the claim first calculates the set of cells (i.e., $C$) to which $L$ is mapped, based on a geographic hash function with the input of $ID_L$. For example, by using a one-way hash function $H()$, node $L$ is mapped to the set of cells $C = \{C_1, C_2, \ldots, C_i, \ldots, C_v\}$, where $C_i = [H(ID_L \| i) \bmod (a \cdot b)] + 1$ $(i \in \{1, 2, \ldots, v\})$. Then, each neighbor that forwards the claim independently generates a random number $z \in [0, 1)$. Let $j$ be the smallest number that satisfies $z < \sum_{i=1}^{j} p_{c_i}$ $(j \in \{1, 2, \ldots, v\})$; this neighbor then chooses the $j$th cell (i.e., $C_j$) as the destination cell for the location claim. For example, if $z = 0.8$ and the predetermined distribution of the $p_{c_i}$ is ($p_{c_1} = 50\%$, $p_{c_2} = 25\%$, $p_{c_3} = 15\%$, and $p_{c_4} = 10\%$), the claim will be forwarded to cell $C_3$.

Once the location claim arrives at cell $C_j$, the sensor receiving it first verifies whether $C_j$ is a member of $C$, which can be calculated based on the geographic hash function and the identity listed in the claim message. In addition, this sensor needs to verify the validity of the signature in the location claim. If both verifications succeed, the claim is flooded within the cell and probabilistically stored at $w$ nodes in the same manner as in the SDC scheme.

For example, in Fig. 2, there are two replicas with the same identity in the network. In this example, an identity is mapped to three cells (i.e., $C_1$, $C_2$, $C_3$) with different probabilities (i.e., $p_{c_1} > p_{c_2} > p_{c_3}$). The neighbors of one replica forward the location claims to cells $C_1$ and $C_2$, while the neighbors of the other replica forward the location claims to cells $C_1$ and $C_3$. Therefore, any witness node within cell $C_1$ can detect the node replication.

5 ANALYSIS OF THE SINGLE DETERMINISTIC CELL SCHEME

In this section, we analyze the security and efficiency of the Single Deterministic Cell scheme.

5.1 Security Analysis

The metrics used to evaluate the security of the SDC scheme are:

1. the probability of detecting node replication when adversaries put $x$ replicas (including the compromised node) with the same identity into the network, which is denoted as $p_{dr}$.
2. the probability that adversaries control all the witnesses for a given identity after compromising $t$ nodes, which is denoted as $p_{ts}$.
3. the probability that adversaries control all the witnesses for at least one identity after compromising $t$ nodes, which is denoted as $p_{tm}$.

The latter two metrics estimate the risk that an adversary controls all the witnesses for a node and can thus launch a node replication attack without being detected.

As in [14], for the theoretical analysis in Section 5 and Section 6, we assume that there are $r$ $(= d \cdot p_f)$ neighbors forwarding $L$'s location claim. Also, we assume that there are $w$ $(= s \cdot p_s)$ witnesses per destination cell storing $L$'s location claim. Since $1 \geq p_f > 0$ and $1 \geq p_s > 0$, we have $r > 0$ and $w > 0$.
6.
Fig. 3. Probability that adversaries control all $w$ witnesses for a given identity after compromising $t$ nodes ($p_{ts}$).

Fig. 4. Probability that adversaries control all $w$ witnesses for any identity after compromising $t$ nodes ($p_{tm}$).

5.1.1 Detecting Replicas

Unlike the Randomized Multicast and Line-Selected Multicast algorithms [14], where the nodes storing the copies of a location claim are chosen randomly from the whole network, in SDC such nodes are chosen randomly from a small subset of all the nodes in the network, i.e., the nodes in the destination cell determined by the geographic hash function. In addition, since the location claim will be flooded within the destination cell, the SDC scheme can always detect any pair of nodes claiming the same identity. In other words, $p_{dr} = 100\%$ in SDC, when $r > 0$ and $w > 0$.

5.1.2 Resilience against Node Compromise

Assuming that the adversary's capability of compromising nodes is limited, in SDC the probability that an adversary can compromise all the witness nodes storing the location claim of a given identity (i.e., $p_{ts}$) is higher than that in the Randomized Multicast algorithm, because witness nodes in the former are chosen from a smaller set compared to the latter. However, we argue that by appropriately choosing the parameters, e.g., the cell size ($s$) and the probability that a sensor in the cell stores the location claim ($p_s$), we can limit $p_{ts}$ to a very small value, even if the adversaries can compromise a small fraction of the nodes in cell $C$.

Assuming that the adversary has compromised $t$ nodes in cell $C$, $p_{ts}$ can be calculated as follows:

$$p_{ts} = \frac{\binom{s-w}{t-w}}{\binom{s}{t}} = \frac{(t-w+1)(t-w+2)\cdots t}{(s-w+1)(s-w+2)\cdots s}, \qquad (1)$$

where $t \geq w$. In (1), $\binom{s}{t}$ denotes the number of all possible combinations of compromising $t$ sensors in a cell of size $s$, and $\binom{s-w}{t-w}$ denotes the number of those combinations in which all $w$ witnesses for a given identity are compromised, which is equivalent to the number of ways of choosing the remaining $t - w$ compromised sensors from the $s - w$ sensors that are not witnesses for this identity.

In Fig. 3, we plot the probability that an adversary controls all the witness nodes of a given identity (i.e., $p_{ts}$) under different settings, when the cell size is 100 (i.e., $s = 100$). Fig. 3 shows that when $w$ (in fact $s$ and $p_s$) is chosen appropriately, $p_{ts}$ is negligible, even if the adversary can compromise a large number of nodes in the cell. In particular, when $w = 20$ and $t = 60$, $p_{ts}$ is only $7.82 \times 10^{-6}$. Even if $w$ is chosen as a relatively small number, e.g., 5, the adversary still needs to compromise around 65 out of 100 nodes in the cell to achieve a success rate of nearly 11 percent.

However, in practice, the probability that the adversary controls all the witnesses for at least one identity (i.e., $p_{tm}$) might be a more accurate and strict measure of the security of the scheme. In order to calculate $p_{tm}$, we begin by estimating the probability that all the $w$ copies of any location claim are stored within a given set $T$ of $t$ compromised nodes, which is denoted as $p_{ts2}$. Given that the members of $T$ and the nodes storing any location claim are chosen randomly from all the nodes in the cell, we have

$$p_{ts2} = \frac{\binom{t}{w}}{\binom{s}{w}} = \frac{(t-w+1)(t-w+2)\cdots t}{(s-w+1)(s-w+2)\cdots s} = p_{ts}, \qquad (2)$$

where $t \geq w$.

In SDC, there are on average $s$ different location claims stored within a cell. Since the nodes storing the copies of different location claims are chosen independently, checking for each of the $s$ location claims whether all of its witnesses are members of $T$ is equivalent to a sequence of $s$ Bernoulli trials, with probability $p_{ts2}$ of success in any given trial. Let $n_t$ denote the number of identities for which all the copies of the corresponding location claims are stored within the set $T$. In other words, all the witness nodes for these $n_t$ identities are controlled by the adversary. As a result, the expectation of $n_t$ and $p_{tm}$ can be calculated according to (3) and (4), respectively.

$$E(n_t) = s \cdot p_{ts2}, \qquad (3)$$

$$p_{tm} = 1 - (1 - p_{ts2})^{s}. \qquad (4)$$

In Fig. 4, we plot the probability that adversaries control all the witness nodes for at least one identity (i.e., $p_{tm}$) under different settings, when $s = 100$. We notice that $p_{tm}$ is much higher than $p_{ts}$, especially when $w$ is small. For example, when the average number of witness nodes for a location claim (i.e., $w$) and the number of nodes controlled by the adversary in the cell (i.e., $t$) are 5 and 30, respectively, $p_{tm}$ is 17.26 percent, which is much higher than $p_{ts}$, i.e., 0.19 percent. As such, it might be necessary to set $w$ to a
7.
larger number, such as 10-15, which corresponds to the situation that, even if the adversary compromises 43-58 out of 100 nodes in a cell, the probability that she launches a node replication attack without being detected is less than 1 percent, i.e., $p_{tm} < 1\%$.

5.2 Efficiency Analysis

The metrics used to evaluate the efficiency of the SDC scheme include:

1. The average number of packets sent and received while propagating the location claim, which is denoted as $n_f$.
2. The average number of copies of the location claims stored on a sensor, which is denoted as $n_s$.

The former measures the communication cost, while the latter estimates the memory overhead. We do not explicitly consider the computation cost (i.e., verifying that the location claim is generated by an entity which holds the private key corresponding to the identity listed in the claim), since every forwarding node needs to execute such a verification and thus it is proportional to the communication cost. In other words, the higher the communication cost, the higher the computation cost.

5.2.1 Communication Cost

The communication cost of the SDC scheme has two components: the cost of forwarding the location claim to the destination cell (denoted as $CO_{fw}$) and the cost of flooding the location claim within the destination cell (denoted as $CO_{fl}$). The communication complexities of these two operations are $O(d \cdot p_f \cdot \sqrt{n})$ and $O(s)$, respectively.

5.2.2 Memory Overhead

SDC has the memory overhead of $O(w)$, where $w = s \cdot p_s$. As shown in Section 5.1, a relatively small value of $w$, e.g., between 10 and 15 when $s = 100$, is sufficient to ensure security against node compromise. Therefore, the memory overhead of the SDC scheme is significantly lower than those of the Random Multicast algorithm and the Line-Selected Multicast algorithm, which are of order $O(\sqrt{n})$ or higher. (Footnote 2: Refer to Section 6.3 for the more detailed comparison.)

6 ANALYSIS OF THE PARALLEL MULTIPLE PROBABILISTIC CELLS SCHEME

In this section, we analyze the security and efficiency of the P-MPC scheme. In addition, a summary of the communication cost and memory overhead of our approach and the algorithms proposed in [14] is shown at the end of this section.

6.1 Security Analysis

For simplicity, in this section we assume that the number of neighbors ($r$) forwarding the location claim is a fixed number. We assume that the adversary creates $x - 1$ replicas of a given compromised node with id $ID_L$ and deploys them in the network. We assume that adversaries do not reposition the compromised node, $l_1$, and the replicas are added in sequence from $l_2$ to $l_x$. Let $p_{ir}$ denote the probability that the node replication attack is not detected by our scheme after the $i$th node with the same identity has been added to the network. For analyzing the security of the P-MPC scheme, we use the same metrics employed in Section 5.1, except that we replace the metric $p_{dr}$ with $p_{ir}$.

6.1.1 Detecting Replicas

Let $C_{s1}$ denote the set of all combinations of choosing 1 to $v - 1$ elements from $C$, i.e., the set of cells to which $ID_L$ is mapped. If the node replication attack is not detected when the adversary adds replica $l_2$ to the network, it implies that the location claims for $l_2$ have been forwarded to a set of cells, none of which contains any node storing a location claim from $l_1$.

Let $C_{e1}$ denote a subset of the cells in $C$ that do not store the location claims of $l_1$. Let $p_{i,1}$ denote the probability that the location claim of $l_1$ is forwarded to all the cells in $C$ except the cells in $C_{e1}$, which is an element of $C_{s1}$. Let $p_{i,2}$ denote the probability that the location claim of $l_2$ is forwarded to any cell(s) in $C_{e1}$. Therefore, we have:

$$p_{2r} = \sum_{i=1}^{|C_{s1}|} p_{i,1} \cdot p_{i,2}. \tag{5}$$

Now, we consider further the case that the adversary adds $l_3$ to the network. Let $C_{s1b}$ denote the set of all the combinations of choosing 2 to $v - 1$ elements from $C$. For a given $C_{e1} \in C_{s1b}$, let $C_{s2}$ denote all the combinations of choosing 1 to $|C_{e1}| - 1$ elements from $C_{e1}$. We denote $C_{e2}$ as the set of cells that store the location claim from $l_2$ but not $l_1$, and $C_{e2} \in C_{s2}$. Let $p_i$ denote the probability that the location claim of $l_1$ is forwarded to all the cells in $C$ except the cells in $C_{e1}$, which is an element of $C_{s1b}$. Let $p_{ij,1}$ denote the probability that the location claim of $l_2$ is forwarded only to all the cells in $C_{e2}$. Let $p_{ij,2}$ denote the probability that the location claim of $l_3$ is forwarded to any cell(s) in $C_{e1}$ except those in $C_{e2}$. Thus, we have:

$$p_{3r} = \sum_{i=1}^{|C_{s1b}|} \sum_{j=1}^{|C_{s2}|} p_i \cdot p_{ij,1} \cdot p_{ij,2}. \tag{6}$$

TABLE 2. Detection Rates When There Are 2 or 3 Nodes with the Same Identity, Given Different Settings of the Distribution of Forwarding Probabilities (table body not present in this transcript).

Let $r = 3$ and $v = 3$. In Table 2, we show the estimated success rate of detecting node replications under different settings of $p_{c_i}$ according to (5) and (6). According to Table 2 (where "Set." is a short notation for "Setting"), the P-MPC scheme can achieve a very high replica detection rate, even when an identity is mapped to three destination cells. Moreover, we notice that the larger the differences between the probabilities $p_{c_i}$, the higher is $p_{ir}$.
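The evasion event described in Section 6.1.1 for two nodes with the same identity can be explored numerically. The following is a minimal sketch under simplifying assumptions that are ours, not the paper's: each of the $r$ forwarding neighbors independently picks one destination cell according to the probabilities $p_{c_i}$, and every cell that receives a claim ends up with at least one witness (the effect of $p_s$ is ignored). Under these assumptions, the second node escapes detection exactly when the set of cells reached by its claim is disjoint from the set reached by the first node's claim. The function names are hypothetical, and the sketch is not claimed to reproduce Table 2.

```python
from itertools import product
from math import prod


def reached_set_distribution(p_cells, r):
    """Return {set of reached cells: probability} when r neighbors each
    pick one cell independently according to p_cells (assumed model)."""
    dist = {}
    for choice in product(range(len(p_cells)), repeat=r):
        cells = frozenset(choice)
        dist[cells] = dist.get(cells, 0.0) + prod(p_cells[c] for c in choice)
    return dist


def miss_probability_two_nodes(p_cells, r):
    """Probability that the claims of two same-identity nodes reach
    disjoint sets of cells, so that no witness ever sees both claims."""
    dist = reached_set_distribution(p_cells, r)
    return sum(
        p1 * p2
        for s1, p1 in dist.items()
        for s2, p2 in dist.items()
        if s1.isdisjoint(s2)
    )


# Illustrative probabilities (assumed): a skewed and a uniform distribution.
for p_cells in [(0.8, 0.15, 0.05), (1 / 3, 1 / 3, 1 / 3)]:
    miss = miss_probability_two_nodes(p_cells, r=3)
    print(p_cells, "miss:", miss, "detect:", 1.0 - miss)
```

Under this simplified model, the skewed distribution gives a smaller miss probability than the uniform one, which agrees with the observation above that larger differences between the $p_{c_i}$ improve detection.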
8.
TABLE 3. Probability that the Adversary Controls All $w$ Witnesses for a Given Identity after Compromising $t_{\cdot}$ Nodes in a Cell of Size $s$ in the P-MPC Scheme ($s = 100$, $w = 5$, $t_{\cdot} = 30$) (table body not present in this transcript).

TABLE 4. Settings on the Distribution of # of Compromised Nodes (table body not present in this transcript).

TABLE 5. Settings of the Distribution of Forwarding Probabilities (table body not present in this transcript).

6.1.2 Resilience against Node Compromise

Let $p_{ts}^{SDC}(t)$ and $p_{ts}^{P\text{-}MPC}(t)$ denote the functions that output the $p_{ts}$ of the SDC scheme and the P-MPC scheme, respectively, when the number of the compromised nodes is $t$. Let $p_{tm}^{SDC}(t)$ and $p_{tm}^{P\text{-}MPC}(t)$ denote the functions that output the $p_{tm}$ of the SDC scheme and the P-MPC scheme, respectively, when the number of the compromised nodes is $t$. Assuming that the adversary's capability of compromising nodes is bounded by $t_{\cdot}$, we have $\sum_{i=1}^{v} t_i = t_{\cdot}$, where $t_i$ is the number of nodes compromised in cell $C_i$.

Let $C_{t1}$ denote the set of all the combinations of choosing 1 to $v$ elements from $C$. For any element in $C_{t1}$, denoted as $C_{f1}$, the probability that the adversary controls all the witnesses of a given identity, when such a set of cells in $C$ (i.e., $C_{f1}$) is chosen as the destination cell(s), is the product of the individual probabilities $p_{ts}$ of the cells. Let $p_i$ denote the probability that exactly the cells in $C_{f1}$ are chosen as the destination cells by the $r$ neighbors that forward the location claim. Let $p_{ts}^{SDC}(t_j)$ denote the $p_{ts}$ of the $j$th cell of $C_{f1}$ when the number of nodes compromised in this cell is $t_j$. Thus, $p_{ts}^{P\text{-}MPC}(t)$ can be calculated as follows:

$$p_{ts}^{P\text{-}MPC}(t) = \sum_{i=1}^{|C_{t1}|} \left( p_i \cdot \prod_{j=1}^{|C_{f1}|} p_{ts}^{SDC}(t_j) \right). \tag{7}$$

Note that in (7), $|C_{t1}|$ denotes the number of all the combinations of choosing 1 to $v$ elements from $C$, while $|C_{f1}|$ denotes the number of cells contained in a chosen combination, i.e., $C_{f1}$. In addition, $p_{ts}^{SDC}(t_j) = 1$ when there is no witness in the $j$th cell of $C_{f1}$.

Let $r = 3$ and $v = 3$. In Table 3, we show the estimated success rate that adversaries control all the witnesses under different compromising strategies (i.e., various distributions of $t_i$) and probability distributions of the destination cells (i.e., $p_{c_i}$) in the P-MPC scheme, when $s = 100$, $w = 5$, and $t_{\cdot} = 30$. The settings on $t_i$ and $p_{c_i}$ are shown in Tables 4 and 5, respectively.

From Table 3, we notice that the best strategy for adversaries is to compromise only nodes in the cell with the highest $p_{c_i}$, i.e., setting A of $t_i$, rather than spreading their limited capability of compromising nodes among multiple cells in $C$. Assuming that the adversary selects this optimal strategy, the larger the differences between the $p_{c_i}$, the larger is $p_{ts}^{P\text{-}MPC}$ and thus the weaker the resilience of the scheme to node compromise.

Compared to SDC, P-MPC is more robust to node compromise. Assuming that adversaries follow the best strategy just described, i.e., compromising only nodes in the cell with the highest $p_{c_i}$, (7) can be converted into:

$$p_{ts}^{P\text{-}MPC}(t) = p_{c_1}^{r} \cdot p_{ts}^{SDC}(t). \tag{8}$$

As a result, compared to the SDC approach, the success rate that adversaries control all the witnesses of a given identity is reduced by a factor of $1 - p_{c_1}^{r}$.

Unlike the SDC scheme where each identity is mapped to only one cell, in P-MPC, each identity may be mapped to multiple cells. Since the cells for a given identity are determined by geographic hash functions, those cells are uniformly distributed. Therefore, on average for each cell, there are $s$ identities choosing it with the probability $p_{c_1}, p_{c_2}, \ldots, p_{c_v}$, respectively. Assuming that instead of spreading the limited capability of compromising nodes in multiple cells, adversaries only compromise the nodes in a given cell, we can calculate $p_{tm}^{P\text{-}MPC}(t)$ via (9).

$$p_{tm}^{P\text{-}MPC}(t) = \sum_{i=1}^{v} p_{c_i}^{r} \cdot p_{tm}^{SDC}(t). \tag{9}$$

When the differences between the $p_{c_i}$ are high, e.g., Setting I in Table 5, $p_{tm}^{P\text{-}MPC}(t)$ can be approximated as $p_{c_1}^{r} \cdot p_{tm}^{SDC}(t)$. In such cases, compared to the SDC scheme, the success rate that adversaries control all the witnesses for at least one identity is reduced by a factor of $1 - p_{c_1}^{r}$ as well.

In P-MPC, even if adversaries compromise all the nodes in the cell to which the location claims are forwarded with the highest probability, i.e., $p_{c_1}$, node replication can still be detected by witnesses in the other cells. For example, assuming that $p_{c_1} = 80\%$ and $r = 3$, the replica can still be detected with a probability of $1 - p_{c_1}^{3} = 48.8\%$.

6.1.3 Denial-of-Service Attacks

Two possible Denial-of-Service (DoS) attacks against our approach are as follows: 1) An adversary inserts a large number of fake location claims into the network so as to exhaust the energy and computational resources of other nodes, which will verify the signatures included in the location claims according to the approach proposed. 2) If some of a node L's neighbors are controlled by the adversary, instead of choosing the destination cell based on the probabilistic distribution and the geographic hash function, the adversary may forward the location claim to as many cells as possible, leading to additional communication overhead when the claim is flooded within each cell.
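As a numerical aside on the resilience analysis of Sections 5.1.2 and 6.1.2, the closed forms (1), (4), and (8) are easy to evaluate directly. The sketch below is illustrative only: the function names are hypothetical, and the single-cell attack strategy behind (8) is taken as given from the text. It can be used to check the figures quoted in the paper, such as $p_{ts} \approx 7.82 \times 10^{-6}$ for $s = 100$, $w = 20$, $t = 60$, and the 48.8 percent residual detection probability for $p_{c_1} = 0.8$, $r = 3$.

```python
from math import comb


def p_ts_sdc(s, w, t):
    """Eq. (1)/(2): probability that all w witnesses of one identity lie
    among t compromised nodes in a cell of s nodes (0 if t < w)."""
    if t < w:
        return 0.0
    return comb(t, w) / comb(s, w)


def p_tm_sdc(s, w, t):
    """Eq. (4): probability that, for at least one of the s identities
    mapped to the cell, all witnesses are compromised."""
    return 1.0 - (1.0 - p_ts_sdc(s, w, t)) ** s


def p_ts_pmpc_single_cell(p_c1, r, s, w, t):
    """Eq. (8): P-MPC value when the adversary compromises nodes only in
    the most likely destination cell (strategy assumed from the text)."""
    return p_c1 ** r * p_ts_sdc(s, w, t)


def detect_via_other_cells(p_c1, r):
    """Probability that at least one of r neighbors forwards the claim to
    a cell other than the fully compromised one."""
    return 1.0 - p_c1 ** r


print("p_ts (s=100, w=20, t=60):", p_ts_sdc(100, 20, 60))
print("p_ts (s=100, w=5,  t=30):", p_ts_sdc(100, 5, 30))
print("p_tm (s=100, w=5,  t=30):", p_tm_sdc(100, 5, 30))
print("P-MPC p_ts, eq. (8):     ", p_ts_pmpc_single_cell(0.8, 3, 100, 5, 30))
print("residual detection:      ", detect_via_other_cells(0.8, 3))
```

Using exact integer binomials via `math.comb` avoids the rounding issues of multiplying long chains of ratios, which matters little here but keeps the check simple.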
9.
TABLE 6. $\sum_{i=1}^{v} p_{s_i}$ in Terms of Different Settings on $p_{c_i}$ ($v = 3$) (table body not present in this transcript).

TABLE 7. Comparisons of Average Communication Cost and Memory Overhead (table body not present in this transcript).

For the first attack, in both SDC and MPC, any fake location claim would fail the verification process, and thus, will not be forwarded further. As to the latter, which is only applicable to MPC, if the destination cell chosen is not an element of $C$ (i.e., the set of cells to which the given identity is mapped) or a neighbor forwards the same location claim to more than one cell, the attack would be detected by other neighbors of L, although it requires the neighbors to listen promiscuously. To avoid detection based on signature verifications, the best strategy for this type of DoS attack is to ignore the probabilistic distribution being used by P-MPC for selecting destination cells, and let different neighbors choose different destination cells in $C$. However, as shown by our analysis, a small number of cells ($v = 3$) is sufficient for P-MPC to provide a high level of resilience against node compromise while ensuring a very high detection rate on node replication. Therefore, the effectiveness of this attack is limited.

6.2 Efficiency Analysis

When analyzing the efficiency of the P-MPC scheme, we follow the same metrics employed in Section 5.2.

6.2.1 Communication Cost

Similar to the SDC scheme, the communication cost for P-MPC has two components: the cost for propagating the location claim to the cells chosen and the cost for flooding the claim within these cells, denoted as $CO_{fw}$ and $CO_{fl}$, respectively.

Assuming that in the P-MPC scheme there are on average $r$ neighbors forwarding a location claim, the communication complexity of $CO_{fw}$ is $O(r \cdot \sqrt{n})$ in P-MPC, if we assume that the neighbors of L forward the location claim independently and do not consider further optimizations, e.g., a node only forwards the location claims with the same identity and location information once within a certain time interval.

The communication complexity of $CO_{fl}$ in the P-MPC scheme can be estimated as follows: Since there are $r$ neighbors of L forwarding the location claim, the probability that any cell in $C$ (i.e., $C_i$) is chosen by at least one out of $r$ neighbors is:

$$p_{s_i} = 1 - (1 - p_{c_i})^{r}.$$

Therefore, the complexity of $CO_{fl}$ in the P-MPC scheme can be described as $O(s \cdot \sum_{i=1}^{v} p_{s_i})$. Table 6 shows the value of $\sum_{i=1}^{v} p_{s_i}$ in terms of different settings on $p_{c_i}$ when $v = 3$. According to Table 6, the larger the differences between the $p_{c_i}$, the smaller the extra overhead of flooding the location claim, when compared to the SDC scheme.

6.2.2 Memory Overhead

In a similar fashion, we can see that the memory overhead of the P-MPC scheme is given by $s \cdot p_s \cdot \sum_{i=1}^{v} p_{s_i}$.

6.3 Summary

Before presenting empirical results in Section 7, in Table 7, we summarize the average communication cost and memory overhead per node of the two variants of the Localized Multicast approach, together with the two multicast algorithms proposed in [14] (i.e., Randomized Multicast and Line-Selected Multicast) and the RED protocol [2]. In Table 7, we denote the density of the network (i.e., the average number of neighbors per node), the probability that a neighbor of node L decides to forward L's location claim, and the number of the witness nodes storing the location claim for a given identity in our approach as $d$, $p_f$, and $w$, respectively. In addition, let $g$ denote the number of destinations (i.e., witnesses) to which a neighbor forwards the location claim, if it decides to help, in the Randomized Multicast and Line-Selected Multicast algorithms and the RED protocol. The Random Multicast algorithm clearly has a huge communication and memory overhead, and thus, in the following we only compare our approaches with the Line-Selected Multicast algorithm and the RED protocol.

6.3.1 Comparison with the Line-Selected Multicast Algorithm

According to the analysis in Sections 5 and 6, we know that $r$ can be set to a small value, e.g., 3, while still ensuring a higher success rate of detecting replicas. To maintain a relatively high detection rate, the typical setting of $g \cdot p_f \cdot d$ in the Line-Selected Multicast algorithm is 6. Therefore, the $CO_{fw}$ of either SDC or P-MPC is smaller than the corresponding communication cost of the Line-Selected Multicast algorithm. However, our approach has the extra overhead of flooding the location claim within one or more cells, i.e., $CO_{fl}$.

Note that for both SDC and P-MPC, the lower bound of the cell size is determined by the security requirements. Once the cell size and the flooding algorithm within the cell are chosen, $CO_{fl}$ is fixed and independent of the network size. According to Table 7, we know that the extra overheads of the Random Multicast and the Line-Selected Multicast algorithms over $CO_{fw}$ of our approach can be described as $(\sqrt{n} - r) \cdot S$ and $(g \cdot p_f \cdot d - r) \cdot S$, respectively, where $S$ denotes the average communication cost of forwarding a packet between a randomly chosen pair of nodes in the network. $S$ is tightly related to the network size (i.e., the complexity of $S$ is $O(\sqrt{n})$) and the network
10.
topology (i.e., for the same network size, $S$ under an irregular topology is higher than that under a regular uniform topology). Consequently, our schemes are more scalable and less sensitive to irregular topologies, when compared to the two algorithms proposed in [14].

The analysis in Sections 5.1 and 6.1 shows that it is sufficient to choose a small value for $w$ to resist node compromise, and thus, our approaches provide far better memory efficiency, compared to the Randomized Multicast and Line-Selected Multicast algorithms, especially when the network size is very large.

6.3.2 Comparison with the RED Protocol

In Table 7, the complexity of the communication overhead of the RED protocol is the same as that of the Line-Selected Multicast algorithm. However, theoretically, it is fine to set $g$ and $p_f$ in such a way that $g \cdot p_f \cdot d$ is smaller than the typical setting in the Line-Selected Multicast algorithm, since the RED protocol can detect the replicas as long as at least one neighbor forwards the location claim and there is no communication loss. In this sense, the communication overhead of RED is the same as $CO_{fw}$ in SDC, if $p_f \cdot d = r$ and $g = 1$, but without the extra overhead of flooding within one or a few cells. Nevertheless, in practice, due to communication loss and routing errors, we should set $g \cdot p_f \cdot d$ to a higher value to ensure a certain level of detection rate. For example, due to this reason, in our simulation when $p_f = 3/d$ the actual detection rate of SDC is slightly lower than that of P-MPC. Consequently, since $CO_{fl}$ is fixed and independent of the network size, the communication overhead of SDC and P-MPC will be only slightly higher than that of RED, in particular when the network size is large.

The memory overheads of SDC and RED are $w$ and $g \cdot p_f \cdot d$, respectively. Both of them are small numbers, e.g., 2 to 5, and thus, the memory overheads of these two algorithms are comparable. The memory overhead of P-MPC is slightly higher than that of SDC or RED.

7 EVALUATION

We evaluated the performance and security of our schemes and those proposed by Parno et al. via extensive simulations. To enable a fair comparison, we used the same simulation methodology and simulation code that was used in Parno et al.'s study [14]. In addition, we also investigated the security and efficiency of our approach under different settings, such as different probabilities of forwarding location claims.

7.1 Metrics

We used the following metrics to compare the schemes:

• Communication overhead: We measured the total number of packets sent and received per node for running the replica detection algorithm when $n$ nodes are added to the network. We denote this metric as $n_f$.
• Success rate in detecting replicas: We measured the probability of detecting a replica, when there are two sensors with the same identity in the network, i.e., $p_{2r}$.

7.2 System and Network Models

As in Parno et al.'s study, we considered both uniform and irregular network topologies. In the uniform topology, nodes are randomly distributed within a $500 \times 500$ square. The network size ($n$) varies between 1,000 and 10,000. We assume a bidirectional communication model, and adjust the transmission range so that the average number of neighbors of a sensor ($d$) is 40. We also considered six irregular topologies (as shown in Fig. 5), i.e., "Thin H," "Thin Cross," "Thin 2," "Large Cross," "L," and "Large H" with the same density, i.e., $d = 40$. As in Parno et al.'s study [14], these topologies are generated as the subregions of the regular topology ($n = 10{,}000$).

The Localized Multicast approach assumes that the network is divided into cells, and that a location claim is flooded within the destination cell(s). There has been extensive work in optimal/efficient flooding [10], [9], [8], [16], and our approaches can be easily integrated with any efficient flooding algorithm. In all our simulations, we used the following simple algorithm, unless specified otherwise. Let $N_{cell} = k^2$ denote the number of cells in the network. Let $l$ denote the length of the side of the network. The size of a cell is selected in such a way that one broadcast can cover most of the area of a cell. Thus, we have

$$k = \mathrm{round}\left(\frac{l}{\sqrt{2} \cdot R}\right) = \mathrm{round}\left(\frac{l}{\sqrt{2} \cdot \sqrt{\frac{d \cdot l^2}{\pi \cdot n}}}\right) = \mathrm{round}\left(\sqrt{\frac{\pi n}{2d}}\right),$$

where $R$ is the communication range of a node and $\mathrm{round}()$ is a function that rounds the input to the nearest integer. For nodes not covered by the broadcast, further unicasts are required to deliver the location claim.

For the Random Multicast and Line-Selected Multicast algorithms, we use the same settings as in [14], except for $p_f$ in the Line-Selected Multicast algorithm. More specifically, for the former, we set the number of sensors storing a given location claim to $\sqrt{n}$, i.e., $w = \sqrt{n}$; for the latter, we set the number of lines as 6 (i.e., $f = 6$) in the comparison. As to $p_f$ in the Line-Selected Multicast algorithm, it is set to $1/d$ in [14], and each forwarding node randomly picks $f$ destinations. In our simulation, we set $p_f = f/d$, and each forwarding node randomly picks only one destination. Given the same density, our setting on $f$ results in a lower probability that there is no neighbor forwarding the location claim. As a result, compared to [14], in our simulation the Line-Selected Multicast algorithm has a higher success rate of detecting node replication, as shown in Section 7.3.2.

For both SDC and P-MPC, we set $p_f = 3/d$ and $p_s = 0.2$. Besides that, for P-MPC, we use Setting I in Table 5 as the setting of $p_{c_i}$ in the simulation. Namely, $v = 3$, and $p_{c_1}$, $p_{c_2}$, and $p_{c_3}$ are 80 percent, 15 percent, and 5 percent, respectively.

7.3 Comparisons with Parno et al.'s Work

7.3.1 Communication Overhead

The figures below show the 95 percent confidence intervals of the reported metric. In Fig. 6, we compare the communication costs of our two schemes with the two algorithms proposed in [14] for uniform topologies. As shown in Fig. 6, the Random Multicast algorithm has the highest communication costs under all the settings. Among the remaining schemes, SDC has the lowest communication overhead,
11.
though the differences between SDC, P-MPC, and Line-Selected Multicast are relatively small. As the network size increases, P-MPC and SDC have lower overhead than Line-Selected Multicast. Fig. 6 shows that SDC and P-MPC have lower communication overheads than Line-Selected Multicast when $n \ge 2{,}000$ and $n \ge 4{,}000$, respectively.

In Fig. 7, we compare the communication costs of our two schemes with the two algorithms proposed in [14] for irregular topologies. In comparison to Line-Selected Multicast, both SDC and P-MPC show much stronger adaptability for irregular network topologies. Under all the irregular topologies, the $n_f$ values of our two schemes are smaller than that of the Line-Selected Multicast algorithm. In particular, under the "Thin H", "Thin Cross", "Thin 2", and "Large H" topologies, the advantage of our two schemes over the Line-Selected Multicast algorithm is much higher than that under the regular topology ($n = 10{,}000$). More specifically, under these four topologies, SDC's and P-MPC's advantage over the Line-Selected Multicast algorithm (in terms of the communication cost) is 149 percent to 181 percent and 238 percent to 296 percent, respectively, higher than that under the regular topology ($n = 10{,}000$).

Fig. 5. Irregular topologies: (a) Thin H, (b) Thin Cross, (c) Thin 2, (d) Large Cross, (e) L, and (f) Large H.

Fig. 6. Communication overhead of SDC, P-MPC, Random Multicast, and Line-Selected Multicast for uniform topologies.

Fig. 7. Communication overhead of SDC, P-MPC, and Line-Selected Multicast for irregular network topologies.

7.3.2 Replica Detection Success Rate

Due to the high cost of the Random Multicast algorithm, we only consider SDC, P-MPC, and the Line-Selected Multicast algorithm while comparing the success rates of detecting node replication.

Fig. 8 shows that, compared to the Line-Selected Multicast algorithm, both of our algorithms have much higher success rates of detecting node replication. More
12.
specifically, on average, the success rates of SDC and P-MPC in detecting node replication are 25.64 percent and 21.77 percent higher than that of the Line-Selected Multicast algorithm, respectively.

Fig. 8. Success rate of detecting replicas in SDC, P-MPC, and Line-Selected Multicast.

We notice, however, that the success rates of SDC range from 89.4 percent to 94.5 percent, which are lower than the expected value (i.e., 100 percent) according to the theoretical analysis in Section 5.1. This is due to two reasons. The main reason is that each neighbor decides whether to forward the location claim independently, and thus, there exists a probability that no neighbor forwards the location claim. As a result, SDC fails to detect a node replication attack if, for either of the two replicas, no neighbor forwards its location claim. In addition, when $p_s$ is too small, there exists a probability that no node within cell $C$ stores the location claim, which may also result in SDC's failure in detecting node replication. In the simulation, we set $p_s = 0.2$, and thus, the second reason only has a negligible effect.

For the same reasons, the simulation results about P-MPC's success rates of detecting node replication are lower than the expected value according to the theoretical analysis in Section 6.1.

In the Line-Selected Multicast algorithm, the probability of detecting replicas can be improved by increasing the number of lines involved in forwarding a location claim, i.e., $f$. Fig. 9 shows that a larger value of $f$ leads to a higher $p_{2r}$. For example, when $f$ increases from 6 to 8, the probability of detecting replicas increases by on average 13.62 percent. However, as a trade-off, the communication cost increases by on average 28.22 percent at the same time, as shown in Fig. 10.

Fig. 9. Success rate of detecting replicas in Line-Selected Multicast with different numbers of lines.

Fig. 10. Communication overhead of Line-Selected Multicast with different numbers of lines.

7.4 Evaluation of Our Approach under Different Settings

In the following, we present the results of simulations that aim at evaluating both security and efficiency of our approach under different settings. According to the collected results, when a parameter is changed, SDC and P-MPC share the same trend, i.e., they increase, decrease, or remain unchanged together. Due to the limit of space, we only present the results related to SDC.

7.4.1 Different Settings on the Probability of Forwarding Location Claims

Similar to the impact of $f$ on the Line-Selected Multicast algorithm, as shown in Figs. 11 and 12, in SDC a higher probability of forwarding location claims (i.e., $p_f$) can improve the probability of detecting replicas, while at the same time the communication overhead is raised.

Fig. 11. Success rate of detecting replicas in SDC with different settings on $p_f$.

7.4.2 Different Settings on the Probability of Storing Location Claims

Intuitively, when the probability of storing location claims (i.e., $p_s$) increases, there are fewer chances that no sensor in a
13.
cell stores a location claim forwarded to this cell. As a result, as shown in Fig. 13, the detection rate is improved at the cost of a higher memory overhead. As to the communication overhead, it is unchanged, since the same flooding overhead applies regardless of the value of $p_s$, as long as the location claim arrives at the mapped cell.

Fig. 12. Communication overhead of SDC with different settings on $p_f$.

Fig. 13. Success rate of detecting replicas in SDC with different settings on $p_s$.

7.4.3 Different Settings on the Cell Size

To evaluate the influence of the cell size on our approach, we tested three settings on the cell size, i.e., $s_1$, $2s_1$, and $4s_1$, where $s_1$ is the default cell size when the network is partitioned into cells through the method described in Section 7.2.

Given the same probability of storing location claims, the larger the cell size, the larger the average number of witnesses per location claim, and thus, the lower the probability that there is no witness for a location claim flooded within the cell, which results in a higher detection rate. This observation is confirmed by our results. Fig. 14 shows that $p_{2r}$ increases as the cell size is raised.

In terms of communication overhead, as shown in Fig. 15, although the cell size is increased by 100 percent and 300 percent, respectively, the overall communication cost per node increases by only 12.2 percent and 24.6 percent, respectively. This is mainly due to the fact that the flooding cost is a relatively small portion of the overall communication cost. Moreover, when the cell size increases, the average number of hops between a neighbor forwarding the location claim and the mapped cell will decrease. In other words, the forwarding cost is reduced.

Fig. 14. Success rate of detecting replicas in SDC with different cell sizes.

Fig. 15. Communication overhead of SDC with different cell sizes.

8 CONCLUSION AND FUTURE WORK

In this paper, we proposed two variants of the Localized Multicast approach for distributed detection of node replication attacks in wireless sensor networks. Unlike the two randomized algorithms proposed by Parno et al. [14], our approach combines deterministic mapping (to reduce communication and storage costs) with randomization (to increase the level of resilience to node compromise). Our theoretical analysis and empirical results show that, compared to Parno et al.'s algorithms, our schemes are more efficient in large-scale sensor networks, in terms of communication and memory costs. Moreover, the probability of replica detection in our approach is higher than that achieved in these two algorithms.

Our preliminary analysis also shows that our approaches are more robust than RED against selective node compromise, and the communication and memory overheads of our approaches are similar to or slightly higher than those of RED. One item of future work is to simulate the RED protocol and then carry out a more detailed comparison of efficiency based on empirical results.

ACKNOWLEDGMENTS

A preliminary version of this article appeared in the 2007 Proceedings of the Annual Computer Security Applications Conference. This work is based on previous work at Center for Secure Information Systems, George Mason University.