
2013 2014 ieee projects titles with abstract



Typical Soft Technologies is a software company located in Chennai that offers quality projects and training to its customers.

We also deliver projects to students and companies, and our computer courses are aimed at students. To ask about any of our projects or courses, contact us.
044-43555140, 093443 99926.

Published in: Education, Technology, Business


  1. 1. TYPICAL SOFT TECHNOLOGIES MOBILE COMPUTING 1. ALERT: An Anonymous Location-Based Efficient Routing Protocol in MANETs Abstract : Mobile Ad Hoc Networks (MANETs) use anonymous routing protocols that hide node identities and/or routes from outside observers in order to provide anonymity protection. However, existing anonymous routing protocols relying on either hop-by-hop encryption or redundant traffic, either generate high cost or cannot provide full anonymity protection to data sources, destinations, and routes. The high cost exacerbates the inherent resource constraint problem in MANETs especially in multimedia wireless applications. To offer high anonymity protection at a low cost, we propose an Anonymous Location-based Efficient Routing proTocol (ALERT). ALERT dynamically partitions the network field into zones and randomly chooses nodes in zones as intermediate relay nodes, which form a nontraceable anonymous route. In addition, it hides the data initiator/receiver among many initiators/receivers to strengthen source and destination anonymity protection. Thus, ALERT offers anonymity protection to sources, destinations, and routes. It also has strategies to effectively counter intersection and timing attacks. We theoretically analyze ALERT in terms of anonymity and efficiency. Experimental results exhibit consistency with the theoretical analysis, and show that ALERT achieves better route anonymity protection and lower cost compared to other anonymous routing protocols. Also, ALERT achieves comparable routing efficiency to the GPSR geographical routing protocol. Contact : 044-43555140 9344399918/26 Page 1
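The zone partitioning idea in the ALERT abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the rectangular field, the alternating vertical/horizontal bisection, and the names `split`, `contains`, and `temporary_destination` are assumptions made for illustration. The sketch bisects the field until the current node and the destination fall into different zones, then draws a random point in the destination's zone toward which a relay node could be chosen.

```python
import random

def split(zone, vertical):
    """Bisect a rectangle (x0, y0, x1, y1) with a vertical or horizontal cut."""
    x0, y0, x1, y1 = zone
    if vertical:
        mid = (x0 + x1) / 2
        return (x0, y0, mid, y1), (mid, y0, x1, y1)
    mid = (y0 + y1) / 2
    return (x0, y0, x1, mid), (x0, mid, x1, y1)

def contains(zone, point):
    """Half-open containment test, so a point belongs to exactly one half."""
    x0, y0, x1, y1 = zone
    return x0 <= point[0] < x1 and y0 <= point[1] < y1

def temporary_destination(field, current, dst, rng, max_splits=32):
    """Return (random point, zone) in the destination's zone, or None.

    Hypothetical helper: keeps bisecting the zone that holds dst until
    the current node is no longer inside that zone.
    """
    zone, vertical = field, True
    for _ in range(max_splits):
        first, second = split(zone, vertical)
        dst_half = first if contains(first, dst) else second
        if not contains(dst_half, current):
            x0, y0, x1, y1 = dst_half
            return (rng.uniform(x0, x1), rng.uniform(y0, y1)), dst_half
        zone, vertical = dst_half, not vertical
    return None  # current and dst could not be separated
```

A relay would then repeat the same step from its own position, so the zone around the destination shrinks hop by hop. How the real protocol picks the relay closest to the random point, and how it hides the receiver among other nodes in the final zone, is not covered here.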
  2. 2. TYPICAL SOFT TECHNOLOGIES 2. DSS: Distributed SINR-Based Scheduling Algorithm for Multihop Wireless Networks Abstract : The problem of developing distributed scheduling algorithms for high throughput in multihop wireless networks has been extensively studied in recent years. The design of a distributed low-complexity scheduling algorithm becomes even more challenging when taking into account a physical interference model, which requires the SINR at a receiver to be checked when making scheduling decisions. To do so, we need to check whether a transmission failure is caused by interference due to simultaneous transmissions from distant nodes. In this paper, we propose a scheduling algorithm under a physical interference model, which is amenable to distributed implementation with 802.11 CSMA technologies. The proposed scheduling algorithm is shown to achieve throughput optimality. We present two variations of the algorithm to enhance the delay performance and to reduce the control overhead, respectively, while retaining throughput optimality. 3. Toward Accurate Mobile Sensor Network Localization in Noisy Environments Abstract : The node localization problem in mobile sensor networks has received significant attention. Recently, particle filters adapted from robotics have produced good localization accuracies in conventional settings. In spite of these successes, state-of-the-art solutions suffer significantly when used in challenging indoor and mobile environments characterized by a high degree of radio signal irregularity. New solutions are needed to address these challenges. We propose a fuzzy logic-based approach for mobile node localization in challenging environments. Localization is formulated as a fuzzy multilateration problem. For sparse networks with few available anchors, we propose a fuzzy grid-prediction scheme. The fuzzy logic-based localization scheme is implemented in a simulator and compared to state-of-the-art solutions.
  3. 3. TYPICAL SOFT TECHNOLOGIES Extensive simulation results demonstrate improvements in the localization accuracy from 20 to 40 percent when the radio irregularity is high. A hardware implementation running on Epic motes and transported by iRobot mobile hosts confirms simulation results and extends them to the real world. 4. Adaptive Duty Cycle Control with Queue Management in Wireless Sensor Networks Abstract : This paper proposes a control-based approach to duty cycle adaptation for wireless sensor networks. The proposed method controls the duty cycle through queue management to achieve high performance under variable traffic rates. To achieve energy efficiency while minimizing the delay, we design a feedback controller, which adapts the sleep time to traffic changes dynamically by constraining the queue length at a predetermined value. In addition, we propose an efficient synchronization scheme using an active pattern, which represents the active time slot schedule for synchronization among sensor nodes, without affecting neighboring schedules. Based on control theory, we analyze the adaptation behavior of the proposed controller and demonstrate system stability. The simulation results show that the proposed method outperforms existing schemes by achieving more power savings while minimizing the delay. 5. Cooperative Packet Delivery in Hybrid Wireless Mobile Networks: A Coalitional Game Approach Abstract : We consider the problem of cooperative packet delivery to mobile nodes in a hybrid wireless mobile network, where both infrastructure-based and infrastructure-less (i.e., ad hoc mode or peer-to-peer mode) communications are used. We propose a solution based on a coalition
  4. 4. TYPICAL SOFT TECHNOLOGIES formation among mobile nodes to cooperatively deliver packets among these mobile nodes in the same coalition. A coalitional game is developed to analyze the behavior of the rational mobile nodes for cooperative packet delivery. A group of mobile nodes makes a decision to join or to leave a coalition based on their individual payoffs. The individual payoff of each mobile node is a function of the average delivery delay for packets transmitted to the mobile node from a base station and the cost incurred by this mobile node for relaying packets to other mobile nodes. To find the payoff of each mobile node, a Markov chain model is formulated and the expected cost and packet delivery delay are obtained when the mobile node is in a coalition. Since both the expected cost and packet delivery delay depend on the probability that each mobile node will help other mobile nodes in the same coalition to forward packets to the destination mobile node in the same coalition, a bargaining game is used to find the optimal helping probabilities. After the payoff of each mobile node is obtained, we find the solutions of the coalitional game which are the stable coalitions. A distributed algorithm is presented to obtain the stable coalitions and a Markov-chain-based analysis is used to evaluate the stable coalitional structures obtained from the distributed algorithm. Performance evaluation results show that when the stable coalitions are formed, the mobile nodes achieve a nonzero payoff (i.e., utility is higher than the cost). With a coalition formation, the mobile nodes achieve higher payoff than that when each mobile node acts alone. 6. VAPR: Void-Aware Pressure Routing for Underwater Sensor Networks Abstract : Underwater mobile sensor networks have recently been proposed as a way to explore and observe the ocean, providing 4D (space and time) monitoring of underwater environments. 
We consider a specialized geographic routing problem called pressure routing that directs a packet to any sonobuoy on the surface based on depth information available from on-board pressure gauges. The main challenge of pressure routing in sparse underwater networks has been the efficient handling of 3D voids. In this respect, it was recently proven that the greedy stateless perimeter routing method, very popular in 2D networks, cannot be extended to void recovery in 3D networks. Available heuristics for 3D void recovery require expensive flooding. In this paper,
  5. 5. TYPICAL SOFT TECHNOLOGIES we propose a Void-Aware Pressure Routing (VAPR) protocol that uses sequence number, hop count and depth information embedded in periodic beacons to set up next-hop direction and to build a directional trail to the closest sonobuoy. Using this trail, opportunistic directional forwarding can be efficiently performed even in the presence of voids. The contribution of this paper is twofold: 1) a robust soft-state routing protocol that supports opportunistic directional forwarding; and 2) a new framework to attain loop freedom in static and mobile underwater networks to guarantee packet delivery. Extensive simulation results show that VAPR outperforms existing solutions. 7. DCIM: Distributed Cache Invalidation Method for Maintaining Cache Consistency in Wireless Mobile Networks Abstract : This paper proposes distributed cache invalidation mechanism (DCIM), a client-based cache consistency scheme that is implemented on top of a previously proposed architecture for caching data items in mobile ad hoc networks (MANETs), namely COACS, where special nodes cache the queries and the addresses of the nodes that store the responses to these queries. We have also previously proposed a server-based consistency scheme, named SSUM, whereas in this paper, we introduce DCIM that is totally client-based. DCIM is a pull-based algorithm that implements adaptive time to live (TTL), piggybacking, and prefetching, and provides near strong consistency capabilities. Cached data items are assigned adaptive TTL values that correspond to their update rates at the data source, where items with expired TTL values are grouped in validation requests to the data source to refresh them, whereas unexpired ones but with high request rates are prefetched from the server. In this paper, DCIM is analyzed to assess the delay and bandwidth gains (or costs) when compared to polling every time and push-based schemes.
DCIM was also implemented using ns2, and compared against client-based and server-based schemes to assess its performance experimentally. The consistency ratio, delay, and overhead traffic are reported versus several variables, and DCIM proved superior to the other systems.
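The adaptive TTL, validation, and prefetching steps described in the DCIM abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the published algorithm: the exponential smoothing in `adapt_ttl`, the `hot_rate` threshold, and the `CachedItem` fields are all hypothetical choices. The sketch groups expired items into one validation batch and marks unexpired but frequently requested items for prefetching, as the abstract states.

```python
from dataclasses import dataclass

@dataclass
class CachedItem:
    key: str
    ttl: float           # seconds the cached copy is assumed valid
    fetched_at: float    # time the copy was last fetched or validated
    request_rate: float  # recent requests per second (hypothetical metric)

def adapt_ttl(old_ttl, observed_update_interval, alpha=0.5):
    """Blend the previous TTL with the newest observed inter-update interval.

    Assumption: exponential smoothing stands in for the paper's TTL estimator.
    """
    return (1 - alpha) * old_ttl + alpha * observed_update_interval

def plan_refresh(items, now, hot_rate=1.0):
    """Split cached items into a validation batch and a prefetch batch."""
    to_validate = [i.key for i in items if now - i.fetched_at >= i.ttl]
    to_prefetch = [
        i.key for i in items
        if now - i.fetched_at < i.ttl and i.request_rate >= hot_rate
    ]
    return to_validate, to_prefetch
```

Batching the expired keys into a single request is what keeps a pull-based scheme from polling the data source once per item.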
  6. 6. TYPICAL SOFT TECHNOLOGIES 8. Cross-Layer Minimum-Delay Scheduling and Maximum-Throughput Resource Allocation for Multiuser Cognitive Networks Abstract : A cognitive network is considered that consists of a base station (BS) communicating with multiple primary and secondary users. Each secondary user can access only one of the orthogonal primary channels. A model is considered in which the primary users can tolerate a certain average delay. A special case is also considered in which the primary users do not suffer from any delay. A novel cross-layer scheme is proposed in which the BS performs successive interference cancellation and thus a secondary user can coexist with an active primary user without adversely affecting its transmission. A scheduling algorithm is proposed that minimizes the average packet delay of the secondary user under constraints on the average power transmitted by the secondary user and the average packet delay of the primary user. A resource allocation algorithm is also proposed to assign the secondary users’ channels such that the total throughput of the network is maximized. Our results indicate that the network throughput increases significantly by increasing the number of transmitted packets of the secondary users and/or by allowing a small delay for the primary user packets. 9. Scheduling Partition for Order Optimal Capacity in Large-Scale Wireless Networks Abstract : The capacity scaling property specifies the change of network throughput when network size increases. It serves as an essential performance metric in large-scale wireless networks. Existing results have been obtained based on the assumption of using a globally planned link transmission schedule in the network, which is however not feasible in large wireless networks due to the scheduling complexity. The gap between the well-known capacity results and the infeasible
  7. 7. TYPICAL SOFT TECHNOLOGIES assumption on link scheduling potentially undermines our understanding of the achievable network capacity. In this paper, we propose the scheduling partition methodology that decomposes a large network into small autonomous scheduling zones and implements a localized scheduling algorithm independently in each partition. We prove the sufficient and the necessary conditions for the scheduling partition approach to achieve the same order of capacity as the widely assumed global scheduling strategy. In comparison to the network dimension $\sqrt{n}$, scheduling partition size $\Theta(r(n))$ is sufficient to obtain the optimal capacity scaling, where $r(n)$ is the node transmission radius and much smaller than $\sqrt{n}$. We finally propose a distributed partition protocol and a localized scheduling algorithm as our scheduling solution for maximum capacity in large wireless networks. 10. Video On-Demand Streaming in Cognitive Wireless Mesh Networks Abstract : Cognitive radio (CR), which enables dynamic access of underutilized licensed spectrums, is a promising technology for more efficient spectrum utilization. Since cognitive radio enables the access of larger amount of spectrum, it can be used to build wireless mesh networks with higher network capacity, and thus provide better quality of services for high bit-rate applications. In this paper, we study the multisource video on-demand application in multi-interface cognitive wireless mesh networks. Given a video request, we find a joint multipath routing and spectrum allocation for the session to minimize its total bandwidth cost in the network, and therefore maximize the number of sessions the network can support. We propose both distributed and centralized routing and channel allocation algorithms to solve the problem.
Simulation results show that our algorithms increase the maximum number of concurrent sessions that can be supported in the network, and also improve each session’s performance with regard to spectrum mobility.
  8. 8. TYPICAL SOFT TECHNOLOGIES 11. Relay Selection for Geographical Forwarding in Sleep-Wake Cycling Wireless Sensor Networks Abstract : Our work is motivated by geographical forwarding of sporadic alarm packets to a base station in a wireless sensor network (WSN), where the nodes are sleep-wake cycling periodically and asynchronously. We seek to develop local forwarding algorithms that can be tuned so as to trade off the end-to-end delay against a total cost, such as the hop count or total energy. Our approach is to solve, at each forwarding node en route to the sink, the local forwarding problem of minimizing one-hop waiting delay subject to a lower bound constraint on a suitable reward offered by the next-hop relay; the constraint serves to tune the tradeoff. The reward metric used for the local problem is based on the end-to-end total cost objective (for instance, when the total cost is hop count, we choose to use the progress toward sink made by a relay as the reward). The forwarding node, to begin with, is uncertain about the number of relays, their wake-up times, and the reward values, but knows the probability distributions of these quantities. At each relay wake-up instant, when a relay reveals its reward value, the forwarding node’s problem is to forward the packet or to wait for further relays to wake up. In terms of the operations research literature, our work can be considered as a variant of the asset selling problem. We formulate our local forwarding problem as a partially observable Markov decision process (POMDP) and obtain inner and outer bounds for the optimal policy. Motivated by the computational complexity involved in the policies derived out of these bounds, we formulate an alternate simplified model, the optimal policy for which is a simple threshold rule.
We provide simulation results to compare the performance of the inner and outer bound policies against the simple policy, and also against the optimal policy when the source knows the exact number of relays. Observing the good performance and the ease of implementation of the simple policy, we apply it to our motivating problem, i.e., local geographical routing of sporadic alarm packets in a large WSN. We compare the end-to-end performance (i.e., average total delay and average total cost) obtained by the simple policy, when used for local geographical forwarding, against that obtained by the globally optimal forwarding algorithm proposed by Kim.
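A threshold rule of the kind mentioned in the relay selection abstract can be sketched in a few lines of Python. This is an illustration only: the abstract does not say how the threshold is computed or what happens when no relay reaches it, so the fixed `threshold` argument and the fallback to the best relay seen are assumptions, and `choose_relay` is a hypothetical name.

```python
def choose_relay(wakeups, threshold):
    """Pick a relay with a simple threshold rule.

    wakeups: list of (wake_time, reward) pairs sorted by wake_time.
    Returns (relay_index, waiting_delay), or None if no relay wakes up.
    """
    best_index = None
    for index, (wake_time, reward) in enumerate(wakeups):
        if best_index is None or reward > wakeups[best_index][1]:
            best_index = index
        if reward >= threshold:
            # Forward immediately to the first relay that is good enough.
            return index, wake_time
    if best_index is None:
        return None
    # Assumed fallback: after the last wake-up, use the best relay seen.
    return best_index, wakeups[-1][0]
```

Raising the threshold trades a longer one-hop wait for a better reward, which is the delay-versus-cost tuning the abstract describes.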
  9. 9. TYPICAL SOFT TECHNOLOGIES 12. Adaptive Position Update for Geographic Routing in Mobile Ad Hoc Networks Abstract : In geographic routing, nodes need to maintain up-to-date positions of their immediate neighbors for making effective forwarding decisions. Periodic broadcasting of beacon packets that contain the geographic location coordinates of the nodes is a popular method used by most geographic routing protocols to maintain neighbor positions. We contend and demonstrate that periodic beaconing regardless of the node mobility and traffic patterns in the network is not attractive from both update cost and routing performance points of view. We propose the Adaptive Position Update (APU) strategy for geographic routing, which dynamically adjusts the frequency of position updates based on the mobility dynamics of the nodes and the forwarding patterns in the network. APU is based on two simple principles: 1) nodes whose movements are harder to predict update their positions more frequently (and vice versa), and 2) nodes closer to forwarding paths update their positions more frequently (and vice versa). Our theoretical analysis, which is validated by NS2 simulations of a well-known geographic routing protocol, Greedy Perimeter Stateless Routing Protocol (GPSR), shows that APU can significantly reduce the update cost and improve the routing performance in terms of packet delivery ratio and average end-to-end delay in comparison with periodic beaconing and other recently proposed updating schemes. The benefits of APU are further confirmed by undertaking evaluations in realistic network scenarios, which account for localization error, realistic radio propagation, and sparse network. 13. Channel Allocation and Routing in Hybrid Multichannel Multiradio Wireless Mesh Networks Abstract :
  10. 10. TYPICAL SOFT TECHNOLOGIES Many efforts have been devoted to maximizing network throughput in a multichannel multiradio wireless mesh network. Most current solutions are based on either purely static or purely dynamic channel allocation approaches. In this paper, we propose a hybrid multichannel multiradio wireless mesh networking architecture, where each mesh node has both static and dynamic interfaces. We first present an Adaptive Dynamic Channel Allocation protocol (ADCA), which considers optimization for both throughput and delay in the channel assignment. In addition, we also propose an Interference and Congestion Aware Routing protocol (ICAR) in the hybrid network with both static and dynamic links, which balances the channel usage in the network. Our simulation results show that compared to previous works, ADCA reduces the packet delay considerably without degrading the network throughput. The hybrid architecture shows much better adaptivity to changing traffic than purely static architecture without dramatic increase in overhead, and achieves lower delay than existing approaches for hybrid networks. 14. Toward Privacy Preserving and Collusion Resistance in a Location Proof Updating System Abstract : Today’s location-sensitive services rely on the user’s mobile device to determine the current location. This allows malicious users to access a restricted resource or provide bogus alibis by cheating on their locations. To address this issue, we propose A Privacy-Preserving LocAtion proof Updating System (APPLAUS) in which colocated Bluetooth-enabled mobile devices mutually generate location proofs and send updates to a location proof server. Periodically changed pseudonyms are used by the mobile devices to protect source location privacy from each other, and from the untrusted location proof server.
We also develop a user-centric location privacy model in which individual users evaluate their location privacy levels and decide whether and when to accept the location proof requests. In order to defend against colluding attacks, we also present betweenness ranking-based and correlation clustering-based approaches for outlier detection. APPLAUS can be implemented with existing network infrastructure, and can be easily deployed in Bluetooth-enabled mobile devices with little computation or power
  11. 11. TYPICAL SOFT TECHNOLOGIES cost. Extensive experimental results show that APPLAUS can effectively provide location proofs, significantly preserve the source location privacy, and effectively detect colluding attacks. 15. SSD: A Robust RF Location Fingerprint Addressing Mobile Devices’ Heterogeneity Abstract : Fingerprint-based methods are widely adopted for indoor localization because of their cost-effectiveness compared to other infrastructure-based positioning systems. However, the popular location fingerprint, Received Signal Strength (RSS), is observed to differ significantly across different devices’ hardware even under the same wireless conditions. We derive analytically a robust location fingerprint definition, the Signal Strength Difference (SSD), and verify its performance experimentally using a number of different mobile devices with heterogeneous hardware. Our experiments have also considered both Wi-Fi and Bluetooth devices, as well as both Access-Point (AP)-based localization and Mobile-Node (MN)-assisted localization. We present the results of two well-known localization algorithms (K Nearest Neighbor and Bayesian Inference) when our proposed fingerprint is used, and demonstrate its robustness when the testing device differs from the training device. We also compare these SSD-based localization algorithms’ performance against that of two other approaches in the literature that are designed to mitigate the effects of mobile node hardware variations, and show that SSD-based algorithms have better accuracy. 16. EMAP: Expedite Message Authentication Protocol for Vehicular Ad Hoc Networks Abstract :
  12. 12. TYPICAL SOFT TECHNOLOGIES Vehicular ad hoc networks (VANETs) adopt the Public Key Infrastructure (PKI) and Certificate Revocation Lists (CRLs) for their security. In any PKI system, the authentication of a received message is performed by checking if the certificate of the sender is included in the current CRL, and verifying the authenticity of the certificate and signature of the sender. In this paper, we propose an Expedite Message Authentication Protocol (EMAP) for VANETs, which replaces the time-consuming CRL checking process by an efficient revocation checking process. The revocation check process in EMAP uses a keyed Hash Message Authentication Code (HMAC), where the key used in calculating the HMAC is shared only between nonrevoked On-Board Units (OBUs). In addition, EMAP uses a novel probabilistic key distribution, which enables nonrevoked OBUs to securely share and update a secret key. EMAP can significantly decrease the message loss ratio due to the message verification delay compared with the conventional authentication methods employing CRL. By conducting security analysis and performance evaluation, EMAP is demonstrated to be secure and efficient. 17. Channel Assignment for Throughput Optimization in Multichannel Multiradio Wireless Mesh Networks Using Network Coding Abstract : Compared to single-hop networks such as WiFi, multihop infrastructure wireless mesh networks (WMNs) can potentially embrace the broadcast benefits of a wireless medium in a more flexible manner. Rather than being point-to-point, links in the WMNs may originate from a single node and reach more than one other node. Nodes located farther than a one-hop distance and overhearing such transmissions may opportunistically help relay packets for previous hops. This phenomenon is called opportunistic overhearing/listening. With multiple radios, a node can also improve its capacity by transmitting over multiple radios simultaneously using orthogonal channels.
Capitalizing on these potential advantages requires effective routing and efficient mapping of channels to radios (channel assignment (CA)). While efficient channel assignment can greatly reduce interference from nearby transmitters, effective routing can potentially relieve congestion on paths to the infrastructure. Routing, however, requires that only packets pertaining
  13. 13. TYPICAL SOFT TECHNOLOGIES to a particular connection be routed on a predetermined route. Random network coding (RNC) breaks this constraint by allowing nodes to randomly mix packets overheard so far before forwarding. A relay node thus only needs to know how many packets, and not which packets, it should send. We mathematically formulate the joint problem of random network coding, channel assignment, and broadcast link scheduling, taking into account opportunistic overhearing, the interference constraints, the coding constraints, the number of orthogonal channels, the number of radios per node, and fairness among unicast connections. Based on this formulation, we develop a suboptimal, auction-based solution for overall network throughput optimization. Performance evaluation results show that our algorithm can effectively exploit multiple radios and channels and can cope with fairness issues arising from auctions. Our algorithm also shows promising gains over traditional routing solutions in which various channel assignment strategies are used. 18. Content Sharing over Smartphone-Based Delay-Tolerant Networks Abstract : With the growing number of smartphone users, peer-to-peer ad hoc content sharing is expected to occur more often. Thus, new content sharing mechanisms should be developed as traditional data delivery schemes are not efficient for content sharing due to the sporadic connectivity between smartphones. To accomplish data delivery in such challenging environments, researchers have proposed the use of store-carry-forward protocols, in which a node stores a message and carries it until a forwarding opportunity arises through an encounter with other nodes. Most previous works in this field have focused on the prediction of whether two nodes would encounter each other, without considering the place and time of the encounter.
In this paper, we propose discover-predict-deliver as an efficient content sharing scheme for delay-tolerant smartphone networks. In our proposed scheme, contents are shared using the mobility information of individuals. Specifically, our approach employs a mobility learning algorithm to identify places indoors and outdoors. A hidden Markov model is used to predict an individual’s future mobility information. Evaluation based on real traces indicates that with the
  14. 14. TYPICAL SOFT TECHNOLOGIES proposed approach, 87 percent of contents can be correctly discovered and delivered within 2 hours when the content is available only in 30 percent of nodes in the network. We implement a sample application on commercial smartphones, and we validate its efficiency to analyze the practical feasibility of the content sharing application. Our system approximately results in a 2 percent CPU overhead and reduces the battery lifetime of a smartphone by 15 percent at most. 19. Discovery and Verification of Neighbor Positions in Mobile Ad Hoc Networks Abstract : A growing number of ad hoc networking protocols and location-aware services require that mobile nodes learn the position of their neighbors. However, such a process can be easily abused or disrupted by adversarial nodes. In absence of a priori trusted nodes, the discovery and verification of neighbor positions presents challenges that have been scarcely investigated in the literature. In this paper, we address this open issue by proposing a fully distributed cooperative solution that is robust against independent and colluding adversaries, and can be impaired only by an overwhelming presence of adversaries. Results show that our protocol can thwart more than 99 percent of the attacks under the best possible conditions for the adversaries, with minimal false positive rates. 20. Mobile Relay Configuration in Data-Intensive Wireless Sensor Networks Abstract : Wireless Sensor Networks (WSNs) are increasingly used in data-intensive applications such as microclimate monitoring, precision agriculture, and audio/video surveillance. A key challenge faced by data-intensive WSNs is to transmit all the data generated within an application’s lifetime to the base station despite the fact that sensor nodes have limited power supplies. We propose using low-cost disposable mobile relays to reduce the energy consumption of data-intensive
  15. 15. TYPICAL SOFT TECHNOLOGIES WSNs. Our approach differs from previous work in two main aspects. First, it does not require complex motion planning of mobile nodes, so it can be implemented on a number of low-cost mobile sensor platforms. Second, we integrate the energy consumption due to both mobility and wireless transmissions into a holistic optimization framework. Our framework consists of three main algorithms. The first algorithm computes an optimal routing tree assuming no nodes can move. The second algorithm improves the topology of the routing tree by greedily adding new nodes exploiting mobility of the newly added nodes. The third algorithm improves the routing tree by relocating its nodes without changing its topology. This iterative algorithm converges on the optimal position for each node given the constraint that the routing tree topology does not change. We present efficient distributed implementations for each algorithm that require only limited, localized synchronization. Because we do not necessarily compute an optimal topology, our final routing tree is not necessarily optimal. However, our simulation results show that our algorithms significantly outperform the best existing solutions. 21. Vampire Attacks: Draining Life from Wireless Ad Hoc Sensor Networks Abstract : Ad hoc low-power wireless networks are an exciting research direction in sensing and pervasive computing. Prior security work in this area has focused primarily on denial of communication at the routing or medium access control levels. This paper explores resource depletion attacks at the routing protocol layer, which permanently disable networks by quickly draining nodes’ battery power. These “Vampire” attacks are not specific to any particular protocol, but rather rely on the properties of many popular classes of routing protocols.
We find that all examined protocols are susceptible to Vampire attacks, which are devastating, difficult to detect, and easy to carry out using as few as one malicious insider sending only protocol-compliant messages. In the worst case, a single Vampire can increase network-wide energy usage by a factor of $O(N)$, where $N$ is the number of network nodes. We discuss methods to mitigate these types of attacks, including a new proof-of-concept protocol that provably bounds the damage caused by Vampires during the packet forwarding phase.
  16. 16. TYPICAL SOFT TECHNOLOGIES CLOUD COMPUTING 1. Optimal Multiserver Configuration for Profit Maximization in Cloud Computing Abstract : As cloud computing becomes more and more popular, understanding the economics of cloud computing becomes critically important. To maximize the profit, a service provider should understand both service charges and business costs, and how they are determined by the characteristics of the applications and the configuration of a multiserver system. The problem of optimal multiserver configuration for profit maximization in a cloud computing environment is studied. Our pricing model takes into consideration such factors as the amount of a service, the workload of an application environment, the configuration of a multiserver system, the service-level agreement, the satisfaction of a consumer, the quality of a service, the penalty of a low-quality service, the cost of renting, the cost of energy consumption, and a service provider’s margin and profit. Our approach is to treat a multiserver system as an M/M/m queuing model, such that our optimization problem can be formulated and solved analytically. Two server speed and power consumption models are considered, namely, the idle-speed model and the constant-speed model. The probability density function of the waiting time of a newly arrived service request is derived. The expected service charge to a service request is calculated. The expected net business gain in one unit of time is obtained. Numerical calculations of the optimal server size and the optimal server speed are demonstrated.
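As a rough illustration of the M/M/m queuing model named in the abstract above, the sketch below computes the standard Erlang C probability that an arriving request must wait, and the resulting mean waiting time. It is not the paper's pricing or profit model; the function names `erlang_c` and `mean_wait` and the example rates are assumptions made for illustration only.

```python
from math import factorial


def erlang_c(m: int, arrival_rate: float, service_rate: float) -> float:
    """Probability that an arriving request has to wait in an M/M/m queue.

    m            -- number of identical servers
    arrival_rate -- mean request arrival rate (lambda)
    service_rate -- mean service rate of one server (mu)
    """
    offered_load = arrival_rate / service_rate   # a = lambda / mu
    utilization = offered_load / m               # rho = a / m
    if utilization >= 1:
        raise ValueError("unstable queue: need arrival_rate < m * service_rate")
    # Term for the states in which all m servers are busy.
    tail = offered_load**m / factorial(m) / (1 - utilization)
    # Terms for the states in which at least one server is idle.
    head = sum(offered_load**k / factorial(k) for k in range(m))
    return tail / (head + tail)


def mean_wait(m: int, arrival_rate: float, service_rate: float) -> float:
    """Mean time a request spends waiting in the queue (excluding service)."""
    return erlang_c(m, arrival_rate, service_rate) / (m * service_rate - arrival_rate)


if __name__ == "__main__":
    # Hypothetical example: 4 servers, 3 requests per time unit, 1 request per server per time unit.
    print(erlang_c(4, 3.0, 1.0), mean_wait(4, 3.0, 1.0))
```

A provider-side optimization of the kind the abstract describes would evaluate quantities like these for different server counts and speeds; how they feed into service charges and costs is specific to the paper and is not reproduced here.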
  17. 17. TYPICAL SOFT TECHNOLOGIES 2. Efficient Resource Mapping Framework over Networked Clouds via Iterated Local Search-Based Request Partitioning Abstract : The cloud represents a computing paradigm where shared configurable resources are provided as a service over the Internet. Adding intra- or intercloud communication resources to the resource mix leads to a networked cloud computing environment. Following the cloud Infrastructure as a Service paradigm and in order to create a flexible management framework, it is of paramount importance to address efficiently the resource mapping problem within this context. To deal with the inherent complexity and scalability issue of the resource mapping problem across different administrative domains, in this paper a hierarchical framework is described. First, a novel request partitioning approach based on Iterated Local Search is introduced that facilitates the cost-efficient and online splitting of user requests among eligible cloud service providers (CPs) within a networked cloud environment. Following and capitalizing on the outcome of the request partitioning phase, the embedding phase, where the actual mapping of requested virtual to physical resources is performed, can be realized through the use of a distributed intracloud resource mapping approach that allows for efficient and balanced allocation of cloud resources. Finally, a thorough evaluation of the proposed overall framework on a simulated networked cloud environment is provided and critically compared against an exact request partitioning solution as well as another common intradomain virtual resource embedding solution.
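To make the idea of Iterated Local Search concrete, here is a generic sketch that assigns the parts of a request to providers by alternating local search with random perturbation. It is not the paper's partitioning algorithm: the function `iterated_local_search`, the caller-supplied `cost` function, and the example price table are all hypothetical.

```python
import random


def iterated_local_search(n_parts, n_providers, cost, iterations=200, seed=0):
    """Assign each of n_parts request parts to one of n_providers.

    cost -- caller-supplied function mapping an assignment (a list of
            provider indices, one per part) to a number to be minimized.
    """
    rng = random.Random(seed)

    def local_search(assignment):
        # Move single parts between providers until no move lowers the cost.
        assignment = list(assignment)
        improved = True
        while improved:
            improved = False
            for part in range(n_parts):
                for provider in range(n_providers):
                    if provider == assignment[part]:
                        continue
                    candidate = assignment[:part] + [provider] + assignment[part + 1:]
                    if cost(candidate) < cost(assignment):
                        assignment = candidate
                        improved = True
        return assignment

    # Start from a random assignment and improve it locally.
    best = local_search([rng.randrange(n_providers) for _ in range(n_parts)])
    for _ in range(iterations):
        # Perturb: reassign up to two randomly chosen parts, then search again.
        perturbed = list(best)
        for part in rng.sample(range(n_parts), k=min(2, n_parts)):
            perturbed[part] = rng.randrange(n_providers)
        candidate = local_search(perturbed)
        if cost(candidate) < cost(best):
            best = candidate
    return best


if __name__ == "__main__":
    # Hypothetical price of hosting part i at provider p: prices[i][p].
    prices = [[3.0, 1.0], [2.0, 5.0], [4.0, 4.5]]
    total = lambda a: sum(prices[i][p] for i, p in enumerate(a))
    print(iterated_local_search(3, 2, total))
```

A realistic cost function would also account for the communication links between parts placed at different providers, which is what makes the problem hard enough to need a metaheuristic in the first place.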
  18. 18. TYPICAL SOFT TECHNOLOGIES 3. Harnessing the Cloud for Securely Outsourcing Large-Scale Systems of Linear Equations Abstract : Cloud computing economically enables customers with limited computational resources to outsource large-scale computations to the cloud. However, how to protect customers’ confidential data involved in the computations then becomes a major security concern. In this paper, we present a secure outsourcing mechanism for solving large-scale systems of linear equations (LE) in the cloud. Because applying traditional approaches like Gaussian elimination or LU decomposition (a.k.a. direct methods) to such large-scale LEs would be prohibitively expensive, we build the secure LE outsourcing mechanism via a completely different approach, the iterative method, which is much easier to implement in practice and demands only relatively simple matrix-vector operations. Specifically, our mechanism enables a customer to securely harness the cloud for iteratively finding successive approximations to the LE solution, while keeping both the sensitive input and output of the computation private. For robust cheating detection, we further explore the algebraic property of matrix-vector operations and propose an efficient result verification mechanism, which allows the customer to verify all answers received from previous iterative approximations in one batch with high probability. Thorough security analysis and prototype experiments on Amazon EC2 demonstrate the validity and practicality of our proposed design. 4. QoS Ranking Prediction for Cloud Services Abstract : Cloud computing is becoming popular. Building high-quality cloud applications is a critical research problem. QoS rankings provide valuable information for making optimal cloud service selection from a set of functionally equivalent service candidates. To obtain QoS values, real-world invocations on the service candidates are usually required.
  19. 19. TYPICAL SOFT TECHNOLOGIES To avoid the time-consuming and expensive real-world service invocations, this paper proposes a QoS ranking prediction framework for cloud services by taking advantage of the past service usage experiences of other consumers. Our proposed framework requires no additional invocations of cloud services when making QoS ranking predictions. Two personalized QoS ranking prediction approaches are proposed to predict the QoS rankings directly. Comprehensive experiments are conducted employing real-world QoS data, including 300 distributed users and 500 real-world web services all over the world. The experimental results show that our approaches outperform other competing approaches. 5. Cloudy with a Chance of Cost Savings Abstract : Cloud-based hosting is claimed to possess many advantages over traditional in-house (on-premise) hosting such as better scalability, ease of management, and cost savings. It is not difficult to understand how cloud-based hosting can be used to address some of the existing limitations and extend the capabilities of many types of applications. However, one of the most important questions is whether cloud-based hosting will be economically feasible for my application if migrated into the cloud. It is not straightforward to answer this question because it is not clear how my application will benefit from the claimed advantages, and, in turn, be able to convert them into tangible cost savings. Within cloud-based hosting offerings, there is a wide range of hosting options one can choose from, each impacting the cost in a different way. Answering these questions requires an in-depth understanding of the cost implications of all the possible choices specific to my circumstances. In this study, we identify a diverse set of key factors affecting the costs of deployment choices.
Using benchmarks representing two different applications (TPC-W and TPC-E), we investigate the evolution of costs for different deployment
  20. 20. TYPICAL SOFT TECHNOLOGIES choices. We consider important application characteristics such as workload intensity, growth rate, traffic size, storage, and software license to understand their impact on the overall costs. We also discuss the impact of workload variance and cloud elasticity, and certain cost factors that are subjective in nature. 6. Error-Tolerant Resource Allocation and Payment Minimization for Cloud System Abstract : With virtual machine (VM) technology being increasingly mature, compute resources in cloud systems can be partitioned in fine granularity and allocated on demand. We make three contributions in this paper: 1) We formulate a deadline-driven resource allocation problem based on the cloud environment facilitated with VM resource isolation technology, and also propose a novel polynomial-time solution, which could minimize users’ payment in terms of their expected deadlines. 2) By analyzing the upper bound of task execution length based on the possibly inaccurate workload prediction, we further propose an error-tolerant method to guarantee a task’s completion within its deadline. 3) We validate its effectiveness over a real VM-facilitated cluster environment under different levels of competition. In our experiment, by tuning the algorithmic input deadline based on our derived bound, task execution length can always be limited within its deadline in the sufficient-supply situation; the mean execution length remains at 70 percent of the user-specified deadline under severe competition. Under the original-deadline-based solution, about 52.5 percent of tasks are completed within 0.95-1.0 times their deadlines, which still conforms to the deadline-guaranteed requirement. Only 20 percent of tasks violate deadlines, yet most (17.5 percent) are still finished within 1.05 times their deadlines.
  21. 21. TYPICAL SOFT TECHNOLOGIES 7. Mona: Secure Multi-Owner Data Sharing for Dynamic Groups in the Cloud Abstract : Characterized by low maintenance, cloud computing provides an economical and efficient solution for sharing group resources among cloud users. Unfortunately, sharing data in a multi-owner manner while preserving data and identity privacy from an untrusted cloud is still a challenging issue, due to the frequent change of the membership. In this paper, we propose a secure multi-owner data sharing scheme, named Mona, for dynamic groups in the cloud. By leveraging group signature and dynamic broadcast encryption techniques, any cloud user can anonymously share data with others. Meanwhile, the storage overhead and encryption computation cost of our scheme are independent of the number of revoked users. In addition, we analyze the security of our scheme with rigorous proofs, and demonstrate the efficiency of our scheme in experiments. 8. A New Disk I/O Model of Virtualized Cloud Environment Abstract : In a traditional virtualized cloud environment, using asynchronous I/O in the guest file system and synchronous I/O in the host file system to handle an asynchronous user disk write exhibits several drawbacks, such as performance disturbance among different guests and consistency maintenance across guest failures. To address these issues, this paper introduces a novel disk I/O model for virtualized cloud systems called HypeGear, where the guest file system uses synchronous operations to deal with the guest write request and the host file system performs asynchronous operations to write the data to the hard disk. A prototype system is implemented on the Xen hypervisor and our experimental results verify that this new model has
  22. 22. TYPICAL SOFT TECHNOLOGIES many advantages over the conventional asynchronous-synchronous model. We also evaluate the overhead of asynchronous I/O at the host, which is introduced by our new model. The result demonstrates that it imposes little cost on the host layer. 9. On Data Staging Algorithms for Shared Data Accesses in Clouds Abstract : In this paper, we study the strategies for efficiently achieving data staging and caching on a set of vantage sites in a cloud system with a minimum cost. Unlike the traditional research, we do not intend to identify the access patterns to facilitate the future requests. Instead, with such information presumably known in advance, our goal is to efficiently stage the shared data items to predetermined sites at advocated time instants to align with the patterns while minimizing the monetary costs for caching and transmitting the requested data items. To this end, we follow the cost and network models in [1] and extend the analysis to multiple data items, each with single or multiple copies. Our results show that under a homogeneous cost model, when the ratio of transmission cost to caching cost is low, a single copy of each data item can efficiently serve all the user requests. In the multicopy situation, we also consider the tradeoff between the transmission cost and caching cost by controlling the upper bounds of transmissions and copies. The upper bound can be given either on a per-item basis or on an all-item basis. We present efficient optimal solutions based on dynamic programming techniques to all these cases provided that the upper bound is polynomially bounded by the number of service requests and the number of distinct data items. In addition to the homogeneous cost model, we also briefly discuss this problem under a heterogeneous cost model with some simple yet practical restrictions and present a 2-approximation algorithm to the general case.
We validate our findings by implementing a data staging solver and conducting extensive simulation studies on the behaviors of the algorithms.
  23. 23. TYPICAL SOFT TECHNOLOGIES 10. Dynamic Optimization of Multiattribute Resource Allocation in Self-Organizing Clouds Abstract : By leveraging virtual machine (VM) technology which provides performance and fault isolation, cloud resources can be provisioned on demand in a fine-grained, multiplexed manner rather than in monolithic pieces. By integrating volunteer computing into cloud architectures, we envision a gigantic self-organizing cloud (SOC) being formed to reap the huge potential of untapped commodity computing power over the Internet. Toward this new architecture where each participant may autonomously act as both resource consumer and provider, we propose a fully distributed, VM-multiplexing resource allocation scheme to manage decentralized resources. Our approach not only achieves maximized resource utilization using the proportional share model (PSM), but also delivers provably and adaptively optimal execution efficiency. We also design a novel multiattribute range query protocol for locating qualified nodes. Contrary to existing solutions which often generate bulky messages per request, our protocol produces only one lightweight query message per task on the Content Addressable Network (CAN). It works effectively to find for each task its qualified resources under a randomized policy that mitigates the contention among requesters. We show that the SOC with our optimized algorithms can improve system throughput by 15-60 percent over a P2P Grid model. Our solution also exhibits fairly high adaptability in a dynamic node-churning environment.
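The proportional share model mentioned in the abstract above can be illustrated with a minimal sketch: each task receives a fraction of a resource equal to its weight divided by the sum of all weights. This shows only the general idea; the function name `proportional_share` and the example weights are assumptions, and the paper's actual multiattribute allocation scheme is not reproduced here.

```python
def proportional_share(capacity, weights):
    """Split one resource's capacity among tasks in proportion to their weights.

    capacity -- total amount of the resource (for example, CPU cores)
    weights  -- dict mapping a task name to a positive weight
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one task must have a positive weight")
    return {task: capacity * weight / total_weight
            for task, weight in weights.items()}


if __name__ == "__main__":
    # Hypothetical example: 8 cores shared by two tasks weighted 1 and 3.
    print(proportional_share(8.0, {"task_a": 1, "task_b": 3}))
```

Because the shares always sum to the full capacity, no part of the resource sits idle while any task is present, which is one way to read the abstract's claim of maximized utilization.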
  24. 24. TYPICAL SOFT TECHNOLOGIES 11. Scalable and Secure Sharing of Personal Health Records in Cloud Computing Using Attribute-Based Encryption Abstract : Personal health record (PHR) is an emerging patient-centric model of health information exchange, which is often outsourced to be stored at a third party, such as cloud providers. However, there have been wide privacy concerns as personal health information could be exposed to those third party servers and to unauthorized parties. To assure the patients’ control over access to their own PHRs, it is a promising method to encrypt the PHRs before outsourcing. Yet, issues such as risks of privacy exposure, scalability in key management, flexible access, and efficient user revocation have remained the most important challenges toward achieving fine-grained, cryptographically enforced data access control. In this paper, we propose a novel patient-centric framework and a suite of mechanisms for data access control to PHRs stored in semitrusted servers. To achieve fine-grained and scalable data access control for PHRs, we leverage attribute-based encryption (ABE) techniques to encrypt each patient’s PHR file. Different from previous works in secure data outsourcing, we focus on the multiple data owner scenario, and divide the users in the PHR system into multiple security domains, which greatly reduces the key management complexity for owners and users. A high degree of patient privacy is guaranteed simultaneously by exploiting multiauthority ABE. Our scheme also enables dynamic modification of access policies or file attributes, and supports efficient on-demand user/attribute revocation and break-glass access under emergency scenarios. Extensive analytical and experimental results are presented which show the security, scalability, and efficiency of our proposed scheme.
  25. 25. TYPICAL SOFT TECHNOLOGIES PARALLEL AND DISTRIBUTED SYSTEMS 1. A Truthful Dynamic Workflow Scheduling Mechanism for Commercial Multicloud Environments Abstract : The ultimate goal of cloud providers in providing resources is to increase their revenues. This goal leads to a selfish behavior that negatively affects the users of a commercial multicloud environment. In this paper, we introduce a pricing model and a truthful mechanism for scheduling single tasks considering two objectives: monetary cost and completion time. With respect to the social cost of the mechanism, i.e., minimizing the completion time and monetary cost, we extend the mechanism for dynamic scheduling of scientific workflows. We theoretically analyze the truthfulness and the efficiency of the mechanism and present extensive experimental results showing significant impact of the selfish behavior of the cloud providers on the efficiency of the whole system. The experiments conducted using real-world and synthetic workflow applications demonstrate that our solutions dominate in most cases the Pareto-optimal solutions estimated by two classical multiobjective evolutionary algorithms. 2. Anchor: A Versatile and Efficient Framework for Resource Management in the Cloud Abstract : We present Anchor, a general resource management architecture that uses the stable matching framework to decouple policies from mechanisms when mapping virtual machines to physical servers. In Anchor, clients and operators are able to express a variety of distinct resource management policies as they deem fit, and these policies are captured as preferences in the stable matching framework. The highlight of Anchor is a new many-to-one stable matching theory that efficiently matches VMs with heterogeneous resource needs to servers, using both offline and online algorithms. Our theoretical analyses show the convergence and optimality of
  26. 26. TYPICAL SOFT TECHNOLOGIES the algorithm. Our experiments with a prototype implementation on a 20-node server cluster, as well as large-scale simulations based on real-world workload traces, demonstrate that the architecture is able to realize a diverse set of policy objectives with good performance and practicality. 3. A Highly Practical Approach toward Achieving Minimum Data Sets Storage Cost in the Cloud Abstract : Massive computation power and storage capacity of cloud computing systems allow scientists to deploy computation and data intensive applications without infrastructure investment, where large application data sets can be stored in the cloud. Based on the pay-as-you-go model, storage strategies and benchmarking approaches have been developed for cost-effectively storing large volumes of generated application data sets in the cloud. However, they are either insufficiently cost-effective for the storage or impractical to be used at runtime. In this paper, toward achieving the minimum cost benchmark, we propose a novel highly cost-effective and practical storage strategy that can automatically decide whether a generated data set should be stored or not at runtime in the cloud. The main focus of this strategy is the local optimization of the tradeoff between computation and storage, while secondarily also taking users’ (optional) preferences on storage into consideration. Both theoretical analysis and simulations conducted on general (random) data sets as well as specific real-world applications with Amazon’s cost model show that the cost-effectiveness of our strategy is close to or even the same as the minimum cost benchmark, and the efficiency is very high for practical runtime utilization in the cloud. 4. Toward Fine-Grained, Unsupervised, Scalable Performance Diagnosis for Production Cloud Computing Systems Abstract :
  27. 27. TYPICAL SOFT TECHNOLOGIES Performance diagnosis is labor intensive in production cloud computing systems. Such systems typically face many real-world challenges, which the existing diagnosis techniques for such distributed systems cannot effectively solve. An efficient, unsupervised diagnosis tool for locating fine-grained performance anomalies is still lacking in production cloud computing systems. This paper proposes CloudDiag to bridge this gap. Combining a statistical technique and a fast matrix recovery algorithm, CloudDiag can efficiently pinpoint fine-grained causes of the performance problems without requiring any domain-specific knowledge of the target system. CloudDiag has been applied in a practical production cloud computing system to diagnose performance problems. We demonstrate the effectiveness of CloudDiag in three real-world case studies. 5. Scalable and Accurate Graph Clustering and Community Structure Detection Abstract : One of the most useful measures of cluster quality is the modularity of the partition, which measures the difference between the number of the edges joining vertices from the same cluster and the expected number of such edges in a random graph. In this paper, we show that the problem of finding a partition maximizing the modularity of a given graph G can be reduced to a minimum weighted cut (MWC) problem on a complete graph with the same vertices as G. We then show that the resulting minimum cut problem can be efficiently solved by adapting existing graph partitioning techniques. Our algorithm finds clusterings of a comparable quality and is much faster than the existing clustering algorithms. 6. Load Rebalancing for Distributed File Systems in Clouds Abstract : Distributed file systems are key building blocks for cloud computing applications based on the MapReduce programming paradigm. In such file systems, nodes simultaneously serve
  28. 28. TYPICAL SOFT TECHNOLOGIES computing and storage functions; a file is partitioned into a number of chunks allocated in distinct nodes so that MapReduce tasks can be performed in parallel over the nodes. However, in a cloud computing environment, failure is the norm, and nodes may be upgraded, replaced, and added in the system. Files can also be dynamically created, deleted, and appended. This results in load imbalance in a distributed file system; that is, the file chunks are not distributed as uniformly as possible among the nodes. Emerging distributed file systems in production systems strongly depend on a central node for chunk reallocation. This dependence is clearly inadequate in a large-scale, failure-prone environment because the central load balancer is put under considerable workload that is linearly scaled with the system size, and may thus become the performance bottleneck and the single point of failure. In this paper, a fully distributed load rebalancing algorithm is presented to cope with the load imbalance problem. Our algorithm is compared against a centralized approach in a production system and a competing distributed solution presented in the literature. The simulation results indicate that our proposal is comparable with the existing centralized approach and considerably outperforms the prior distributed algorithm in terms of load imbalance factor, movement cost, and algorithmic overhead. The performance of our proposal implemented in the Hadoop distributed file system is further investigated in a cluster environment. 7. SPOC: A Secure and Privacy-Preserving Opportunistic Computing Framework for Mobile-Healthcare Emergency Abstract : With the pervasiveness of smart phones and the advance of wireless body sensor networks (BSNs), mobile Healthcare (m-Healthcare), which extends the operation of Healthcare provider into a pervasive environment for better health monitoring, has attracted considerable interest recently. 
However, the flourishing of m-Healthcare still faces many challenges including information security and privacy preservation. In this paper, we propose a secure and privacy-preserving opportunistic computing framework, called SPOC, for m-Healthcare emergency. With SPOC, smart phone resources including computing power and energy can be
  29. 29. TYPICAL SOFT TECHNOLOGIES opportunistically gathered to process the computing-intensive personal health information (PHI) during m-Healthcare emergency with minimal privacy disclosure. Specifically, to balance PHI privacy disclosure against the high reliability of PHI processing and transmission in m-Healthcare emergency, we introduce an efficient user-centric privacy access control in the SPOC framework, which is based on attribute-based access control and a new privacy-preserving scalar product computation (PPSPC) technique, and allows a medical user to decide who can participate in the opportunistic computing to assist in processing his overwhelming PHI data. Detailed security analysis shows that the proposed SPOC framework can efficiently achieve user-centric privacy access control in m-Healthcare emergency. In addition, performance evaluations via extensive simulations demonstrate SPOC’s effectiveness in terms of providing highly reliable PHI processing and transmission while minimizing the privacy disclosure during m-Healthcare emergency. 8. Improve Efficiency and Reliability in Single-Hop WSNs with Transmit-Only Nodes Abstract : Wireless Sensor Networks (WSNs) will play a significant role at the “edge” of the future “Internet of Things.” In particular, WSNs with transmit-only nodes are attracting more attention due to their advantages in supporting applications requiring dense and long-lasting deployment at a very low cost and energy consumption. However, the lack of receivers in transmit-only nodes renders most existing MAC protocols invalid. Based on our previous study on WSNs with pure transmit-only nodes, this work proposes a simple, yet cost effective and powerful single-hop hybrid WSN cluster architecture that contains not only transmit-only nodes but also standard nodes (with transceivers).
Along with the hybrid architecture, this work also proposes a new MAC layer protocol framework called Robust Asynchronous Resource Estimation (RARE) that efficiently and reliably manages the densely deployed single-hop hybrid cluster in a self-organized fashion. Through analysis and extensive simulations, the proposed framework is shown to meet or exceed the needs of most applications in terms of the data delivery probability,
  30. 30. TYPICAL SOFT TECHNOLOGIES QoS differentiation, system capacity, energy consumption, and reliability. To the best of our knowledge, this work is the first that brings reliable scheduling to WSNs containing both nonsynchronized transmit-only nodes and standard nodes. 9. Optimal Client-Server Assignment for Internet Distributed Systems Abstract : We investigate an underlying mathematical model and algorithms for optimizing the performance of a class of distributed systems over the Internet. Such a system consists of a large number of clients who communicate with each other indirectly via a number of intermediate servers. Optimizing the overall performance of such a system then can be formulated as a client-server assignment problem whose aim is to assign the clients to the servers in such a way as to satisfy some prespecified requirements on the communication cost and load balancing. We show that 1) the total communication load and load balancing are two opposing metrics, and consequently, their tradeoff is inherent in this class of distributed systems; 2) in general, finding the optimal client-server assignment for some prespecified requirements on the total load and load balancing is NP-hard; and therefore 3) we propose a heuristic via relaxed convex optimization for finding the approximate solution. Our simulation results indicate that the proposed algorithm outperforms other heuristics, including the popular Normalized Cuts algorithm. 10. Fast Channel Zapping with Destination-Oriented Multicast for IP Video Delivery Abstract : Channel zapping time is a critical quality of experience (QoE) metric for IP-based video delivery systems such as IPTV. An interesting zapping acceleration scheme based on time-shifted subchannels (TSS) was recently proposed, which can ensure a zapping delay bound as
  31. 31. TYPICAL SOFT TECHNOLOGIES well as maintain the picture quality during zapping. However, the behaviors of the TSS-based scheme have not been fully studied yet. Furthermore, the existing TSS-based implementation adopts the traditional IP multicast, which is not scalable for a large-scale distributed system. To address these issues, this paper makes contributions in two aspects. First, we resort to theoretical analysis to understand the fundamental properties of the TSS-based service model. We show that there exists an optimal subchannel data rate which minimizes the redundant traffic transmitted over subchannels. Moreover, we reveal a start-up effect, where the existing operation pattern in the TSS-based model could violate the zapping delay bound. With a solution proposed to resolve the start-up effect, we rigorously prove that a zapping delay bound equal to the subchannel time shift is guaranteed by the updated TSS-based model. Second, we propose a destination-oriented-multicast (DOM) assisted zapping acceleration (DAZA) scheme for a scalable TSS-based implementation, where a subscriber can seamlessly migrate from a subchannel to the main channel after zapping without any control message exchange over the network. Moreover, the subchannel selection in DAZA is independent of the zapping request signaling delay, resulting in improved robustness and reduced messaging overhead in a distributed environment. We implement DAZA in ns-2 and multicast an MPEG-4 video stream over a practical network topology. Extensive simulation results are presented to demonstrate the validity of our analysis and DAZA scheme. 11. Cluster-Based Certificate Revocation with Vindication Capability for Mobile Ad Hoc Networks Abstract : Mobile ad hoc networks (MANETs) have attracted much attention due to their mobility and ease of deployment. However, their wireless and dynamic nature renders them more vulnerable to various types of security attacks than wired networks.
The major challenge is to guarantee secure network services. To meet this challenge, certificate revocation is an important, integral component for securing network communications. In this paper, we focus on the issue of certificate revocation to isolate attackers from further participating in network activities. For quick and
  32. 32. TYPICAL SOFT TECHNOLOGIES accurate certificate revocation, we propose the Cluster-based Certificate Revocation with Vindication Capability (CCRVC) scheme. In particular, to improve the reliability of the scheme, we recover the warned nodes to take part in the certificate revocation process; to enhance the accuracy, we propose the threshold-based mechanism to assess and vindicate warned nodes as legitimate nodes or not, before recovering them. The performance of our scheme is evaluated by both numerical and simulation analysis. Extensive results demonstrate that the proposed certificate revocation scheme is effective and efficient in guaranteeing secure communications in mobile ad hoc networks. 12. A Secure Protocol for Spontaneous Wireless Ad Hoc Networks Creation Abstract : This paper presents a secure protocol for spontaneous wireless ad hoc networks which uses a hybrid symmetric/asymmetric scheme and the trust between users in order to exchange the initial data and to exchange the secret keys that will be used to encrypt the data. Trust is based on the first visual contact between users. Our proposal is a complete self-configured secure protocol that is able to create the network and share secure services without any infrastructure. The network allows sharing resources and offering new services among users in a secure environment. The protocol includes all functions needed to operate without any external support. We have designed and developed it in devices with limited resources. Network creation stages are detailed and the communication, protocol messages, and network management are explained. Our proposal has been implemented in order to test the protocol procedure and performance. Finally, we compare the protocol with other spontaneous ad hoc network protocols in order to highlight its features and we provide a security analysis of the system.
  33. 33. TYPICAL SOFT TECHNOLOGIES 13. Dynamic Resource Allocation Using Virtual Machines for Cloud Computing Environment Abstract : Cloud computing allows business customers to scale their resource usage up and down based on needs. Many of the touted gains in the cloud model come from resource multiplexing through virtualization technology. In this paper, we present a system that uses virtualization technology to allocate data center resources dynamically based on application demands and support green computing by optimizing the number of servers in use. We introduce the concept of "skewness" to measure the unevenness in the multidimensional resource utilization of a server. By minimizing skewness, we can combine different types of workloads nicely and improve the overall utilization of server resources. We develop a set of heuristics that prevent overload in the system effectively while saving energy used. Trace-driven simulation and experiment results demonstrate that our algorithm achieves good performance. 14. High Performance Resource Allocation Strategies for Computational Economies Abstract : Utility computing models have long been the focus of academic research, and with the recent success of commercial cloud providers, computation and storage are finally being realized as the fifth utility. Computational economies are often proposed as an efficient means of resource allocation; however, adoption has been limited due to a lack of performance and high overheads. In this paper, we address the performance limitations of existing economic allocation models by defining strategies to reduce the failure and reallocation rate, increase occupancy and thereby increase the obtainable utilization of the system. The high-performance resource utilization strategies presented can be used by market participants without requiring dramatic changes to the allocation protocol. 
The strategies considered include overbooking, advanced reservation, just-in-time bidding, and using substitute providers for service delivery. The proposed strategies have
  34. 34. TYPICAL SOFT TECHNOLOGIES been implemented in a distributed metascheduler and evaluated with respect to Grid and cloud deployments. Several diverse synthetic workloads have been used to quantify both the performance benefits and economic implications of these strategies. 15. A Privacy Leakage Upper Bound Constraint-Based Approach for Cost-Effective Privacy Preserving of Intermediate Data Sets in Cloud Abstract : Cloud computing provides massive computation power and storage capacity which enable users to deploy computation and data-intensive applications without infrastructure investment. During the processing of such applications, a large volume of intermediate data sets will be generated, and often stored to save the cost of recomputing them. However, preserving the privacy of intermediate data sets becomes a challenging problem because adversaries may recover privacy-sensitive information by analyzing multiple intermediate data sets. Encrypting all data sets in the cloud is widely adopted in existing approaches to address this challenge. But we argue that encrypting all intermediate data sets is neither efficient nor cost-effective because it is very time consuming and costly for data-intensive applications to en/decrypt data sets frequently while performing any operation on them. In this paper, we propose a novel upper bound privacy leakage constraint-based approach to identify which intermediate data sets need to be encrypted and which do not, so that privacy-preserving cost can be saved while the privacy requirements of data holders can still be satisfied. Evaluation results demonstrate that the privacy-preserving cost of intermediate data sets can be significantly reduced with our approach over existing ones where all data sets are encrypted.
  35. 35. TYPICAL SOFT TECHNOLOGIES 16. A Secure Payment Scheme with Low Communication and Processing Overhead for Multihop Wireless Networks Abstract : We propose RACE, a report-based payment scheme for multihop wireless networks to stimulate node cooperation, regulate packet transmission, and enforce fairness. The nodes submit lightweight payment reports (instead of receipts) to the accounting center (AC) and temporarily store undeniable security tokens called Evidences. The reports contain the alleged charges and rewards without security proofs, e.g., signatures. The AC can verify the payment by investigating the consistency of the reports, and clear the payment of the fair reports with almost no processing overhead or cryptographic operations. For cheating reports, the Evidences are requested to identify and evict the cheating nodes that submit incorrect reports. Instead of requesting the Evidences from all the nodes participating in the cheating reports, RACE can identify the cheating nodes by requesting only a few Evidences. Moreover, an Evidence aggregation technique is used to reduce the Evidences’ storage area. Our analytical and simulation results demonstrate that RACE requires much less communication and processing overhead than the existing receipt-based schemes with acceptable payment clearance delay and storage area. This is essential for the effective implementation of a payment scheme because it uses micropayment and the overhead cost should be much less than the payment value. Moreover, RACE can secure the payment and precisely identify the cheating nodes without false accusations. 17. Mobi-Sync: Efficient Time Synchronization for Mobile Underwater Sensor Networks Abstract : Time synchronization is an important requirement for many services provided by distributed networks. Many time synchronization protocols have been proposed for terrestrial Wireless Sensor Networks (WSNs). 
However, none of them can be directly applied to Underwater Sensor
  36. 36. TYPICAL SOFT TECHNOLOGIES Networks (UWSNs). A synchronization algorithm for UWSNs must consider additional factors such as long propagation delays from the use of acoustic communication and sensor node mobility. These unique challenges make the accuracy of synchronization procedures for UWSNs even more critical. Time synchronization solutions specifically designed for UWSNs are needed to satisfy these new requirements. This paper proposes Mobi-Sync, a novel time synchronization scheme for mobile underwater sensor networks. Mobi-Sync distinguishes itself from previous approaches for terrestrial WSNs by considering spatial correlation among the mobility patterns of neighboring UWSN nodes. This enables Mobi-Sync to accurately estimate the long dynamic propagation delays. Simulation results show that Mobi-Sync outperforms existing schemes in both accuracy and energy efficiency. 18. Detection and Localization of Multiple Spoofing Attackers in Wireless Networks Abstract : Wireless spoofing attacks are easy to launch and can significantly impact the performance of networks. Although the identity of a node can be verified through cryptographic authentication, conventional security approaches are not always desirable because of their overhead requirements. In this paper, we propose to use spatial information, a physical property associated with each node, hard to falsify, and not reliant on cryptography, as the basis for 1) detecting spoofing attacks; 2) determining the number of attackers when multiple adversaries masquerade as the same node identity; and 3) localizing multiple adversaries. We propose to use the spatial correlation of received signal strength (RSS) inherited from wireless nodes to detect the spoofing attacks. We then formulate the problem of determining the number of attackers as a multiclass detection problem. Cluster-based mechanisms are developed to determine the number of attackers. 
When training data are available, we explore using the Support Vector Machines (SVM) method to further improve the accuracy of determining the number of attackers. In addition, we developed an integrated detection and localization system
  37. 37. TYPICAL SOFT TECHNOLOGIES that can localize the positions of multiple attackers. We evaluated our techniques through two testbeds using both an 802.11 (WiFi) network and an 802.15.4 (ZigBee) network in two real office buildings. Our experimental results show that our proposed methods can achieve over 90 percent Hit Rate and Precision when determining the number of attackers. Our localization results using a representative set of algorithms provide strong evidence of high accuracy in localizing multiple adversaries. KNOWLEDGE AND DATA ENGINEERING 1. Crowdsourced Trace Similarity with Smartphones Abstract : Smartphones are nowadays equipped with a number of sensors, such as WiFi, GPS, accelerometers, etc. This capability allows smartphone users to easily engage in crowdsourced computing services, which contribute to the solution of complex problems in a distributed manner. In this work, we leverage such a computing paradigm to solve efficiently the following problem: comparing a query trace Q against a crowd of traces generated and stored on distributed smartphones. Our proposed framework, coined SmartTrace+, provides an effective solution without disclosing any part of the crowd traces to the query processor. SmartTrace+ relies on an in-situ data storage model and intelligent top-K query processing algorithms that exploit distributed trajectory similarity measures, resilient to spatial and temporal noise, in order to derive the most relevant answers to Q. We evaluate our algorithms on both synthetic and real workloads. We describe our prototype system developed on the Android OS. The solution is deployed over our own SmartLab testbed of 25 smartphones. Our study reveals that computations over SmartTrace+ result in substantial energy conservation; in addition, results can be computed faster than competitive approaches.
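As an illustration of a trajectory similarity measure that tolerates spatial and temporal noise, the sketch below uses the Longest Common Subsequence (LCSS) measure with a spatial tolerance `eps` and a temporal window `delta`. The abstract does not name the measure it uses, so LCSS is an assumption here, as are the function names `lcss_similarity` and `top_k`. The sketch also runs in one place, whereas the abstract describes a distributed setting in which crowd traces are never disclosed to the query processor.

```python
def lcss_similarity(a, b, eps, delta):
    """LCSS length between traces a and b, normalised by the shorter trace.

    a, b  : lists of (x, y) samples taken at regular intervals
    eps   : spatial tolerance per coordinate (absorbs positioning noise)
    delta : maximum index offset between matched samples (absorbs time shifts)
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        return 0.0
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            (x1, y1), (x2, y2) = a[i - 1], b[j - 1]
            close_in_time = abs(i - j) <= delta
            close_in_space = abs(x1 - x2) <= eps and abs(y1 - y2) <= eps
            if close_in_time and close_in_space:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[n][m] / min(n, m)


def top_k(query, crowd, k, eps, delta):
    """Names of the k crowd traces most similar to the query (ties by name)."""
    scored = [(lcss_similarity(query, trace, eps, delta), name)
              for name, trace in crowd.items()]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [name for _, name in scored[:k]]


if __name__ == "__main__":
    q = [(0, 0), (1, 1), (2, 2), (3, 3)]
    crowd = {
        "a": [(0.1, 0), (1, 1.1), (2.1, 2), (3, 3.1)],  # noisy copy of q
        "b": [(10, 10), (11, 11), (12, 12), (13, 13)],  # far away
        "c": [(0, 0), (1, 1), (9, 9), (9, 9)],          # shares half of q
    }
    print(top_k(q, crowd, k=2, eps=0.5, delta=1))
```

A larger `eps` or `delta` makes the match more forgiving of noise at the cost of discriminating less between traces.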
  38. 38. TYPICAL SOFT TECHNOLOGIES 2. Incentive Compatible Privacy-Preserving Data Analysis Abstract : In many cases, competing parties who have private data may collaboratively conduct privacy-preserving distributed data analysis (PPDA) tasks to learn beneficial data models or analysis results. Most often, the competing parties have different incentives. Although certain PPDA techniques guarantee that nothing other than the final analysis result is revealed, it is impossible to verify whether participating parties are truthful about their private input data. Unless proper incentives are set, current PPDA techniques cannot prevent participating parties from modifying their private inputs. This raises the question of how to design incentive compatible privacy-preserving data analysis techniques that motivate participating parties to provide truthful inputs. In this paper, we first develop key theorems; then, based on these theorems, we analyze certain important privacy-preserving data analysis tasks that could be conducted in a way that telling the truth is the best choice for any participating party. 3. On Identifying Critical Nuggets of Information during Classification Tasks Abstract : In large databases, there may exist critical nuggets: small collections of records or instances that contain domain-specific important information. This information can be used for future decision making such as labeling of critical, unlabeled data records and improving classification results by reducing false positive and false negative errors. This work introduces the idea of critical nuggets, proposes an innovative domain-independent method to measure criticality, suggests a heuristic to reduce the search space for finding critical nuggets, and isolates and validates critical nuggets from some real-world data sets. It seems that only a few subsets may qualify to be critical nuggets, underlining the importance of finding them. The proposed methodology can detect them. 
This work also identifies certain properties of critical nuggets and provides experimental validation of the properties. Experimental results also helped validate that critical nuggets can assist in improving classification accuracy in real-world data sets.
  39. 39. TYPICAL SOFT TECHNOLOGIES 4. Failure-Aware Cascaded Suppression in Wireless Sensor Networks Abstract : Wireless sensor networks are widely used to continuously collect data from the environment. Because of energy constraints on battery-powered nodes, it is critical to minimize communication. Suppression has been proposed as a way to reduce communication by using predictive models to suppress reporting of predictable data. However, in the presence of communication failures, missing data are difficult to interpret because these could have been either suppressed or lost in transmission. There is no existing solution for handling failures for general, spatiotemporal suppression that uses cascading. While cascading further reduces communication, it makes failure handling difficult, because nodes can act on incomplete or incorrect information and in turn affect other nodes. We propose a cascaded suppression framework that exploits both temporal and spatial data correlation to reduce communication, and applies coding theory and Bayesian inference to recover missing data resulting from suppression and communication failures. Experiment results show that cascaded suppression significantly reduces communication cost and improves missing data recovery compared to existing approaches. 5. Optimal Route Queries with Arbitrary Order Constraints Abstract : Given a set of spatial points DS, each of which is associated with categorical information, e.g., restaurant, pub, etc., the optimal route query finds the shortest path that starts from the query point (e.g., a home or hotel), and covers a user-specified set of categories (e.g., {pub, restaurant, museum}). The user may also specify partial order constraints between different categories, e.g., a restaurant must be visited before a pub. Previous work has focused on a special case where the query contains the total order of all categories to be visited (e.g., museum → restaurant → pub). 
For the general scenario without such a total order, the only known solution
  40. 40. TYPICAL SOFT TECHNOLOGIES reduces the problem to multiple, total-order optimal route queries. As we show in this paper, this naive approach incurs a significant amount of repeated computations, and, thus, is not scalable to large data sets. Motivated by this, we propose novel solutions to the general optimal route query, based on two different methodologies, namely backward search and forward search. In addition, we discuss how the proposed methods can be adapted to answer a variant of the optimal route queries, in which the route only needs to cover a subset of the given categories. Extensive experiments, using both real and synthetic data sets, confirm that the proposed solutions are efficient and practical, and outperform existing methods by large margins. 6. Co-Occurrence-Based Diffusion for Expert Search on the Web Abstract : Expert search has been studied in different contexts, e.g., enterprises, academic communities. We examine a general expert search problem: searching experts on the web, where millions of webpages and thousands of names are considered. It poses two main challenges: 1) webpages could be of varying quality and full of noise; 2) the expertise evidence scattered in webpages is usually vague and ambiguous. We propose to leverage the large amount of co-occurrence information to assess relevance and reputation of a person name for a query topic. The co-occurrence structure is modeled using a hypergraph, on which a heat diffusion based ranking algorithm is proposed. Query keywords are regarded as heat sources, and a person name which has strong connection with the query (i.e., frequently co-occurs with query keywords and co-occurs with other names related to query keywords) will receive most of the heat, thus being ranked high. Experiments on the ClueWeb09 web collection show that our algorithm is effective for retrieving experts and outperforms baseline algorithms significantly. 
This work can be regarded as one step toward addressing the more general entity search problem without sophisticated NLP techniques.
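The heat-diffusion ranking idea in the expert search abstract can be sketched on a small example. The sketch below is a deliberate simplification: it uses an ordinary weighted graph rather than the hypergraph the abstract describes, holds query keywords at a fixed temperature, and advances the diffusion with explicit time steps. The names `diffuse_heat` and `rank_names`, the step size `alpha`, and the toy co-occurrence counts are all illustrative assumptions.

```python
def diffuse_heat(edges, sources, alpha=0.1, steps=5):
    """Spread heat from source nodes over a weighted undirected graph.

    edges   : dict mapping (u, v) -> co-occurrence weight
    sources : set of nodes held at heat 1.0 (the query keywords)
    alpha   : step size; alpha times any node's total edge weight should
              stay below 1 so that the explicit update remains stable
    """
    nodes = set(sources)
    for u, v in edges:
        nodes.update((u, v))
    neighbours = {n: [] for n in nodes}
    for (u, v), w in edges.items():
        neighbours[u].append((v, w))
        neighbours[v].append((u, w))

    heat = {n: (1.0 if n in sources else 0.0) for n in nodes}
    for _ in range(steps):
        new_heat = {}
        for n in nodes:
            if n in sources:
                new_heat[n] = 1.0  # sources keep emitting heat
                continue
            # Heat flows along each edge in proportion to the difference.
            flow = sum(w * (heat[m] - heat[n]) for m, w in neighbours[n])
            new_heat[n] = heat[n] + alpha * flow
        heat = new_heat
    return heat


def rank_names(edges, sources, alpha=0.1, steps=5):
    """Non-source nodes ordered from hottest to coldest."""
    heat = diffuse_heat(edges, sources, alpha, steps)
    names = [n for n in heat if n not in sources]
    return sorted(names, key=lambda n: (-heat[n], n))


if __name__ == "__main__":
    cooccurrence = {
        ("kw", "alice"): 3,     # alice often appears with the keyword
        ("kw", "bob"): 1,       # bob appears with it once
        ("alice", "carol"): 1,  # carol only co-occurs with alice
    }
    print(rank_names(cooccurrence, {"kw"}))
```

In this toy graph a name that never co-occurs with the keyword still receives some heat through a related name, which mirrors the second kind of evidence the abstract mentions.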
  41. 41. TYPICAL SOFT TECHNOLOGIES 7. Clustering Uncertain Data Based on Probability Distribution Similarity Abstract : Clustering on uncertain data, one of the essential tasks in mining uncertain data, poses significant challenges on both modeling similarity between uncertain objects and developing efficient computational methods. The previous methods extend traditional partitioning clustering methods like k-means and density-based clustering methods like DBSCAN to uncertain data, and thus rely on geometric distances between objects. Such methods cannot handle uncertain objects that are geometrically indistinguishable, such as products with the same mean but very different variances in customer ratings. Surprisingly, probability distributions, which are essential characteristics of uncertain objects, have not been considered in measuring similarity between uncertain objects. In this paper, we systematically model uncertain objects in both continuous and discrete domains, where an uncertain object is modeled as a continuous and discrete random variable, respectively. We use the well-known Kullback-Leibler divergence to measure similarity between uncertain objects in both the continuous and discrete cases, and integrate it into partitioning and density-based clustering methods to cluster uncertain objects. Nevertheless, a naive implementation is very costly. Particularly, computing exact KL divergence in the continuous case is very costly or even infeasible. To tackle the problem, we estimate KL divergence in the continuous case by kernel density estimation and employ the fast Gauss transform technique to further speed up the computation. Our extensive experiment results verify the effectiveness, efficiency, and scalability of our approaches. 8. PMSE: A Personalized Mobile Search Engine Abstract : We propose a personalized mobile search engine (PMSE) that captures the users' preferences in the form of concepts by mining their clickthrough data. 
Due to the importance of location information in mobile search, PMSE classifies these concepts into content concepts and location concepts. In addition, users' locations (positioned by GPS) are used to supplement the location
  42. 42. TYPICAL SOFT TECHNOLOGIES concepts in PMSE. The user preferences are organized in an ontology-based, multi-facet user profile, which is used to adapt a personalized ranking function for rank adaptation of future search results. To characterize the diversity of the concepts associated with a query and their relevance to the user's need, four entropies are introduced to balance the weights between the content and location facets. Based on the client-server model, we also present a detailed architecture and design for implementation of PMSE. In our design, the client collects and stores locally the clickthrough data to protect privacy, whereas heavy tasks such as concept extraction, training, and reranking are performed at the PMSE server. Moreover, we address the privacy issue by restricting the information in the user profile exposed to the PMSE server with two privacy parameters. We prototype PMSE on the Google Android platform. Experimental results show that PMSE significantly improves the precision compared to the baseline. 9. Discovering Temporal Change Patterns in the Presence of Taxonomies Abstract : Frequent itemset mining is a widely used exploratory technique that focuses on discovering recurrent correlations among data. The steadfast evolution of markets and business environments prompts the need for data mining algorithms to discover significant correlation changes in order to reactively suit product and service provision to customer needs. Change mining, in the context of frequent itemsets, focuses on detecting and reporting significant changes in the set of mined itemsets from one time period to another. The discovery of frequent generalized itemsets, i.e., itemsets that 1) frequently occur in the source data, and 2) provide a high-level abstraction of the mined knowledge, poses new challenges in the analysis of itemsets that become rare, and thus are no longer extracted, from a certain point. 
This paper proposes a novel kind of dynamic pattern, namely the HIstory GENeralized Pattern (HIGEN), that represents the evolution of an itemset in consecutive time periods, by reporting the information about its frequent generalizations characterized by minimal redundancy (i.e., minimum level of abstraction) in case it becomes infrequent in a certain time period. To address HIGEN mining, it proposes HIGEN MINER, an algorithm that focuses on avoiding itemset mining followed by postprocessing by exploiting a support-driven itemset generalization approach. To focus the attention on the
  43. 43. TYPICAL SOFT TECHNOLOGIES minimally redundant frequent generalizations and thus reduce the amount of the generated patterns, the discovery of a smart subset of HIGENs, namely the NONREDUNDANT HIGENs, is addressed as well. Experiments performed on both real and synthetic datasets show the efficiency and the effectiveness of the proposed approach as well as its usefulness in a real application context. 10. Spatial Approximate String Search Abstract : This work deals with the approximate string search in large spatial databases. Specifically, we investigate range queries augmented with a string similarity search predicate in both Euclidean space and road networks. We dub this query the spatial approximate string (SAS) query. In Euclidean space, we propose an approximate solution, the MHR-tree, which embeds min-wise signatures into an R-tree. The min-wise signature for an index node u keeps a concise representation of the union of q-grams from strings under the subtree of u. We analyze the pruning functionality of such signatures based on the set resemblance between the query string and the q-grams from the subtrees of index nodes. We also discuss how to estimate the selectivity of a SAS query in Euclidean space, for which we present a novel adaptive algorithm to find balanced partitions using both the spatial and string information stored in the tree. For queries on road networks, we propose a novel exact method, RSASSOL, which significantly outperforms the baseline algorithm in practice. RSASSOL combines the q-gram-based inverted lists and the reference-node-based pruning. Extensive experiments on large real data sets demonstrate the efficiency and effectiveness of our approaches.
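The min-wise signature idea in the spatial approximate string search abstract can be illustrated in a few lines. The sketch below builds q-gram sets, computes a min-wise signature for each set, and shows that the signature of a union of sets is the component-wise minimum of the individual signatures, which is the property that lets an index node summarise every string beneath it. It is not the MHR-tree itself: there is no R-tree here, and the seeded SHA-256 hash family, the padding characters, and all function names are assumptions made for the example.

```python
import hashlib


def qgrams(s, q=2):
    """Set of q-grams of s, padded so that prefixes and suffixes count."""
    padded = "#" * (q - 1) + s + "$" * (q - 1)
    return {padded[i:i + q] for i in range(len(padded) - q + 1)}


def _hash(seed, gram):
    """One member of a seeded hash family (an assumption of this sketch)."""
    digest = hashlib.sha256(f"{seed}:{gram}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minwise_signature(grams, num_hashes=64):
    """For each hash function, keep the smallest hash value over the set."""
    return [min(_hash(seed, g) for g in grams) for seed in range(num_hashes)]


def merge_signatures(signatures):
    """Signature of the union of sets: component-wise minimum."""
    return [min(values) for values in zip(*signatures)]


def estimated_resemblance(sig_a, sig_b):
    """Fraction of matching components, an estimate of Jaccard similarity."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


if __name__ == "__main__":
    child_strings = ["theatre", "theater", "cinema"]
    child_sigs = [minwise_signature(qgrams(s)) for s in child_strings]
    node_sig = merge_signatures(child_sigs)  # summary kept at the index node
    query_sig = minwise_signature(qgrams("theatre"))
    print(estimated_resemblance(query_sig, node_sig))
```

Because signatures merge by taking minima, a parent's summary can be derived from its children's signatures without revisiting the underlying strings.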
  44. 44. TYPICAL SOFT TECHNOLOGIES 11. Robust Module-Based Data Management Abstract : The current trend for building an ontology-based data management system (DMS) is to capitalize on efforts made to design a preexisting well-established DMS (a reference system). The method amounts to extracting from the reference DMS a piece of schema relevant to the new application needs (a module), possibly personalizing it with extra constraints w.r.t. the application under construction, and then managing a data set using the resulting schema. In this paper, we extend the existing definitions of modules and we introduce novel properties of robustness that provide means for checking easily that a robust module-based DMS evolves safely w.r.t. both the schema and the data of the reference DMS. We carry out our investigations in the setting of description logics which underlie modern ontology languages, like RDFS, OWL, and OWL2 from W3C. Notably, we focus on the DL-liteA dialect of the DL-lite family, which encompasses the foundations of the QL profile of OWL2 (i.e., DL-liteR): the W3C recommendation for efficiently managing large data sets. 12. Protecting Sensitive Labels in Social Network Data Anonymization Abstract : Privacy is one of the major concerns when publishing or sharing social network data for social science research and business analysis. Recently, researchers have developed privacy models similar to k-anonymity to prevent node reidentification through structure information. However, even when these privacy models are enforced, an attacker may still be able to infer one’s private information if a group of nodes largely share the same sensitive labels (i.e., attributes). In other words, the label-node relationship is not well protected by pure structure
  45. 45. TYPICAL SOFT TECHNOLOGIES anonymization methods. Furthermore, existing approaches, which rely on edge editing or node clustering, may significantly alter key graph properties. In this paper, we define a k-degree-l-diversity anonymity model that considers the protection of structural information as well as sensitive labels of individuals. We further propose a novel anonymization methodology based on adding noise nodes. We develop a new algorithm by adding noise nodes into the original graph with the consideration of introducing the least distortion to graph properties. Most importantly, we provide a rigorous analysis of the theoretical bounds on the number of noise nodes added and their impacts on an important graph property. We conduct extensive experiments to evaluate the effectiveness of the proposed technique. 13. A Proxy-Based Approach to Continuous Location-Based Spatial Queries in Mobile Environments Abstract : Caching valid regions of spatial queries at mobile clients is effective in reducing the number of queries submitted by mobile clients and query load on the server. However, mobile clients suffer from longer waiting times for the server to compute valid regions. We propose in this paper a proxy-based approach to continuous nearest-neighbor (NN) and window queries. The proxy creates estimated valid regions (EVRs) for mobile clients by exploiting spatial and temporal locality of spatial queries. For NN queries, we devise two new algorithms to accelerate EVR growth, leading the proxy to build effective EVRs even when the cache size is small. On the other hand, we propose to represent the EVRs of window queries in the form of vectors, called estimated window vectors (EWVs), to achieve larger estimated valid regions. This novel representation and the associated creation algorithm result in more effective EVRs of window queries. 
In addition, owing to their distinct characteristics, we use separate index structures, namely EVR-tree and grid index, for NN queries and window queries, respectively. To further increase efficiency, we develop algorithms to exploit the results of NN queries to aid grid index growth,
  46. 46. TYPICAL SOFT TECHNOLOGIES benefiting EWV creation of window queries. Similarly, the grid index is utilized to support NN query answering and EVR updating. We conduct several experiments for performance evaluation. The experimental results show that the proposed approach significantly outperforms the existing proxy-based approaches. 14. A Fast Clustering-Based Feature Subset Selection Algorithm for High-Dimensional Data Abstract : Feature selection involves identifying a subset of the most useful features that produces results compatible with those of the original entire set of features. A feature selection algorithm may be evaluated from both the efficiency and effectiveness points of view. While the efficiency concerns the time required to find a subset of features, the effectiveness is related to the quality of the subset of features. Based on these criteria, a fast clustering-based feature selection algorithm (FAST) is proposed and experimentally evaluated in this paper. The FAST algorithm works in two steps. In the first step, features are divided into clusters by using graph-theoretic clustering methods. In the second step, the most representative feature that is strongly related to target classes is selected from each cluster to form a subset of features. Because features in different clusters are relatively independent, the clustering-based strategy of FAST has a high probability of producing a subset of useful and independent features. To ensure the efficiency of FAST, we adopt the efficient minimum-spanning tree (MST) clustering method. The efficiency and effectiveness of the FAST algorithm are evaluated through an empirical study. 
Extensive experiments are carried out to compare FAST and several representative feature selection algorithms, namely, FCBF, ReliefF, CFS, Consist, and FOCUS-SF, with respect to four types of well-known classifiers, namely, the probability-based Naive Bayes, the tree-based C4.5, the instance-based IB1, and the rule-based RIPPER before and after feature selection. The results, on 35 publicly available real-world high-dimensional image,
  47. 47. TYPICAL SOFT TECHNOLOGIES microarray, and text data, demonstrate that FAST not only produces smaller subsets of features but also improves the performances of the four types of classifiers. 15. Ranking on Data Manifold with Sink Points Abstract : Ranking is an important problem in various applications, such as Information Retrieval (IR), natural language processing, computational biology, and social sciences. Many ranking approaches have been proposed to rank objects according to their degrees of relevance or importance. Beyond these two goals, diversity has also been recognized as a crucial criterion in ranking. Top ranked results are expected to convey as little redundant information as possible, and cover as many aspects as possible. However, existing ranking approaches either take no account of diversity, or handle it separately with some heuristics. In this paper, we introduce a novel approach, Manifold Ranking with Sink Points (MRSP), to address diversity as well as relevance and importance in ranking. Specifically, our approach uses a manifold ranking process over the data manifold, which can naturally find the most relevant and important data objects. Meanwhile, by turning ranked objects into sink points on the data manifold, we can effectively prevent redundant objects from receiving a high rank. MRSP not only shows a nice convergence property, but also has an interesting and satisfying optimization explanation. We applied MRSP on two application tasks, update summarization and query recommendation, where diversity is of great concern in ranking. Experimental results on both tasks present a strong empirical performance of MRSP as compared to existing ranking approaches.
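The sink-point idea in the ranking abstract can be sketched as a greedy loop: rank all objects by score propagation over a similarity graph, pick the top object, pin its score to zero so that it stops passing score to its near-duplicates, and repeat. The sketch below assumes the common manifold-ranking update f = αSf + (1 − α)y with symmetric normalisation S = D^(-1/2) W D^(-1/2). The abstract does not spell out these details, so the update rule, the parameter `alpha`, the prior vector `y`, and the function names are assumptions of this example.

```python
import math


def manifold_rank(W, y, sinks, alpha=0.85, iters=200):
    """Propagate prior scores y over similarity matrix W; sinks stay at 0.

    W must be symmetric with a zero diagonal, and every object is assumed
    to have at least one neighbour so that its degree is positive.
    """
    n = len(W)
    deg = [sum(row) for row in W]
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) if W[i][j] else 0.0
          for j in range(n)] for i in range(n)]
    f = [0.0 if i in sinks else y[i] for i in range(n)]
    for _ in range(iters):
        f = [0.0 if i in sinks
             else alpha * sum(S[i][j] * f[j] for j in range(n))
             + (1 - alpha) * y[i]
             for i in range(n)]
    return f


def rank_with_sinks(W, y, k, alpha=0.85):
    """Greedily pick k objects, turning each pick into a sink point."""
    sinks, picked = set(), []
    for _ in range(k):
        f = manifold_rank(W, y, sinks, alpha)
        candidates = [i for i in range(len(W)) if i not in sinks]
        best = max(candidates, key=lambda i: f[i])
        picked.append(best)
        sinks.add(best)
    return picked


if __name__ == "__main__":
    # Objects 0-2 are near-duplicates; objects 3-4 form a second topic.
    W = [
        [0, 1, 1, 0, 0],
        [1, 0, 1, 0, 0],
        [1, 1, 0, 0, 0],
        [0, 0, 0, 0, 1],
        [0, 0, 0, 1, 0],
    ]
    y = [1.0, 0.9, 0.8, 0.7, 0.1]  # prior relevance of each object
    print(rank_with_sinks(W, y, k=3))
```

Without sink points the three near-duplicates would fill the top three positions in this toy graph; with them, the second pick comes from the other topic.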
  48. 48. TYPICAL SOFT TECHNOLOGIES 16. Tweet Analysis for Real-Time Event Detection and Earthquake Reporting System Development Abstract : Twitter has received much attention recently. An important characteristic of Twitter is its real-time nature. We investigate the real-time interaction of events such as earthquakes in Twitter and propose an algorithm to monitor tweets and to detect a target event. To detect a target event, we devise a classifier of tweets based on features such as the keywords in a tweet, the number of words, and their context. Subsequently, we produce a probabilistic spatiotemporal model for the target event that can find the center of the event location. We regard each Twitter user as a sensor and apply particle filtering, which is widely used for location estimation. The particle filter works better than other comparable methods for estimating the locations of target events. As an application, we develop an earthquake reporting system for use in Japan. Because of the numerous earthquakes and the large number of Twitter users throughout the country, we can detect an earthquake with high probability (93 percent of earthquakes of Japan Meteorological Agency (JMA) seismic intensity scale 3 or more are detected) merely by monitoring tweets. Our system detects earthquakes promptly and notification is delivered much faster than JMA broadcast announcements. 17. Clustering Sentence-Level Text Using a Novel Fuzzy Relational Clustering Algorithm Abstract : In comparison with hard clustering methods, in which a pattern belongs to a single cluster, fuzzy clustering algorithms allow patterns to belong to all clusters with differing degrees of membership. This is important in domains such as sentence clustering, since a sentence is likely to be related to more than one theme or topic present within a document or set of documents.
  49. 49. However, because most sentence similarity measures do not represent sentences in a common metric space, conventional fuzzy clustering approaches based on prototypes or mixtures of Gaussians are generally not applicable to sentence clustering. This paper presents a novel fuzzy clustering algorithm that operates on relational input data, i.e., data in the form of a square matrix of pairwise similarities between data objects. The algorithm uses a graph representation of the data and operates in an Expectation-Maximization framework in which the graph centrality of an object is interpreted as a likelihood. Results of applying the algorithm to sentence clustering tasks demonstrate that it is capable of identifying overlapping clusters of semantically related sentences, and that it is therefore of potential use in a variety of text mining tasks. We also include results of applying the algorithm to benchmark data sets in several other domains. 18. Distributed Processing of Probabilistic Top-k Queries in Wireless Sensor Networks Abstract: In this paper, we introduce the notions of sufficient set and necessary set for distributed processing of probabilistic top-k queries in cluster-based wireless sensor networks. These two concepts have properties that facilitate localized data pruning in clusters. Accordingly, we develop a suite of algorithms, namely, sufficient set-based (SSB), necessary set-based (NSB), and boundary-based (BB), for intercluster query processing with bounded rounds of communication. Moreover, in response to dynamic changes of data distribution in the network, we develop an adaptive algorithm that dynamically switches among the three proposed algorithms to minimize the transmission cost. We show the applicability of sufficient set and necessary set to wireless sensor networks with both two-tier hierarchical and tree-structured network topologies.
Experimental results show that the proposed algorithms reduce data transmissions significantly and incur only a small, constant number of rounds of data communication. The