The emerging widespread use of Peer-to-Peer computing is making the P2P Data Mining a
natural choice when data sets are distributed over such kind of systems. The huge amount of
data stored within the nodes of P2P networks and the bigger and bigger number of applications
dealing with them as p2p file-sharing, p2p chatting, p2p electronic commerce etc.., is moving
the spotlight on this challenging field. In this paper we give an overview of two different
approaches for implementing primitives for P2P Data Mining, trying then to show differences
and similarities. The first one is based on the definition of Local algorithms; the second one
relies on the Newscast model of computation.
Analytical Modelling of Localized P2P Streaming Systems under NAT ConsiderationIJCNCJournal
This document summarizes an analytical model for localized peer-to-peer (P2P) streaming systems that considers the impact of network address translation (NAT). It introduces theoretical boundaries for the number of peers that may be expelled from the system due to NAT incompatibility. It also presents a mathematical model for startup delay in P2P streaming that accounts for peers' NAT types. The document proposes a new neighbor selection algorithm that considers both autonomous system numbers and NAT types to improve connectivity while reducing transit traffic and startup delays.
Scale-Free Networks to Search in Unstructured Peer-To-Peer NetworksIOSR Journals
This document discusses using scale-free networks to improve search efficiency in unstructured peer-to-peer networks. It proposes the EQUATOR architecture, which creates an overlay network topology based on the scale-free Barabasi-Albert model. Simulation results show that EQUATOR achieves good lookup performance comparable to the ideal Barabasi-Albert network, with low message overhead even under node churn. The scale-free topology allows random walks to efficiently locate resources by directing searches to high-degree "hub" nodes with greater knowledge of the network.
A Distributed Approach to Solving Overlay Mismatching ProblemZhenyun Zhuang
This document proposes an algorithm called Adaptive Connection Establishment (ACE) to address the topology mismatch problem between the logical overlay network and physical underlying network in unstructured peer-to-peer systems. ACE builds a minimum spanning tree among each source node and its neighbors within a certain diameter, optimizes connections not on the tree to reduce redundant traffic, while retaining search scope. It evaluates tradeoffs between topology optimization and information exchange overhead by changing the diameter. Simulation results show ACE can significantly reduce unnecessary P2P traffic by efficiently matching the overlay and physical network topologies.
Hybrid Periodical Flooding in Unstructured Peer-to-Peer NetworksZhenyun Zhuang
This document proposes a new search mechanism called Hybrid Periodical Flooding (HPF) for unstructured peer-to-peer networks. HPF aims to reduce unnecessary traffic like blind flooding while also addressing the "partial coverage problem" of some statistics-based search mechanisms. It introduces the concept of Periodical Flooding (PF), which controls the number of neighbors a query is forwarded to based on the time-to-live value. This allows the forwarding behavior to change periodically over the query's lifetime. HPF then combines PF with weighted selection of neighbors based on multiple metrics to guide queries towards potentially relevant results while exploring more of the network.
The document presents a compartmental model for characterizing the spread of malware in peer-to-peer (P2P) networks like Gnutella. The model partitions peers into compartments based on their state - those wishing to download (S), currently downloading (E), having downloaded (I), and no longer interested (R). Differential equations track changes between compartments over time. Simulation results show the model effectively captures the impact of parameters like peer online/offline switching rates and quarantine strategies on malware intensity. The model improves on prior work by incorporating user behavior dynamics and limiting malware spread to a node's time-to-live range.
UTILIZING XAI TECHNIQUE TO IMPROVE AUTOENCODER BASED MODEL FOR COMPUTER NETWO...IJCNCJournal
Machine learning (ML) and Deep Learning (DL) methods are being adopted rapidly, especially in computer network security, such as fraud detection, network anomaly detection, intrusion detection, and much more. However, the lack of transparency of ML and DL based models is a major obstacle to their implementation and criticized due to its black-box nature, even with such tremendous results. Explainable Artificial Intelligence (XAI) is a promising area that can improve the trustworthiness of these models by giving explanations and interpreting its output. If the internal working of the ML and DL based models is understandable, then it can further help to improve its performance. The objective of this paper is to show that how XAI can be used to interpret the results of the DL model, the autoencoder in this case. And, based on the interpretation, we improved its performance for computer network anomaly detection. The kernel SHAP method, which is based on the shapley values, is used as a novel feature selection technique. This method is used to identify only those features that are actually causing the anomalous behaviour of the set of attack/anomaly instances. Later, these feature sets are used to train and validate the autoencoderbut on benign data only. Finally, the built SHAP_Model outperformed the other two models proposed based on the feature selection method. This whole experiment is conducted on the subset of the latest CICIDS2017 network dataset. The overall accuracy and AUC of SHAP_Model is 94% and 0.969, respectively.
FUZZY LOGIC-BASED EFFICIENT MESSAGE ROUTE SELECTION METHOD TO PROLONG THE NET...IJCNCJournal
- The document discusses a fuzzy logic-based method for efficient message routing in wireless sensor networks to prolong the network lifetime. It aims to balance energy load across nodes by selectively tagging nodes at risk of energy exhaustion and rerouting messages around them.
- It proposes using fuzzy logic to evaluate nodes based on their potential importance, energy level, and event occurrence frequency to determine tagging. Tagged nodes avoid routing traffic but still detect and generate reports.
- The method was tested by applying it to a probabilistic voting-based filtering security scheme and was shown to improve energy efficiency, node survival rate, and report transmission success compared to not tagging nodes.
Analytical Modelling of Localized P2P Streaming Systems under NAT ConsiderationIJCNCJournal
This document summarizes an analytical model for localized peer-to-peer (P2P) streaming systems that considers the impact of network address translation (NAT). It introduces theoretical boundaries for the number of peers that may be expelled from the system due to NAT incompatibility. It also presents a mathematical model for startup delay in P2P streaming that accounts for peers' NAT types. The document proposes a new neighbor selection algorithm that considers both autonomous system numbers and NAT types to improve connectivity while reducing transit traffic and startup delays.
Scale-Free Networks to Search in Unstructured Peer-To-Peer NetworksIOSR Journals
This document discusses using scale-free networks to improve search efficiency in unstructured peer-to-peer networks. It proposes the EQUATOR architecture, which creates an overlay network topology based on the scale-free Barabasi-Albert model. Simulation results show that EQUATOR achieves good lookup performance comparable to the ideal Barabasi-Albert network, with low message overhead even under node churn. The scale-free topology allows random walks to efficiently locate resources by directing searches to high-degree "hub" nodes with greater knowledge of the network.
A Distributed Approach to Solving Overlay Mismatching ProblemZhenyun Zhuang
This document proposes an algorithm called Adaptive Connection Establishment (ACE) to address the topology mismatch problem between the logical overlay network and physical underlying network in unstructured peer-to-peer systems. ACE builds a minimum spanning tree among each source node and its neighbors within a certain diameter, optimizes connections not on the tree to reduce redundant traffic, while retaining search scope. It evaluates tradeoffs between topology optimization and information exchange overhead by changing the diameter. Simulation results show ACE can significantly reduce unnecessary P2P traffic by efficiently matching the overlay and physical network topologies.
Hybrid Periodical Flooding in Unstructured Peer-to-Peer NetworksZhenyun Zhuang
This document proposes a new search mechanism called Hybrid Periodical Flooding (HPF) for unstructured peer-to-peer networks. HPF aims to reduce unnecessary traffic like blind flooding while also addressing the "partial coverage problem" of some statistics-based search mechanisms. It introduces the concept of Periodical Flooding (PF), which controls the number of neighbors a query is forwarded to based on the time-to-live value. This allows the forwarding behavior to change periodically over the query's lifetime. HPF then combines PF with weighted selection of neighbors based on multiple metrics to guide queries towards potentially relevant results while exploring more of the network.
The document presents a compartmental model for characterizing the spread of malware in peer-to-peer (P2P) networks like Gnutella. The model partitions peers into compartments based on their state - those wishing to download (S), currently downloading (E), having downloaded (I), and no longer interested (R). Differential equations track changes between compartments over time. Simulation results show the model effectively captures the impact of parameters like peer online/offline switching rates and quarantine strategies on malware intensity. The model improves on prior work by incorporating user behavior dynamics and limiting malware spread to a node's time-to-live range.
UTILIZING XAI TECHNIQUE TO IMPROVE AUTOENCODER BASED MODEL FOR COMPUTER NETWO...IJCNCJournal
Machine learning (ML) and Deep Learning (DL) methods are being adopted rapidly, especially in computer network security, such as fraud detection, network anomaly detection, intrusion detection, and much more. However, the lack of transparency of ML and DL based models is a major obstacle to their implementation and criticized due to its black-box nature, even with such tremendous results. Explainable Artificial Intelligence (XAI) is a promising area that can improve the trustworthiness of these models by giving explanations and interpreting its output. If the internal working of the ML and DL based models is understandable, then it can further help to improve its performance. The objective of this paper is to show that how XAI can be used to interpret the results of the DL model, the autoencoder in this case. And, based on the interpretation, we improved its performance for computer network anomaly detection. The kernel SHAP method, which is based on the shapley values, is used as a novel feature selection technique. This method is used to identify only those features that are actually causing the anomalous behaviour of the set of attack/anomaly instances. Later, these feature sets are used to train and validate the autoencoderbut on benign data only. Finally, the built SHAP_Model outperformed the other two models proposed based on the feature selection method. This whole experiment is conducted on the subset of the latest CICIDS2017 network dataset. The overall accuracy and AUC of SHAP_Model is 94% and 0.969, respectively.
FUZZY LOGIC-BASED EFFICIENT MESSAGE ROUTE SELECTION METHOD TO PROLONG THE NET...IJCNCJournal
- The document discusses a fuzzy logic-based method for efficient message routing in wireless sensor networks to prolong the network lifetime. It aims to balance energy load across nodes by selectively tagging nodes at risk of energy exhaustion and rerouting messages around them.
- It proposes using fuzzy logic to evaluate nodes based on their potential importance, energy level, and event occurrence frequency to determine tagging. Tagged nodes avoid routing traffic but still detect and generate reports.
- The method was tested by applying it to a probabilistic voting-based filtering security scheme and was shown to improve energy efficiency, node survival rate, and report transmission success compared to not tagging nodes.
Privacy Preserving Reputation Calculation in P2P Systems with Homomorphic Enc...IJCNCJournal
This document discusses a method for privacy-preserving reputation calculation in peer-to-peer systems using homomorphic encryption. Specifically, it proposes:
1) Extending the EigenTrust reputation system to calculate node reputations in a distributed manner while preserving evaluator privacy. It does this by successively updating encrypted reputation values through calculation to reflect trust values without disclosing the original values.
2) Improving calculation efficiency by offloading parts of the task to participating nodes and using different public keys during calculation to improve robustness against node churn.
3) Evaluating the performance of the proposed method, finding it reduces maximum circulation time for aggregating multiplication results by half, reducing computation time per round. The privacy preservation cost scales
MEKDA: Multi-Level ECC based Key Distribution and Authentication in Internet ...IJCNCJournal
The Internet of Things (IoT) is an extensive system of networks and connected devices with minimal human interaction and swift growth. The constraints of the System and limitations of Devices pose several challenges, including security; hence billions of devices must protect from attacks and compromises. The resource-constrained nature of IoT devices amplifies security challenges. Thus standard data communication and security measures are inefficient in the IoT environment. The ubiquity of IoT devices and their deployment in sensitive applications increase the vulnerability of any security breaches to risk lives. Hence, IoT-related security challenges are of great concern. Authentication is the solution to the vulnerability of a malicious device in the IoT environment. The proposed Multi-level Elliptic Curve Cryptography based Key Distribution and Authentication in IoT enhances the security by Multi-level Authentication when the devices enter or exit the Cluster in an IoT system. The decreased Computation Time and Energy Consumption by generating and distributing Keys using Elliptic Curve Cryptography extends the availability of the IoT devices. The Performance analysis shows the improvement over the Fast Authentication and Data Transfer method.
Ontology-Based Routing for Large-Scale Unstructured P2P Publish/Subscribe Systemtheijes
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
The papers for publication in The International Journal of Engineering& Science are selected through rigorous peer reviews to ensure originality, timeliness, relevance, and readability.
EFFECTIVE TOPOLOGY-AWARE PEER SELECTION IN UNSTRUCTURED PEER-TO-PEER SYSTEMSijp2p
Peer-to-Peer systems form logical overlay networks on top of the Internet. Essentially, peers randomly
choose logical neighbours without any knowledge about underlying physical topology. This may cause
inefficient communications among peers. This topology mismatch problem may result in poor
performance and scalability for Peer-to-Peer systems. A possible way to improve the performance of
Peer-to-Peer systems is the overlay network construction based on the knowledge of the physical network
topology. In this paper, we will propose the use of the “Record Route” and “Timestamp” options
supported in the IP protocol to explore the paths between peers. By the topology-aware peer selection,
our approach outperforms traditional P2P systems using random peer selection. Our approach only
incurs a low overhead and can be deployed easily in various P2P systems.
PUBLIC INTEGRIYT AUDITING FOR SHARED DYNAMIC DATA STORAGE UNDER ONTIME GENERA...paperpublications3
Abstract: Nowadays verifying the result of the remote computation plays a crucial role in addressing in issue of trust. The outsourced data collection comes for multiple data sources to diagnose the originator of errors by allotting each data sources a unique secrete key which requires the inner product conformation to be performed under any two parties different keys. The proposed methods outperform AISM technique to minimize the running time. The multi-key setting is given different secrete keys, multiple data sources can be upload their data streams along with their respective verifiable homomorphic tag. The AISM consist of three novel join techniques depending on the ADS availability: (i) Authenticated Indexed Sort Merge Join (AISM), which utilizes a single ADS on the join attribute, (ii) Authenticated Index Merge Join (AIM) that requires an ADS (on the join attribute) for both relations, and (iii) Authenticated Sort Merge Join (ASM), which does not rely on any ADS. The client should allow choosing any portion in the data streams for queries. The communication between the client and server is independent of input size. The inner product evaluation can be performed by any two sources and the result can be verified by using the particular tag.
Keywords: Computation of outsourcing, Data Stream, Multiple Key, Homomorphic encryption.
Title: PUBLIC INTEGRIYT AUDITING FOR SHARED DYNAMIC DATA STORAGE UNDER ONTIME GENERATED MULTIPLE KEYS
Author: C. NISHA MALAR, M. S. BONSHIA BINU
ISSN 2350-1049
International Journal of Recent Research in Interdisciplinary Sciences (IJRRIS)
Paper Publications
The peer-reviewed International Journal of Engineering Inventions (IJEI) is started with a mission to encourage contribution to research in Science and Technology. Encourage and motivate researchers in challenging areas of Sciences and Technology.
SECURITY CONSIDERATION IN PEER-TO-PEER NETWORKS WITH A CASE STUDY APPLICATIONIJNSA Journal
Peer-to-Peer (P2P) overlay networks wide adoption has also created vast dangers due to the millions of users who are not conversant with the potential security risks. Lack of centralized control creates great risks to the P2P systems. This is mainly due to the inability to implement proper authentication approaches for threat management. The best possible solutions, however, include encryption, utilization of administration, implementing cryptographic protocols, avoiding personal file sharing, and unauthorized downloads. Recently a new non-DHT based structured P2P system is very suitable for designing secured communication protocols. This approach is based on Linear Diophantine Equation (LDE) [1]. The P2P architectures based on this protocol offer simplified methods to integrate symmetric and asymmetric cryptographies’ solutions into the P2P architecture with no need of utilizing Transport Layer Security (TLS), and its predecessor, Secure Sockets Layer (SSL) protocols.
A COOPERATIVE LOCALIZATION METHOD BASED ON V2I COMMUNICATION AND DISTANCE INF...IJCNCJournal
Relative positions are recent solutions to overcome the limited accuracy of GPS in urban environment.
Vehicle positions obtained using V2I communication are more accurate because the known roadside unit
(RSU) locations help predict errors in measurements over time. The accuracy of vehicle positions depends
more on the number of RSUs; however, the high installation cost limits the use of this approach. It also
depends on nonlinear localization nature. They were neglected in several research papers. In these studies,
the accumulated errors increased with time due to the linearity localization problem. In the present study,
a cooperative localization method based on V2I communication and distance information in vehicular
networks is proposed for improving the estimates of vehicles’ initial positions. This method assumes that
the virtual RSUs based on mobility measurements help reduce installation costs and facilitate in handling
fault environments. The extended Kalman filter algorithm is a well-known estimator in nonlinear problem,
but it requires well initial vehicle position vector and adaptive noise in measurements. Using the proposed
method, vehicles’ initial positions can be estimated accurately. The experimental results confirm that the
proposed method has superior accuracy than existing methods, giving a root mean square error of
approximately 1 m. In addition, it is shown that virtual RSUs can assist in estimating initial positions in
fault environments.
Mobile Hosts Participating in Peer-to-Peer Data Networks: Challenges and Solu...Zhenyun Zhuang
Wireless Networks (2010)
http://dl.acm.org/citation.cfm?id=1873504
Peer-to-peer (P2P) data networks dominate
Internet traffic, accounting for over 60% of the overall
traffic in a recent study. In this work, we study the
problems that arise when mobile hosts participate in
P2P networks. We primarily focus on the performance
issues as experienced by the mobile host, but also study
the impact on other fixed peers. Using BitTorrent as a
key example, we identify several unique problems that
arise due to the design aspects of P2P networks being
incompatible with typical characteristics of wireless
and mobile environments. Using the insights gained
through our study, we present a wireless P2P (wP2P)
client application that is backward compatible with existing
fixed-peer client applications, but when used on
mobile hosts can provide significant performance improvements.
P2P DOMAIN CLASSIFICATION USING DECISION TREE ijp2p
The increasing interest in Peer-to-Peer systems (such as Gnutella) has inspired many research activities
in this area. Although many demonstrations have been performed that show that the performance of a
Peer-to-Peer system is highly dependent on the underlying network characteristics, much of the
evaluation of Peer-to-Peer proposals has used simplified models that fail to include a detailed model of
the underlying network. This can be largely attributed to the complexity in experimenting with a scalable
Peer-to-Peer system simulator built on top of a scalable network simulator. A major problem of
unstructured P2P systems is their heavy network traffic. In Peer-to-Peer context, a challenging problem
is how to find the appropriate peer to deal with a given query without overly consuming bandwidth?
Different methods proposed routing strategies of queries taking into account the P2P network at hand.
This paper considers an unstructured P2P system based on an organization of peers around Super-Peers
that are connected to Super-Super-Peer according to their semantic domains; in addition to integrating
Decision Trees in P2P architectures to produce Query-Suitable Super-Peers, representing a community
of peers where one among them is able to answer the given query. By analyzing the queries log file, a
predictive model that avoids flooding queries in the P2P network is constructed after predicting the
appropriate Super-Peer, and hence the peer to answer the query. A challenging problem in a schemabased Peer-to-Peer (P2P) system is how to locate peers that are relevant to a given query. In this paper,
architecture, based on (Super-)Peers is proposed, focusing on query routing. The approach to be
implemented, groups together (Super-)Peers that have similar interests for an efficient query routing
method. In such groups, called Super-Super-Peers (SSP), Super-Peers submit queries that are often
processed by members of this group. A SSP is a specific Super-Peer which contains knowledge about: 1.
its Super-Peers and 2. The other SSP. Knowledge is extracted by using data mining techniques (e.g.
Decision Tree algorithms) starting from queries of peers that transit on the network. The advantage of
this distributed knowledge is that, it avoids making semantic mapping between heterogeneous data
sources owned by (Super-)Peers, each time the system decides to route query to other (Super-) Peers.
The set of SSP improves the robustness in queries routing mechanism, and the scalability in P2P
Network. Compared with a baseline approach,the proposal architecture shows the effect of the data
mining with better performance in respect to response time and precision.
A Cooperative Peer Clustering Scheme for Unstructured Peer-to-Peer Systemsijp2p
This document summarizes a research paper that proposes a cooperative peer clustering scheme for unstructured peer-to-peer networks. The proposed scheme aims to improve search performance by identifying critical links between peers and allowing local reconfiguration while incorporating a retaliation rule to encourage cooperation. Simulation results indicate the proposed scheme improves search hit rates over previous schemes, and cooperative peers receive higher profits than selfish peers.
Online stream mining approach for clustering network trafficeSAT Journals
Abstract A large number of research have been proposed on intrusion detection system, which leads to the implementation of agent based intelligent IDS (IIDS), Non – intelligent IDS (NIDS), signature based IDS etc. While building such IDS models, learning algorithms from flow of network traffic plays crucial role in accuracy of IDS systems. The proposed work focuses on implementing the novel method to cluster network traffic which eliminates the limitations in existing online clustering algorithms and prove the robustness and accuracy over large stream of network traffic arriving at extremely high rate. We compare the existing algorithm with novel methods to analyse the accuracy and complexity. Keywords— NIDS, Data Stream Mining, Online Clustering, RAH algorithm, Online Efficient Incremental Clustering algorithm
Transfer reliability and congestion control strategies in opportunistic netwo...IEEEFINALYEARPROJECTS
The document discusses transfer reliability and congestion control strategies in opportunistic networks. It begins by stating that opportunistic networks have unpredictable node contacts and rarely have complete end-to-end paths. It then discusses how modified TCP protocols are ineffective for these networks and they require different approaches than intermittently connected networks. The document surveys proposals for transfer reliability using hop-by-hop custody transfer and end-to-end receipts. It also categorizes storage congestion control based on single or multiple message copies. It identifies open research issues including replication management and drop policies for multiple copies.
JAVA 2013 IEEE NETWORKING PROJECT Transfer reliability and congestion control...IEEEGLOBALSOFTTECHNOLOGIES
The document discusses transfer reliability and congestion control strategies in opportunistic networks. It begins by stating that opportunistic networks have unpredictable node contacts and rarely have complete end-to-end paths. It notes that modified TCP protocols are ineffective for these networks. The document then surveys proposals for ensuring reliable data transfer and avoiding network congestion in opportunistic networks. It categorizes existing proposals and identifies mechanisms like hop-by-hop custody transfer and end-to-end receipts for reliability. For congestion control, it discusses replication management, drop policies, and considering message copy numbers. The document concludes by identifying open research challenges.
A NEW ALGORITHM FOR CONSTRUCTION OF A P2P MULTICAST HYBRID OVERLAY TREE BASED...csandit
In the last decade Peer to Peer technology has been thoroughly explored, because it overcomes many limitations compared to the traditional client server paradigm. Despite its advantages over a traditional approach, the ubiquitous availability of high speed, high bandwidth and low latency networks has supported the traditional client-server paradigm. Recently, however, the surge of streaming services has spawned renewed interest in Peer to Peer technologies. In addition, services like geolocation databases and browser technologies like Web-RTC make a hybrid approach attractive.
A NEW ALGORITHM FOR CONSTRUCTION OF A P2P MULTICAST HYBRID OVERLAY TREE BASED...cscpconf
In the last decade Peer to Peer technology has been thoroughly explored, because it overcomes many limitations compared to the traditional client server paradigm. Despite its advantages over a traditional approach, the ubiquitous availability of high speed, high bandwidth and low latency networks has supported the traditional client-server paradigm. Recently, however, the surge of streaming services has spawned renewed interest in Peer to Peer technologies. In addition, services like geolocation databases and browser technologies like Web-RTC make a hybrid approach attractive. In this paper we present algorithms for the construction and the maintenance of a hybrid P2P overlay multicast tree based on topological distances. The essential idea of these algorithms is to build a multicast tree by choosing neighbours close to each other. The topological distances can be easily obtained by the browser using the geolocation API. Thus the implementation of algorithms can be done web-based in a distributed manner. We present proofs of our algorithms as well as practical results and evaluations.
A P2P Job Assignment Protocol For Volunteer Computing SystemsAshley Smith
This document proposes a peer-to-peer job assignment protocol for volunteer computing systems. It introduces a distributed algorithm that aims to efficiently distribute jobs to workers in a decentralized manner. The key aspects are:
1) Jobs are described by job adverts that are distributed to multiple job assigners by a job manager.
2) Workers request jobs by querying job assigners, who match workers to available job adverts.
3) Input data is retrieved in a similar decentralized fashion through data queries and responses from data centers caching input files.
4) A simulation study shows this decentralized approach can improve performance metrics like overall job completion time and network load balancing, compared to a centralized job assignment system.
Algorithm selection for sorting in embedded and mobile systemsJigisha Aryya
Algorithm selection aims to solve problems like sorting using the most efficient algorithm by analyzing data characteristics. For resource-constrained embedded systems, this can improve energy efficiency by reducing computation time and eliminating unnecessary software components. The document proposes using algorithm selection with a sliding window approach for data stream mining to sample and analyze data on-board instead of transmitting all data over bandwidth-limited wireless networks. This allows for sorting and other computations to be performed locally, saving energy compared to sending all data for remote processing. Eliminating the random number generator used for sorting could also improve energy efficiency.
IDENTIFICATION OF EFFICIENT PEERS IN P2P COMPUTING SYSTEM FOR REAL TIME APPLI...ijp2p
Currently the Peer-to-Peer computing paradigm rises as an economic solution for the large scale
computation problems. However due to the dynamic nature of peers it is very difficult to use this type of
systems for the computations of real time applications. Strict deadline of scientific and real time
applications require predictable performance in such applications. We propose an algorithm to identify the
group of reliable peers, from the available peers on the Internet, for the processing of real time
application’s tasks. The algorithm is based on joint evaluation of peer properties like peer availability,
credibility, computation time and the turnaround time of the peer with respect to the task distributor peer.
Here we also define a method to calculate turnaround time (distance) on task distributor peers at
application level.
IDENTIFICATION OF EFFICIENT PEERS IN P2P COMPUTING SYSTEM FOR REAL TIME APPLI...ijp2p
Currently the Peer-to-Peer computing paradigm rises as an economic solution for the large scale
computation problems. However due to the dynamic nature of peers it is very difficult to use this type of
systems for the computations of real time applications. Strict deadline of scientific and real time
applications require predictable performance in such applications. We propose an algorithm to identify the
group of reliable peers, from the available peers on the Internet, for the processing of real time
application’s tasks. The algorithm is based on joint evaluation of peer properties like peer availability,
credibility, computation time and the turnaround time of the peer with respect to the task distributor peer.
Here we also define a method to calculate turnaround time (distance) on task distributor peers at
application level.
final Year Projects, Final Year Projects in Chennai, Software Projects, Embedded Projects, Microcontrollers Projects, DSP Projects, VLSI Projects, Matlab Projects, Java Projects, .NET Projects, IEEE Projects, IEEE 2009 Projects, IEEE 2009 Projects, Software, IEEE 2009 Projects, Embedded, Software IEEE 2009 Projects, Embedded IEEE 2009 Projects, Final Year Project Titles, Final Year Project Reports, Final Year Project Review, Robotics Projects, Mechanical Projects, Electrical Projects, Power Electronics Projects, Power System Projects, Model Projects, Java Projects, J2EE Projects, Engineering Projects, Student Projects, Engineering College Projects, MCA Projects, BE Projects, BTech Projects, ME Projects, MTech Projects, Wireless Networks Projects, Network Security Projects, Networking Projects, final year projects, ieee projects, student projects, college projects, ieee projects in chennai, java projects, software ieee projects, embedded ieee projects, "ieee2009projects", "final year projects", "ieee projects", "Engineering Projects", "Final Year Projects in Chennai", "Final year Projects at Chennai", Java Projects, ASP.NET Projects, VB.NET Projects, C# Projects, Visual C++ Projects, Matlab Projects, NS2 Projects, C Projects, Microcontroller Projects, ATMEL Projects, PIC Projects, ARM Projects, DSP Projects, VLSI Projects, FPGA Projects, CPLD Projects, Power Electronics Projects, Electrical Projects, Robotics Projects, Solor Projects, MEMS Projects, J2EE Projects, J2ME Projects, AJAX Projects, Structs Projects, EJB Projects, Real Time Projects, Live Projects, Student Projects, Engineering Projects, MCA Projects, MBA Projects, College Projects, BE Projects, BTech Projects, ME Projects, MTech Projects, M.Sc Projects, Final Year Java Projects, Final Year ASP.NET Projects, Final Year VB.NET Projects, Final Year C# Projects, Final Year Visual C++ Projects, Final Year Matlab Projects, Final Year NS2 Projects, Final Year C Projects, Final Year Microcontroller Projects, Final Year ATMEL Projects, Final Year PIC Projects, Final Year ARM Projects, Final Year DSP Projects, Final Year VLSI Projects, Final Year FPGA Projects, Final Year CPLD Projects, Final Year Power Electronics Projects, Final Year Electrical Projects, Final Year Robotics Projects, Final Year Solor Projects, Final Year MEMS Projects, Final Year J2EE Projects, Final Year J2ME Projects, Final Year AJAX Projects, Final Year Structs Projects, Final Year EJB Projects, Final Year Real Time Projects, Final Year Live Projects, Final Year Student Projects, Final Year Engineering Projects, Final Year MCA Projects, Final Year MBA Projects, Final Year College Projects, Final Year BE Projects, Final Year BTech Projects, Final Year ME Projects, Final Year MTech Projects, Final Year M.Sc Projects, IEEE Java Projects, ASP.NET Projects, VB.NET Projects, C# Projects, Visual C++ Projects, Matlab Projects, NS2 Projects, C Projects, Microcontroller Projects, ATMEL Projects, PIC Projects, ARM Projects, DSP Projects, VLSI Projects, FPGA Projects, CPLD Projects, Power Electronics Projects, Electrical Projects, Robotics Projects, Solor Projects, MEMS Projects, J2EE Projects, J2ME Projects, AJAX Projects, Structs Projects, EJB Projects, Real Time Projects, Live Projects, Student Projects, Engineering Projects, MCA Projects, MBA Projects, College Projects, BE Projects, BTech Projects, ME Projects, MTech Projects, M.Sc Projects, IEEE 2009 Java Projects, IEEE 2009 ASP.NET Projects, IEEE 2009 VB.NET Projects, IEEE 2009 C# Projects, IEEE 2009 Visual C++ Projects, IEEE 2009 Matlab Projects, IEEE 2009 NS2 Projects, IEEE 2009 C Projects, IEEE 2009 Microcontroller Projects, IEEE 2009 ATMEL Projects, IEEE 2009 PIC Projects, IEEE 2009 ARM Projects, IEEE 2009 DSP Projects, IEEE 2009 VLSI Projects, IEEE 2009 FPGA Projects, IEEE 2009 CPLD Projects, IEEE 2009 Power Electronics Projects, IEEE 2009 Electrical Projects, IEEE 2009 Robotics Projects, IEEE 2009 Solor Projects, IEEE 2009 MEMS Projects, IEEE 2009 J2EE P
Privacy Preserving Reputation Calculation in P2P Systems with Homomorphic Enc...IJCNCJournal
This document discusses a method for privacy-preserving reputation calculation in peer-to-peer systems using homomorphic encryption. Specifically, it proposes:
1) Extending the EigenTrust reputation system to calculate node reputations in a distributed manner while preserving evaluator privacy. It does this by successively updating encrypted reputation values through calculation to reflect trust values without disclosing the original values.
2) Improving calculation efficiency by offloading parts of the task to participating nodes and using different public keys during calculation to improve robustness against node churn.
3) Evaluating the performance of the proposed method, finding it reduces maximum circulation time for aggregating multiplication results by half, reducing computation time per round. The privacy preservation cost scales
MEKDA: Multi-Level ECC based Key Distribution and Authentication in Internet ...IJCNCJournal
The Internet of Things (IoT) is an extensive system of networks and connected devices with minimal human interaction and swift growth. The constraints of the System and limitations of Devices pose several challenges, including security; hence billions of devices must protect from attacks and compromises. The resource-constrained nature of IoT devices amplifies security challenges. Thus standard data communication and security measures are inefficient in the IoT environment. The ubiquity of IoT devices and their deployment in sensitive applications increase the vulnerability of any security breaches to risk lives. Hence, IoT-related security challenges are of great concern. Authentication is the solution to the vulnerability of a malicious device in the IoT environment. The proposed Multi-level Elliptic Curve Cryptography based Key Distribution and Authentication in IoT enhances the security by Multi-level Authentication when the devices enter or exit the Cluster in an IoT system. The decreased Computation Time and Energy Consumption by generating and distributing Keys using Elliptic Curve Cryptography extends the availability of the IoT devices. The Performance analysis shows the improvement over the Fast Authentication and Data Transfer method.
Ontology-Based Routing for Large-Scale Unstructured P2P Publish/Subscribe Systemtheijes
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
The papers for publication in The International Journal of Engineering& Science are selected through rigorous peer reviews to ensure originality, timeliness, relevance, and readability.
EFFECTIVE TOPOLOGY-AWARE PEER SELECTION IN UNSTRUCTURED PEER-TO-PEER SYSTEMSijp2p
Peer-to-Peer systems form logical overlay networks on top of the Internet. Essentially, peers randomly
choose logical neighbours without any knowledge about underlying physical topology. This may cause
inefficient communications among peers. This topology mismatch problem may result in poor
performance and scalability for Peer-to-Peer systems. A possible way to improve the performance of
Peer-to-Peer systems is the overlay network construction based on the knowledge of the physical network
topology. In this paper, we will propose the use of the “Record Route” and “Timestamp” options
supported in the IP protocol to explore the paths between peers. By the topology-aware peer selection,
our approach outperforms traditional P2P systems using random peer selection. Our approach only
incurs a low overhead and can be deployed easily in various P2P systems.
PUBLIC INTEGRIYT AUDITING FOR SHARED DYNAMIC DATA STORAGE UNDER ONTIME GENERA...paperpublications3
Abstract: Nowadays verifying the result of the remote computation plays a crucial role in addressing in issue of trust. The outsourced data collection comes for multiple data sources to diagnose the originator of errors by allotting each data sources a unique secrete key which requires the inner product conformation to be performed under any two parties different keys. The proposed methods outperform AISM technique to minimize the running time. The multi-key setting is given different secrete keys, multiple data sources can be upload their data streams along with their respective verifiable homomorphic tag. The AISM consist of three novel join techniques depending on the ADS availability: (i) Authenticated Indexed Sort Merge Join (AISM), which utilizes a single ADS on the join attribute, (ii) Authenticated Index Merge Join (AIM) that requires an ADS (on the join attribute) for both relations, and (iii) Authenticated Sort Merge Join (ASM), which does not rely on any ADS. The client should allow choosing any portion in the data streams for queries. The communication between the client and server is independent of input size. The inner product evaluation can be performed by any two sources and the result can be verified by using the particular tag.
Keywords: Computation of outsourcing, Data Stream, Multiple Key, Homomorphic encryption.
Title: PUBLIC INTEGRIYT AUDITING FOR SHARED DYNAMIC DATA STORAGE UNDER ONTIME GENERATED MULTIPLE KEYS
Author: C. NISHA MALAR, M. S. BONSHIA BINU
ISSN 2350-1049
International Journal of Recent Research in Interdisciplinary Sciences (IJRRIS)
Paper Publications
The peer-reviewed International Journal of Engineering Inventions (IJEI) is started with a mission to encourage contribution to research in Science and Technology. Encourage and motivate researchers in challenging areas of Sciences and Technology.
SECURITY CONSIDERATION IN PEER-TO-PEER NETWORKS WITH A CASE STUDY APPLICATIONIJNSA Journal
Peer-to-Peer (P2P) overlay networks wide adoption has also created vast dangers due to the millions of users who are not conversant with the potential security risks. Lack of centralized control creates great risks to the P2P systems. This is mainly due to the inability to implement proper authentication approaches for threat management. The best possible solutions, however, include encryption, utilization of administration, implementing cryptographic protocols, avoiding personal file sharing, and unauthorized downloads. Recently a new non-DHT based structured P2P system is very suitable for designing secured communication protocols. This approach is based on Linear Diophantine Equation (LDE) [1]. The P2P architectures based on this protocol offer simplified methods to integrate symmetric and asymmetric cryptographies’ solutions into the P2P architecture with no need of utilizing Transport Layer Security (TLS), and its predecessor, Secure Sockets Layer (SSL) protocols.
A COOPERATIVE LOCALIZATION METHOD BASED ON V2I COMMUNICATION AND DISTANCE INF...IJCNCJournal
Relative positions are recent solutions to overcome the limited accuracy of GPS in urban environment.
Vehicle positions obtained using V2I communication are more accurate because the known roadside unit
(RSU) locations help predict errors in measurements over time. The accuracy of vehicle positions depends
more on the number of RSUs; however, the high installation cost limits the use of this approach. It also
depends on nonlinear localization nature. They were neglected in several research papers. In these studies,
the accumulated errors increased with time due to the linearity localization problem. In the present study,
a cooperative localization method based on V2I communication and distance information in vehicular
networks is proposed for improving the estimates of vehicles’ initial positions. This method assumes that
the virtual RSUs based on mobility measurements help reduce installation costs and facilitate in handling
fault environments. The extended Kalman filter algorithm is a well-known estimator in nonlinear problem,
but it requires well initial vehicle position vector and adaptive noise in measurements. Using the proposed
method, vehicles’ initial positions can be estimated accurately. The experimental results confirm that the
proposed method has superior accuracy than existing methods, giving a root mean square error of
approximately 1 m. In addition, it is shown that virtual RSUs can assist in estimating initial positions in
fault environments.
Mobile Hosts Participating in Peer-to-Peer Data Networks: Challenges and Solu...Zhenyun Zhuang
Wireless Networks (2010)
http://dl.acm.org/citation.cfm?id=1873504
Peer-to-peer (P2P) data networks dominate
Internet traffic, accounting for over 60% of the overall
traffic in a recent study. In this work, we study the
problems that arise when mobile hosts participate in
P2P networks. We primarily focus on the performance
issues as experienced by the mobile host, but also study
the impact on other fixed peers. Using BitTorrent as a
key example, we identify several unique problems that
arise due to the design aspects of P2P networks being
incompatible with typical characteristics of wireless
and mobile environments. Using the insights gained
through our study, we present a wireless P2P (wP2P)
client application that is backward compatible with existing
fixed-peer client applications, but when used on
mobile hosts can provide significant performance improvements.
P2P DOMAIN CLASSIFICATION USING DECISION TREE ijp2p
The increasing interest in Peer-to-Peer systems (such as Gnutella) has inspired many research activities
in this area. Although many demonstrations have been performed that show that the performance of a
Peer-to-Peer system is highly dependent on the underlying network characteristics, much of the
evaluation of Peer-to-Peer proposals has used simplified models that fail to include a detailed model of
the underlying network. This can be largely attributed to the complexity in experimenting with a scalable
Peer-to-Peer system simulator built on top of a scalable network simulator. A major problem of
unstructured P2P systems is their heavy network traffic. In Peer-to-Peer context, a challenging problem
is how to find the appropriate peer to deal with a given query without overly consuming bandwidth?
Different methods proposed routing strategies of queries taking into account the P2P network at hand.
This paper considers an unstructured P2P system based on an organization of peers around Super-Peers
that are connected to Super-Super-Peer according to their semantic domains; in addition to integrating
Decision Trees in P2P architectures to produce Query-Suitable Super-Peers, representing a community
of peers where one among them is able to answer the given query. By analyzing the queries log file, a
predictive model that avoids flooding queries in the P2P network is constructed after predicting the
appropriate Super-Peer, and hence the peer to answer the query. A challenging problem in a schemabased Peer-to-Peer (P2P) system is how to locate peers that are relevant to a given query. In this paper,
architecture, based on (Super-)Peers is proposed, focusing on query routing. The approach to be
implemented, groups together (Super-)Peers that have similar interests for an efficient query routing
method. In such groups, called Super-Super-Peers (SSP), Super-Peers submit queries that are often
processed by members of this group. A SSP is a specific Super-Peer which contains knowledge about: 1.
its Super-Peers and 2. The other SSP. Knowledge is extracted by using data mining techniques (e.g.
Decision Tree algorithms) starting from queries of peers that transit on the network. The advantage of
this distributed knowledge is that, it avoids making semantic mapping between heterogeneous data
sources owned by (Super-)Peers, each time the system decides to route query to other (Super-) Peers.
The set of SSP improves the robustness in queries routing mechanism, and the scalability in P2P
Network. Compared with a baseline approach,the proposal architecture shows the effect of the data
mining with better performance in respect to response time and precision.
A Cooperative Peer Clustering Scheme for Unstructured Peer-to-Peer Systemsijp2p
This document summarizes a research paper that proposes a cooperative peer clustering scheme for unstructured peer-to-peer networks. The proposed scheme aims to improve search performance by identifying critical links between peers and allowing local reconfiguration while incorporating a retaliation rule to encourage cooperation. Simulation results indicate the proposed scheme improves search hit rates over previous schemes, and cooperative peers receive higher profits than selfish peers.
Online stream mining approach for clustering network trafficeSAT Journals
Abstract A large number of research have been proposed on intrusion detection system, which leads to the implementation of agent based intelligent IDS (IIDS), Non – intelligent IDS (NIDS), signature based IDS etc. While building such IDS models, learning algorithms from flow of network traffic plays crucial role in accuracy of IDS systems. The proposed work focuses on implementing the novel method to cluster network traffic which eliminates the limitations in existing online clustering algorithms and prove the robustness and accuracy over large stream of network traffic arriving at extremely high rate. We compare the existing algorithm with novel methods to analyse the accuracy and complexity. Keywords— NIDS, Data Stream Mining, Online Clustering, RAH algorithm, Online Efficient Incremental Clustering algorithm
Transfer reliability and congestion control strategies in opportunistic netwo...IEEEFINALYEARPROJECTS
The document discusses transfer reliability and congestion control strategies in opportunistic networks. It begins by stating that opportunistic networks have unpredictable node contacts and rarely have complete end-to-end paths. It then discusses how modified TCP protocols are ineffective for these networks and they require different approaches than intermittently connected networks. The document surveys proposals for transfer reliability using hop-by-hop custody transfer and end-to-end receipts. It also categorizes storage congestion control based on single or multiple message copies. It identifies open research issues including replication management and drop policies for multiple copies.
JAVA 2013 IEEE NETWORKING PROJECT Transfer reliability and congestion control...IEEEGLOBALSOFTTECHNOLOGIES
The document discusses transfer reliability and congestion control strategies in opportunistic networks. It begins by stating that opportunistic networks have unpredictable node contacts and rarely have complete end-to-end paths. It notes that modified TCP protocols are ineffective for these networks. The document then surveys proposals for ensuring reliable data transfer and avoiding network congestion in opportunistic networks. It categorizes existing proposals and identifies mechanisms like hop-by-hop custody transfer and end-to-end receipts for reliability. For congestion control, it discusses replication management, drop policies, and considering message copy numbers. The document concludes by identifying open research challenges.
A NEW ALGORITHM FOR CONSTRUCTION OF A P2P MULTICAST HYBRID OVERLAY TREE BASED...csandit
In the last decade Peer to Peer technology has been thoroughly explored, because it overcomes many limitations compared to the traditional client server paradigm. Despite its advantages over a traditional approach, the ubiquitous availability of high speed, high bandwidth and low latency networks has supported the traditional client-server paradigm. Recently, however, the surge of streaming services has spawned renewed interest in Peer to Peer technologies. In addition, services like geolocation databases and browser technologies like Web-RTC make a hybrid approach attractive.
A NEW ALGORITHM FOR CONSTRUCTION OF A P2P MULTICAST HYBRID OVERLAY TREE BASED...cscpconf
In the last decade Peer to Peer technology has been thoroughly explored, because it overcomes many limitations compared to the traditional client server paradigm. Despite its advantages over a traditional approach, the ubiquitous availability of high speed, high bandwidth and low latency networks has supported the traditional client-server paradigm. Recently, however, the surge of streaming services has spawned renewed interest in Peer to Peer technologies. In addition, services like geolocation databases and browser technologies like Web-RTC make a hybrid approach attractive. In this paper we present algorithms for the construction and the maintenance of a hybrid P2P overlay multicast tree based on topological distances. The essential idea of these algorithms is to build a multicast tree by choosing neighbours close to each other. The topological distances can be easily obtained by the browser using the geolocation API. Thus the implementation of algorithms can be done web-based in a distributed manner. We present proofs of our algorithms as well as practical results and evaluations.
A P2P Job Assignment Protocol For Volunteer Computing SystemsAshley Smith
This document proposes a peer-to-peer job assignment protocol for volunteer computing systems. It introduces a distributed algorithm that aims to efficiently distribute jobs to workers in a decentralized manner. The key aspects are:
1) Jobs are described by job adverts that are distributed to multiple job assigners by a job manager.
2) Workers request jobs by querying job assigners, who match workers to available job adverts.
3) Input data is retrieved in a similar decentralized fashion through data queries and responses from data centers caching input files.
4) A simulation study shows this decentralized approach can improve performance metrics like overall job completion time and network load balancing, compared to a centralized job assignment system.
Algorithm selection for sorting in embedded and mobile systemsJigisha Aryya
Algorithm selection aims to solve problems like sorting using the most efficient algorithm by analyzing data characteristics. For resource-constrained embedded systems, this can improve energy efficiency by reducing computation time and eliminating unnecessary software components. The document proposes using algorithm selection with a sliding window approach for data stream mining to sample and analyze data on-board instead of transmitting all data over bandwidth-limited wireless networks. This allows for sorting and other computations to be performed locally, saving energy compared to sending all data for remote processing. Eliminating the random number generator used for sorting could also improve energy efficiency.
IDENTIFICATION OF EFFICIENT PEERS IN P2P COMPUTING SYSTEM FOR REAL TIME APPLI...ijp2p
Currently the Peer-to-Peer computing paradigm rises as an economic solution for the large scale
computation problems. However due to the dynamic nature of peers it is very difficult to use this type of
systems for the computations of real time applications. Strict deadline of scientific and real time
applications require predictable performance in such applications. We propose an algorithm to identify the
group of reliable peers, from the available peers on the Internet, for the processing of real time
application’s tasks. The algorithm is based on joint evaluation of peer properties like peer availability,
credibility, computation time and the turnaround time of the peer with respect to the task distributor peer.
Here we also define a method to calculate turnaround time (distance) on task distributor peers at
application level.
IDENTIFICATION OF EFFICIENT PEERS IN P2P COMPUTING SYSTEM FOR REAL TIME APPLI...ijp2p
Currently the Peer-to-Peer computing paradigm rises as an economic solution for the large scale
computation problems. However due to the dynamic nature of peers it is very difficult to use this type of
systems for the computations of real time applications. Strict deadline of scientific and real time
applications require predictable performance in such applications. We propose an algorithm to identify the
group of reliable peers, from the available peers on the Internet, for the processing of real time
application’s tasks. The algorithm is based on joint evaluation of peer properties like peer availability,
credibility, computation time and the turnaround time of the peer with respect to the task distributor peer.
Here we also define a method to calculate turnaround time (distance) on task distributor peers at
application level.
final Year Projects, Final Year Projects in Chennai, Software Projects, Embedded Projects, Microcontrollers Projects, DSP Projects, VLSI Projects, Matlab Projects, Java Projects, .NET Projects, IEEE Projects, IEEE 2009 Projects, IEEE 2009 Projects, Software, IEEE 2009 Projects, Embedded, Software IEEE 2009 Projects, Embedded IEEE 2009 Projects, Final Year Project Titles, Final Year Project Reports, Final Year Project Review, Robotics Projects, Mechanical Projects, Electrical Projects, Power Electronics Projects, Power System Projects, Model Projects, Java Projects, J2EE Projects, Engineering Projects, Student Projects, Engineering College Projects, MCA Projects, BE Projects, BTech Projects, ME Projects, MTech Projects, Wireless Networks Projects, Network Security Projects, Networking Projects, final year projects, ieee projects, student projects, college projects, ieee projects in chennai, java projects, software ieee projects, embedded ieee projects, "ieee2009projects", "final year projects", "ieee projects", "Engineering Projects", "Final Year Projects in Chennai", "Final year Projects at Chennai", Java Projects, ASP.NET Projects, VB.NET Projects, C# Projects, Visual C++ Projects, Matlab Projects, NS2 Projects, C Projects, Microcontroller Projects, ATMEL Projects, PIC Projects, ARM Projects, DSP Projects, VLSI Projects, FPGA Projects, CPLD Projects, Power Electronics Projects, Electrical Projects, Robotics Projects, Solor Projects, MEMS Projects, J2EE Projects, J2ME Projects, AJAX Projects, Structs Projects, EJB Projects, Real Time Projects, Live Projects, Student Projects, Engineering Projects, MCA Projects, MBA Projects, College Projects, BE Projects, BTech Projects, ME Projects, MTech Projects, M.Sc Projects, Final Year Java Projects, Final Year ASP.NET Projects, Final Year VB.NET Projects, Final Year C# Projects, Final Year Visual C++ Projects, Final Year Matlab Projects, Final Year NS2 Projects, Final Year C Projects, Final Year Microcontroller Projects, Final Year ATMEL Projects, Final Year PIC Projects, Final Year ARM Projects, Final Year DSP Projects, Final Year VLSI Projects, Final Year FPGA Projects, Final Year CPLD Projects, Final Year Power Electronics Projects, Final Year Electrical Projects, Final Year Robotics Projects, Final Year Solor Projects, Final Year MEMS Projects, Final Year J2EE Projects, Final Year J2ME Projects, Final Year AJAX Projects, Final Year Structs Projects, Final Year EJB Projects, Final Year Real Time Projects, Final Year Live Projects, Final Year Student Projects, Final Year Engineering Projects, Final Year MCA Projects, Final Year MBA Projects, Final Year College Projects, Final Year BE Projects, Final Year BTech Projects, Final Year ME Projects, Final Year MTech Projects, Final Year M.Sc Projects, IEEE Java Projects, ASP.NET Projects, VB.NET Projects, C# Projects, Visual C++ Projects, Matlab Projects, NS2 Projects, C Projects, Microcontroller Projects, ATMEL Projects, PIC Projects, ARM Projects, DSP Projects, VLSI Projects, FPGA Projects, CPLD Projects, Power Electronics Projects, Electrical Projects, Robotics Projects, Solor Projects, MEMS Projects, J2EE Projects, J2ME Projects, AJAX Projects, Structs Projects, EJB Projects, Real Time Projects, Live Projects, Student Projects, Engineering Projects, MCA Projects, MBA Projects, College Projects, BE Projects, BTech Projects, ME Projects, MTech Projects, M.Sc Projects, IEEE 2009 Java Projects, IEEE 2009 ASP.NET Projects, IEEE 2009 VB.NET Projects, IEEE 2009 C# Projects, IEEE 2009 Visual C++ Projects, IEEE 2009 Matlab Projects, IEEE 2009 NS2 Projects, IEEE 2009 C Projects, IEEE 2009 Microcontroller Projects, IEEE 2009 ATMEL Projects, IEEE 2009 PIC Projects, IEEE 2009 ARM Projects, IEEE 2009 DSP Projects, IEEE 2009 VLSI Projects, IEEE 2009 FPGA Projects, IEEE 2009 CPLD Projects, IEEE 2009 Power Electronics Projects, IEEE 2009 Electrical Projects, IEEE 2009 Robotics Projects, IEEE 2009 Solor Projects, IEEE 2009 MEMS Projects, IEEE 2009 J2EE P
Cloud Camp Milan 2K9 Telecom Italia: Where P2P?Gabriele Bozzi
1. The document discusses the potential for peer-to-peer (P2P) computing as an alternative or complement to the traditional client-server model, especially in the context of cloud computing.
2. It notes challenges with P2P such as lack of centralized control and potential for freeloading, but also advantages like harnessing unused resources.
3. Emerging technologies like autonomic and cognitive networking aim to address P2P challenges by enabling self-configuration and optimization of distributed resources.
1. The document discusses the potential for peer-to-peer (P2P) computing as an alternative or complement to the traditional client-server model, especially in the context of cloud computing.
2. P2P systems offer access to distributed resources but lack centralized control, which makes it difficult to ensure reliability, performance, and security.
3. Autonomic and cognitive approaches may help address issues with P2P by enabling self-configuration, healing, optimization and protection of distributed resources.
4. Future networking approaches like DirecNet envision high-speed mobile mesh networks that could further enable wide-scale distributed computing architectures.
This paper presents a technique for end hosts to detect if intermediaries like routers are applying compression to traffic flows without the end hosts' knowledge. The technique is non-intrusive and only uses packet inter-arrival times for detection, requiring no changes to or cooperation from intermediaries. Simulations and internet experiments show the approach can accurately detect compression applied by intermediaries. The technique could help end hosts optimize their own use of compression resources by avoiding redundant compression when intermediaries are already compressing traffic.
The document outlines a final project to design and implement a basic node-based networking system without internet service providers. It proposes using GPS coordinates to assign addresses and simulating data transfer between nodes to test performance. The key steps are:
1) Assign each node a unique address based on its GPS coordinates.
2) Specify node properties like transfer rate and range.
3) Simulate data transfer between random node pairs to determine latency and throughput.
4) Analyze the results to find worst-case performance and compare to typical internet speeds.
The goal is to evaluate if a decentralized mesh network could feasibly replace traditional internet infrastructure. The document describes addressing schemes, node specifications, and the process
In the last decade Peer to Peer technology has been thoroughly explored, becauseit overcomes many limitations compared to the traditional client server paradigm. Despite its advantages over a traditional approach, the ubiquitous availability of high speed, high bandwidth and low latency networks has supported the traditional client-server paradigm. Recently, however, the surge of streaming services has spawned renewed interest in Peer to Peer technologies. In addition, services like geolocation databases and browser technologies like Web-RTC make a hybrid approach attractive.
In this paper we present algorithms for the construction and the maintenance of a hybrid P2P overlay multicast tree based on topological distances. The essential idea of these algorithms is to build a multicast tree by choosing neighbours close to each other. The topological distances can be easily obtained by the browser using the geolocation API. Thus the implementation of algorithms can be done web-based in a distributed manner.
We present proofs of our algorithms as well as experimental results and evaluations.
Node selection in p2 p content sharing service in mobile cellular networks wi...Uvaraj Shan
This document discusses node selection algorithms for peer-to-peer content sharing over mobile cellular networks that consider downlink bandwidth limitations. It proposes two novel algorithms (DBaT-B and DBaT-N) that select peer nodes to maximize load balancing across cells while meeting the requesting peer's bandwidth needs. DBaT-B selects peers to satisfy the requesting peer's minimum bandwidth requirement, while DBaT-N selects a certain number of peers as requested. Both algorithms first choose peers in the least busy cell to improve load balancing.
Node selection in p2 p content sharing service in mobile cellular networks wi...Uvaraj Shan
The document discusses node selection algorithms for peer-to-peer content sharing over mobile cellular networks that consider downlink bandwidth limitations. It proposes two algorithms: DBaT-B selects peers to meet a minimum requested bandwidth sum, prioritizing load balancing across cells. DBaT-N selects a requested number of peers where the bandwidth sum exceeds the downlink limit, again balancing cell loads. Both aim to satisfy bandwidth demands while distributing traffic evenly across the network. The paper then evaluates the algorithms' performance through simulation.
Node selection in p2 p content sharing service in mobile cellular networks wi...Uvaraj Shan
This document discusses node selection algorithms for peer-to-peer content sharing over mobile cellular networks that consider downlink bandwidth limitations. It proposes two novel algorithms (DBaT-B and DBaT-N) that select peer nodes to maximize load balancing across cells while meeting the requesting peer's bandwidth needs. DBaT-B selects peers to satisfy the requesting peer's minimum bandwidth requirement, while DBaT-N selects a certain number of peers as requested. Both algorithms first choose peers in the least busy cell to improve load balancing.
International Journal of Computational Engineering Research(IJCER)ijceronline
International Journal of Computational Engineering Research(IJCER) is an intentional online Journal in English monthly publishing journal. This Journal publish original research work that contributes significantly to further the scientific knowledge in engineering and Technology
This document discusses big data mining and the Internet of Things. It first presents challenges with big data mining including modeling big data characteristics, identifying key challenges, and issues with statistical analysis of IoT data. It then describes an architecture called IOT-StatisticDB that provides a generalized schema for storing sensor data from IoT devices and a distributed system for parallel computing and statistical analysis of IoT big data. The system includes query operators for data retrieval and statistical analysis of IoT data in areas like transportation networks.
Similar to EXPLORING PEER-TO-PEER DATA MINING (20)
ANALYSIS OF LAND SURFACE DEFORMATION GRADIENT BY DINSAR cscpconf
The progressive development of Synthetic Aperture Radar (SAR) systems diversify the exploitation of the generated images by these systems in different applications of geoscience. Detection and monitoring surface deformations, procreated by various phenomena had benefited from this evolution and had been realized by interferometry (InSAR) and differential interferometry (DInSAR) techniques. Nevertheless, spatial and temporal decorrelations of the interferometric couples used, limit strongly the precision of analysis results by these techniques. In this context, we propose, in this work, a methodological approach of surface deformation detection and analysis by differential interferograms to show the limits of this technique according to noise quality and level. The detectability model is generated from the deformation signatures, by simulating a linear fault merged to the images couples of ERS1 / ERS2 sensors acquired in a region of the Algerian south.
4D AUTOMATIC LIP-READING FOR SPEAKER'S FACE IDENTIFCATIONcscpconf
A novel based a trajectory-guided, concatenating approach for synthesizing high-quality image real sample renders video is proposed . The lips reading automated is seeking for modeled the closest real image sample sequence preserve in the library under the data video to the HMM predicted trajectory. The object trajectory is modeled obtained by projecting the face patterns into an KDA feature space is estimated. The approach for speaker's face identification by using synthesise the identity surface of a subject face from a small sample of patterns which sparsely each the view sphere. An KDA algorithm use to the Lip-reading image is discrimination, after that work consisted of in the low dimensional for the fundamental lip features vector is reduced by using the 2D-DCT.The mouth of the set area dimensionality is ordered by a normally reduction base on the PCA to obtain the Eigen lips approach, their proposed approach by[33]. The subjective performance results of the cost function under the automatic lips reading modeled , which wasn’t illustrate the superior performance of the
method.
MOVING FROM WATERFALL TO AGILE PROCESS IN SOFTWARE ENGINEERING CAPSTONE PROJE...cscpconf
Universities offer software engineering capstone course to simulate a real world-working environment in which students can work in a team for a fixed period to deliver a quality product. The objective of the paper is to report on our experience in moving from Waterfall process to Agile process in conducting the software engineering capstone project. We present the capstone course designs for both Waterfall driven and Agile driven methodologies that highlight the structure, deliverables and assessment plans.To evaluate the improvement, we conducted a survey for two different sections taught by two different instructors to evaluate students’ experience in moving from traditional Waterfall model to Agile like process. Twentyeight students filled the survey. The survey consisted of eight multiple-choice questions and an open-ended question to collect feedback from students. The survey results show that students were able to attain hands one experience, which simulate a real world-working environment. The results also show that the Agile approach helped students to have overall better design and avoid mistakes they have made in the initial design completed in of the first phase of the capstone project. In addition, they were able to decide on their team capabilities, training needs and thus learn the required technologies earlier which is reflected on the final product quality
PROMOTING STUDENT ENGAGEMENT USING SOCIAL MEDIA TECHNOLOGIEScscpconf
This document discusses using social media technologies to promote student engagement in a software project management course. It describes the course and objectives of enhancing communication. It discusses using Facebook for 4 years, then switching to WhatsApp based on student feedback, and finally introducing Slack to enable personalized team communication. Surveys found students engaged and satisfied with all three tools, though less familiar with Slack. The conclusion is that social media promotes engagement but familiarity with the tool also impacts satisfaction.
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGICcscpconf
In real world computing environment with using a computer to answer questions has been a human dream since the beginning of the digital era, Question-answering systems are referred to as intelligent systems, that can be used to provide responses for the questions being asked by the user based on certain facts or rules stored in the knowledge base it can generate answers of questions asked in natural , and the first main idea of fuzzy logic was to working on the problem of computer understanding of natural language, so this survey paper provides an overview on what Question-Answering is and its system architecture and the possible relationship and
different with fuzzy logic, as well as the previous related research with respect to approaches that were followed. At the end, the survey provides an analytical discussion of the proposed QA models, along or combined with fuzzy logic and their main contributions and limitations.
DYNAMIC PHONE WARPING – A METHOD TO MEASURE THE DISTANCE BETWEEN PRONUNCIATIONS cscpconf
Human beings generate different speech waveforms while speaking the same word at different times. Also, different human beings have different accents and generate significantly varying speech waveforms for the same word. There is a need to measure the distances between various words which facilitate preparation of pronunciation dictionaries. A new algorithm called Dynamic Phone Warping (DPW) is presented in this paper. It uses dynamic programming technique for global alignment and shortest distance measurements. The DPW algorithm can be used to enhance the pronunciation dictionaries of the well-known languages like English or to build pronunciation dictionaries to the less known sparse languages. The precision measurement experiments show 88.9% accuracy.
INTELLIGENT ELECTRONIC ASSESSMENT FOR SUBJECTIVE EXAMS cscpconf
In education, the use of electronic (E) examination systems is not a novel idea, as Eexamination systems have been used to conduct objective assessments for the last few years. This research deals with randomly designed E-examinations and proposes an E-assessment system that can be used for subjective questions. This system assesses answers to subjective questions by finding a matching ratio for the keywords in instructor and student answers. The matching ratio is achieved based on semantic and document similarity. The assessment system is composed of four modules: preprocessing, keyword expansion, matching, and grading. A survey and case study were used in the research design to validate the proposed system. The examination assessment system will help instructors to save time, costs, and resources, while increasing efficiency and improving the productivity of exam setting and assessments.
TWO DISCRETE BINARY VERSIONS OF AFRICAN BUFFALO OPTIMIZATION METAHEURISTICcscpconf
African Buffalo Optimization (ABO) is one of the most recent swarms intelligence based metaheuristics. ABO algorithm is inspired by the buffalo’s behavior and lifestyle. Unfortunately, the standard ABO algorithm is proposed only for continuous optimization problems. In this paper, the authors propose two discrete binary ABO algorithms to deal with binary optimization problems. In the first version (called SBABO) they use the sigmoid function and probability model to generate binary solutions. In the second version (called LBABO) they use some logical operator to operate the binary solutions. Computational results on two knapsack problems (KP and MKP) instances show the effectiveness of the proposed algorithm and their ability to achieve good and promising solutions.
DETECTION OF ALGORITHMICALLY GENERATED MALICIOUS DOMAINcscpconf
In recent years, many malware writers have relied on Dynamic Domain Name Services (DDNS) to maintain their Command and Control (C&C) network infrastructure to ensure a persistence presence on a compromised host. Amongst the various DDNS techniques, Domain Generation Algorithm (DGA) is often perceived as the most difficult to detect using traditional methods. This paper presents an approach for detecting DGA using frequency analysis of the character distribution and the weighted scores of the domain names. The approach’s feasibility is demonstrated using a range of legitimate domains and a number of malicious algorithmicallygenerated domain names. Findings from this study show that domain names made up of English characters “a-z” achieving a weighted score of < 45 are often associated with DGA. When a weighted score of < 45 is applied to the Alexa one million list of domain names, only 15% of the domain names were treated as non-human generated.
GLOBAL MUSIC ASSET ASSURANCE DIGITAL CURRENCY: A DRM SOLUTION FOR STREAMING C...cscpconf
The document proposes a blockchain-based digital currency and streaming platform called GoMAA to address issues of piracy in the online music streaming industry. Key points:
- GoMAA would use a digital token on the iMediaStreams blockchain to enable secure dissemination and tracking of streamed content. Content owners could control access and track consumption of released content.
- Original media files would be converted to a Secure Portable Streaming (SPS) format, embedding watermarks and smart contract data to indicate ownership and enable validation on the blockchain.
- A browser plugin would provide wallets for fans to collect GoMAA tokens as rewards for consuming content, incentivizing participation and addressing royalty discrepancies by recording
IMPORTANCE OF VERB SUFFIX MAPPING IN DISCOURSE TRANSLATION SYSTEMcscpconf
This document discusses the importance of verb suffix mapping in discourse translation from English to Telugu. It explains that after anaphora resolution, the verbs must be changed to agree with the gender, number, and person features of the subject or anaphoric pronoun. Verbs in Telugu inflect based on these features, while verbs in English only inflect based on number and person. Several examples are provided that demonstrate how the Telugu verb changes based on whether the subject or pronoun is masculine, feminine, neuter, singular or plural. Proper verb suffix mapping is essential for generating natural and coherent translations while preserving the context and meaning of the original discourse.
EXACT SOLUTIONS OF A FAMILY OF HIGHER-DIMENSIONAL SPACE-TIME FRACTIONAL KDV-T...cscpconf
In this paper, based on the definition of conformable fractional derivative, the functional
variable method (FVM) is proposed to seek the exact traveling wave solutions of two higherdimensional
space-time fractional KdV-type equations in mathematical physics, namely the
(3+1)-dimensional space–time fractional Zakharov-Kuznetsov (ZK) equation and the (2+1)-
dimensional space–time fractional Generalized Zakharov-Kuznetsov-Benjamin-Bona-Mahony
(GZK-BBM) equation. Some new solutions are procured and depicted. These solutions, which
contain kink-shaped, singular kink, bell-shaped soliton, singular soliton and periodic wave
solutions, have many potential applications in mathematical physics and engineering. The
simplicity and reliability of the proposed method is verified.
AUTOMATED PENETRATION TESTING: AN OVERVIEWcscpconf
The document discusses automated penetration testing and provides an overview. It compares manual and automated penetration testing, noting that automated testing allows for faster, more standardized and repeatable tests but has limitations in developing new exploits. It also reviews some current automated penetration testing methodologies and tools, including those using HTTP/TCP/IP attacks, linking common scanning tools, a Python-based tool targeting databases, and one using POMDPs for multi-step penetration test planning under uncertainty. The document concludes that automated testing is more efficient than manual for known vulnerabilities but cannot replace manual testing for discovering new exploits.
CLASSIFICATION OF ALZHEIMER USING fMRI DATA AND BRAIN NETWORKcscpconf
Since the mid of 1990s, functional connectivity study using fMRI (fcMRI) has drawn increasing
attention of neuroscientists and computer scientists, since it opens a new window to explore
functional network of human brain with relatively high resolution. BOLD technique provides
almost accurate state of brain. Past researches prove that neuro diseases damage the brain
network interaction, protein- protein interaction and gene-gene interaction. A number of
neurological research paper also analyse the relationship among damaged part. By
computational method especially machine learning technique we can show such classifications.
In this paper we used OASIS fMRI dataset affected with Alzheimer’s disease and normal
patient’s dataset. After proper processing the fMRI data we use the processed data to form
classifier models using SVM (Support Vector Machine), KNN (K- nearest neighbour) & Naïve
Bayes. We also compare the accuracy of our proposed method with existing methods. In future,
we will other combinations of methods for better accuracy.
VALIDATION METHOD OF FUZZY ASSOCIATION RULES BASED ON FUZZY FORMAL CONCEPT AN...cscpconf
The document proposes a new validation method for fuzzy association rules based on three steps: (1) applying the EFAR-PN algorithm to extract a generic base of non-redundant fuzzy association rules using fuzzy formal concept analysis, (2) categorizing the extracted rules into groups, and (3) evaluating the relevance of the rules using structural equation modeling, specifically partial least squares. The method aims to address issues with existing fuzzy association rule extraction algorithms such as large numbers of extracted rules, redundancy, and difficulties with manual validation.
PROBABILITY BASED CLUSTER EXPANSION OVERSAMPLING TECHNIQUE FOR IMBALANCED DATAcscpconf
In many applications of data mining, class imbalance is noticed when examples in one class are
overrepresented. Traditional classifiers result in poor accuracy of the minority class due to the
class imbalance. Further, the presence of within class imbalance where classes are composed of
multiple sub-concepts with different number of examples also affect the performance of
classifier. In this paper, we propose an oversampling technique that handles between class and
within class imbalance simultaneously and also takes into consideration the generalization
ability in data space. The proposed method is based on two steps- performing Model Based
Clustering with respect to classes to identify the sub-concepts; and then computing the
separating hyperplane based on equal posterior probability between the classes. The proposed
method is tested on 10 publicly available data sets and the result shows that the proposed
method is statistically superior to other existing oversampling methods.
CHARACTER AND IMAGE RECOGNITION FOR DATA CATALOGING IN ECOLOGICAL RESEARCHcscpconf
Data collection is an essential, but manpower intensive procedure in ecological research. An
algorithm was developed by the author which incorporated two important computer vision
techniques to automate data cataloging for butterfly measurements. Optical Character
Recognition is used for character recognition and Contour Detection is used for imageprocessing.
Proper pre-processing is first done on the images to improve accuracy. Although
there are limitations to Tesseract’s detection of certain fonts, overall, it can successfully identify
words of basic fonts. Contour detection is an advanced technique that can be utilized to
measure an image. Shapes and mathematical calculations are crucial in determining the precise
location of the points on which to draw the body and forewing lines of the butterfly. Overall,
92% accuracy were achieved by the program for the set of butterflies measured.
SOCIAL MEDIA ANALYTICS FOR SENTIMENT ANALYSIS AND EVENT DETECTION IN SMART CI...cscpconf
Smart cities utilize Internet of Things (IoT) devices and sensors to enhance the quality of the city
services including energy, transportation, health, and much more. They generate massive
volumes of structured and unstructured data on a daily basis. Also, social networks, such as
Twitter, Facebook, and Google+, are becoming a new source of real-time information in smart
cities. Social network users are acting as social sensors. These datasets so large and complex
are difficult to manage with conventional data management tools and methods. To become
valuable, this massive amount of data, known as 'big data,' needs to be processed and
comprehended to hold the promise of supporting a broad range of urban and smart cities
functions, including among others transportation, water, and energy consumption, pollution
surveillance, and smart city governance. In this work, we investigate how social media analytics
help to analyze smart city data collected from various social media sources, such as Twitter and
Facebook, to detect various events taking place in a smart city and identify the importance of
events and concerns of citizens regarding some events. A case scenario analyses the opinions of
users concerning the traffic in three largest cities in the UAE
SOCIAL NETWORK HATE SPEECH DETECTION FOR AMHARIC LANGUAGEcscpconf
The anonymity of social networks makes it attractive for hate speech to mask their criminal
activities online posing a challenge to the world and in particular Ethiopia. With this everincreasing
volume of social media data, hate speech identification becomes a challenge in
aggravating conflict between citizens of nations. The high rate of production, has become
difficult to collect, store and analyze such big data using traditional detection methods. This
paper proposed the application of apache spark in hate speech detection to reduce the
challenges. Authors developed an apache spark based model to classify Amharic Facebook
posts and comments into hate and not hate. Authors employed Random forest and Naïve Bayes
for learning and Word2Vec and TF-IDF for feature selection. Tested by 10-fold crossvalidation,
the model based on word2vec embedding performed best with 79.83%accuracy. The
proposed method achieve a promising result with unique feature of spark for big data.
GENERAL REGRESSION NEURAL NETWORK BASED POS TAGGING FOR NEPALI TEXTcscpconf
This article presents Part of Speech tagging for Nepali text using General Regression Neural
Network (GRNN). The corpus is divided into two parts viz. training and testing. The network is
trained and validated on both training and testing data. It is observed that 96.13% words are
correctly being tagged on training set whereas 74.38% words are tagged correctly on testing
data set using GRNN. The result is compared with the traditional Viterbi algorithm based on
Hidden Markov Model. Viterbi algorithm yields 97.2% and 40% classification accuracies on
training and testing data sets respectively. GRNN based POS Tagger is more consistent than the
traditional Viterbi decoding technique.
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceIndexBug
Imagine a world where machines not only perform tasks but also learn, adapt, and make decisions. This is the promise of Artificial Intelligence (AI), a technology that's not just enhancing our lives but revolutionizing entire industries.
UiPath Test Automation using UiPath Test Suite series, part 6DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
UiPath Test Automation with generative AI and Open AI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, a test automation solution, with Open AI advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Speck&Tech
ABSTRACT: A prima vista, un mattoncino Lego e la backdoor XZ potrebbero avere in comune il fatto di essere entrambi blocchi di costruzione, o dipendenze di progetti creativi e software. La realtà è che un mattoncino Lego e il caso della backdoor XZ hanno molto di più di tutto ciò in comune.
Partecipate alla presentazione per immergervi in una storia di interoperabilità, standard e formati aperti, per poi discutere del ruolo importante che i contributori hanno in una comunità open source sostenibile.
BIO: Sostenitrice del software libero e dei formati standard e aperti. È stata un membro attivo dei progetti Fedora e openSUSE e ha co-fondato l'Associazione LibreItalia dove è stata coinvolta in diversi eventi, migrazioni e formazione relativi a LibreOffice. In precedenza ha lavorato a migrazioni e corsi di formazione su LibreOffice per diverse amministrazioni pubbliche e privati. Da gennaio 2020 lavora in SUSE come Software Release Engineer per Uyuni e SUSE Manager e quando non segue la sua passione per i computer e per Geeko coltiva la sua curiosità per l'astronomia (da cui deriva il suo nickname deneb_alpha).
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
van Emden shows how Nx can simplify the developer’s life and facilitate a rapid transition from concept to production-ready applications.He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
Removing Uninteresting Bytes in Software FuzzingAftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutil's readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format). Our preliminary results show that AFL+DIAR does not only discover new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices desire to take full advantage of the features
available on those devices, but many of the features provide convenience and capability but sacrifice security. This best practices guide outlines steps the users can take to better protect personal devices and information.
Programming Foundation Models with DSPy - Meetup SlidesZilliz
Prompting language models is hard, while programming language models is easy. In this talk, I will discuss the state-of-the-art framework DSPy for programming foundation models with its powerful optimizers and runtime constraint system.
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!SOFTTECHHUB
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
HCL Notes and Domino License Cost Reduction in the World of DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-and-domino-license-cost-reduction-in-the-world-of-dlau/
The introduction of DLAU and the CCB & CCX licensing model caused quite a stir in the HCL community. As a Notes and Domino customer, you may have faced challenges with unexpected user counts and license costs. You probably have questions on how this new licensing approach works and how to benefit from it. Most importantly, you likely have budget constraints and want to save money where possible. Don’t worry, we can help with all of this!
We’ll show you how to fix common misconfigurations that cause higher-than-expected user counts, and how to identify accounts which you can deactivate to save money. There are also frequent patterns that can cause unnecessary cost, like using a person document instead of a mail-in for shared mailboxes. We’ll provide examples and solutions for those as well. And naturally we’ll explain the new licensing model.
Join HCL Ambassador Marc Thomas in this webinar with a special guest appearance from Franz Walder. It will give you the tools and know-how to stay on top of what is going on with Domino licensing. You will be able lower your cost through an optimized configuration and keep it low going forward.
These topics will be covered
- Reducing license cost by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how to best utilize it
- Tips for common problem areas, like team mailboxes, functional/test users, etc
- Practical examples and best practices to implement right away
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
2. 166 Computer Science & Information Technology (CS & IT)
Distributed Data Mining deals with data analysis in those environments in which data are
distributed as for peer-to-peer networks and offers an alternative way to address this problem.
Researchers have developed several approaches for computing primitive operations (sum,
average, max) on P2P networks. In this report we are going to introduce two different
approaches: the first one is based on the concept of local algorithms[1], algorithms computing
their results just with communications between immediate neighbors; the second one is based on
the Newscast model of computation [4][3], a probabilistic epidemic protocol for information and
membership dissemination.
In the next sections we will first give a brief overview on P2P data mining, its motivations and
goals; then (section 3) we will introduce the concept of local algorithms [1] giving some
examples; in section 4 we will introduce the Newscast model and give an idea of how it works
along with some practical primitive implementations; finally in the last section we will draw
some conclusions.
2. GOAL IN PEER-TO-PEER DATA MINING
One of the main goal of P2P data mining is to achieve as closer as possible the same results that
can be obtained with a centralized approach without moving any data from the original location.
That's why such algorithms must be highly scalable, tolerant to crashes and to peer “churn"2
and
mainly they must be able to calculate the results in-network instead of loading all the data in an
unique system and then apply to itself the traditional data mining techniques.
As just said, there are several properties that are required by a peer-to-peer data mining
algorithm:
scalability has been already mentioned and it is the foremost requirement; algorithms for peer-to-
peer networks must be independent of the size of the network or at most they must be dependent
of the log of the size;
anytimeness means that, since in some application the rate of the data change may be higher than
the rate of computation, the algorithm must be able to provide a good and ad hoc solution at any
time;
asynchronism is also crucial requirements for P2P algorithms: P2P networks are often huge, that
means that any attempt to synchronize between the entire network is vane due to network latency
or limited bandwidth;
decentralization means that the computations must be done in network, hence no centralized
coordination must be used;
fault-tolerance is an other issue we have already mentioned; in a large P2P network can often
happen that nodes crash or that they want to leave or join the network. That's why P2P algorithms
________________
2
By "Churn" we mean the situation in which some nodes leave the network and are suddenly replaced with
brand new ones.
3. Computer Science & Information Technology (CS & IT) 167
should be able to recover from these situations.
In the next sections we will present some primitive designed for Peer-to-Peer Data Mining
purposes which show to comply with the above mentioned needs.
3. LOCAL ALGORITHMS
Approaches to P2P data mining have focused on developing data mining primitive operations
over the network as well as more complicated data analysis algorithms.
Datta in [1] proposes some algorithms for calculating such primitives as sum and average, basing
on the definition of local algorithms.
Given a constant k, for any network dimension, we can say that an algorithm is a local algorithm
if there is a part of the input for which the algorithm terminates with communications expended
per peer no greater than k and on the rest of the inputs, the communication expended per peer is
of the order of the network size. Them can be divided into two categories: exact local algorithm
and approximate local algorithms: the former always terminate granting the same results that can
be obtained with centralized methods; the latter instead, they can not give such level of accuracy.
Of course exact local algorithms can give better results, but it is not possible to develop them for
every kind of situation.
In the next subsections will be given examples of both exact an approximate local algorithms.
3.1 Exact local algorithms (The Majority voting problem)
The Majority voting problem is a typical example of exact local algorithm which represents a
situation in which each peer (Pi) of a P2P network has a number bi which may be 0 or 1 and a
threshold ( each node has the same ). Peers want to collectively determine if
the sum of all bi is greater than , where is the number of peers in the network.
Addressing this problem can serve as a primitive for several kind of data mining algorithms and
can be used as a primitive for more complicated exact local algorithms.
Pi is a generic node in the network, N eii is one of its neighbors, Ci represents an estimate of the
number of the nodes in the network and Si is an estimate of the global sum. All the peers can
only communicate with its neighbors and it is through this way that the just mentioned estimates
are updated. We also define the threshold belief of Pi when such peer believes or not the majority
threshold is met
Hence this threshold belief depends on the exchange of informations between neighbors. The
crucial point of this approach concerns in deciding whether Pi needs to send a message to its
neighbor Pj from which it has just received information on C and S. Pi will not send such a
message if and only if it can be certain that such information cannot modify the threshold belief
_________________
3
Actually in the original paper [1] was just indicated
4. 168 Computer Science & Information Technology (CS & IT)
of Pj . On the other end, if it cannot be certain of this, a message must be sent. This decision is
taken on the basis of the estimate Pi makes on the values of Cj and Sj together with its values (Ci
and Si). When a node Pi decides to send a message, then it sends all of its informations about S
and C, excluding those sent from Pj .
This approach is considered to be robust to data and network change: when a peer Pi changes its
data, then Pi recomputes Ci and Si and applies those conditions to all of its neighbors; if a peer Pj
leaves the network, then Pi recomputes Si and Ci without taking into account the informations
from Pj .
Discussion
Exact local algorithms can be very useful in solving data mining problems in P2P networks but
unfortunately they are very limited. The scope of such algorithms is restricted to functions that
can have a local representation in the given network and they are limited to those problems which
can be reduced to threshold predicates (as for the majority voting problem). An example of
application is given in [1], where an exact local algorithm (based on the majority voting problem)
is used for monitoring a K-means clustering of data distributed over a peer-to-peer network. In
this application K-means clustering is performed in the traditional way (on a centralized system).
After this, results are sent to the peers of the network and the local algorithm for monitoring the
K-means clustering is executed: this algorithm just raises an alert if the centroids needs to be
updated.
It is quite impossible to develop an exact local algorithm to compute the average of a set of data
distribute over the network, that is why it is impossible to solve some data mining problems with
exact local algorithms (as for example P2P K-means clustering). For addressing this, two
different mechanism are proposed: the first one (section 3.2) describes an approximate local
algorithm to perform the K-means clustering over a P2P network [2]; the second one (section 4)
describes the Newscast model of computation for calculating means over data distributed on P2P
overlay networks.
3.2 Approximate local algorithms (P2P K-means clustering)
The P2P K-means clustering algorithm [2] is an iterative algorithm requiring only local
communications and synchronization at each iteration: nodes exchange messages and
synchronize only with their neighbors. The goal is for each node to converge on a set of centroids
that are as close as possible to the centroids that would have been produced if the data from all
nodes was first centralized, then K-means was run.
The algorithm is initiated with a set of starting centroid selected at random. P1, P2,...., Pn denote
the nodes in the network and Xi
denotes the data set held by each node; the global data set is
denoted as a X which equals to the list of neighbors of a generic node i is denoted by
Each node stores: a set indicating the centroids (local centroids) held by
the node i at the beginning of cycle l; a termination threshold and a cluster count
which is the number of tuple in for which is closer than any other
5. Computer Science & Information Technology (CS & IT) 169
Each iteration of the algorithm is divided in two steps: the first one is similar to the centralized K-
means in which peer Pi assigns each of its points to the nearest centroid; in the second one peer Pi
sends a poll message to its immediate neighbors and waits for a respond. This request consists of
a pair (id, current iteration number) which is done in order to make the neighbors to respond
with their local centroids and cluster count for iteration l. Once they have all responded, Pi
updates its jth
centroid at the beginning of iteration l + 1. The update is a weighted average4
of the
local centroids and counts received from all immediate neighbors (for their iteration l). Then it
moves to the next iteration of the K-means algorithm and repeats the whole process. If the
maximum change in position of the new centroid after an iteration remains above the defined
threshold then Pi goes on iteration l + 1.
The key point is how do the peers respond to those requests. Suppose peer Pi receives a poll
message from node k at its iteration Pi sends its local centroid and cluster count
to peer Pk; if that means Pi does not have local centroids for iteration and in such case,
the poll message is put in the poll table of checks if it contains local centroids and
cluster counts for Pk , if so, they are sent to Pk, else the poll message is put in the poll table.
Finally Pi will check its poll table at each iteration and will respond to any message it can.
At the end of each iteration l, if no important changes are detected on the cluster centroids (the
maximum change in position is below the user defined threshold ). each node could enter a
termination state. In that case, such peer no longer updates its centroids and sends poll messages.
However, it does responds to polling messages from its neighbors. Once all peers enter into the
terminated state, the algorithm is terminated.
Experiments results and discussion
The P2P K-means clustering algorithm [1] presented in the previous section is a very important
example which gives the idea of how is important to implement good primitives for data mining
in distributed environments. The algorithm is in fact based on a primitive derived from the
majority voting problem which is a primitive developed for peer-to-peer systems.
Datta in its works [1][2] performed several experiments with its P2P K-means clustering
algorithm which seems to achieve good results. Experiments were conducted in both static and
dynamic environments with a network of 1000 nodes: in both cases was calculated the accuracy
with respect to the classic centralized K-means and the communication cost. In the static
environment high accuracy is found (less than 3% of points per node misclassified on average)
while no significant impact on this has had the method of assigning data points to node
(uniformly or non-uniformly). However this has had a significant impact on communication
costs: the number of bytes received per node increases slowly with network size for uniform
assignment; the cost increases more sharply for non-uniform assignment.
Experiments were also conducted on a dynamic environment (with nodes leaving and joining the
network) and even here good accuracy was found (less than 3:5% misclassified on average)
which remains stable when the network evolves. Even increasing the network size did not seems
to change the accuracy significantly: this proves that the algorithm is highly scalable.
______________________
4
Such weighted average is calculated by primitive implemented with local algorithms.
6. 170 Computer Science & Information Technology (CS & IT)
4. DATA MINING THROUGH THE NEWCAST MODEL OF
COMPUTATION
As already said, researchers working on distributed data mining have been focusing on
investigating techniques for calculating primitives as average, maximum etc.. in distributed
environment. Even Kowalczyk and Jelasity in [4] focused on this, but their approach is slightly
different from the Datta's one [2].
First of all they adopted two important constraints: the first is that all the nodes (peers) store as
few as one single data instances (in [1] each peer held several data instances); the second is that
there is practically no limit on the number of nodes (even in [1] there was no potential limit on
this, but experiments were always conducted on small networks of the order of thousand nodes).
Furthermore, as in [1] nodes can leave and join the network as in a dynamic environment. For
this, in this approach are needed resources that scale directly with the size of the network, which
is a feature distinguishing it from local algorithms.
In the next subsections we are going to first introduce the Newscast model of computation, then
we will propose some primitive for distributed Data Mining within this model and finally we will
draw some conclusions.
4.1 The Newscast model (an overview)
Newscast [3] is a gossip-based topology manager protocol. Its aim is to continoulsy rewire the
(logical) connections between hosts. The rewiring process is designed in such a way that the
resulting overlay is very close to a random graph. The generated topology is thus very stable and
provides robust connectivity.
As in any large P2P system, a node only knows about a small fixed set of other nodes (due to
scalability issues), called neighbors. In Newscast, the neighborhood is represented by a partial,
fixed c size view of node descriptors composed by a node address and a logical time-stamp (e.g.,
the descriptor creation time).
The protocol behavior performs the following actions: selects first a neighbour from the local
view, exchanges the view with the neighbor, then both participants update their actual view
according to the received view. The data actually sent over the network by any Newscast node is
represented by the node's own descriptor plus its local view.
In Newscast, the neighbor selection process is performed in a random fashion by the
SELECTPEER() method. The UPDATE() method is the Newscast core behavior. It merges
a received view (sent by a node using SENDSTATE()) with the current peer view
in a temporary view list. Finally, Newscast trims this list to obtain the new c size view. The node
descriptors discarded are chosen from the most old" ones, according to the descriptor time-
stamp. This approach changes continuously the node descriptors hold in each node view; this
implies a continuous rewiring of the graph defined by the set of all node views.
The protocol always tends to inject new informations in the system and allows an automatic
elimination of old node descriptors using the aging approach. This feature is particularly
desirable to remove crashed node descriptors and thus to repair the overlay with minor efforts.
7. Computer Science & Information Technology (CS & IT) 171
Newscast is also cheap in terms of network communication. The traffic generated by the protocol
involves the exchange of a few hundred bytes per cycle for each peer.
Figure 1: Conceptual model of the collective of agents and the news agency (from [4]).
Trying to talk about this model on an higher level, we can say that it is based on two main
concepts: the collective of agents and the news agency (see Figure 1). The agents communicate
through the news agency. Although the news agency plays the role of a server orchestrating the
communication schedule, it is a virtual entity implemented in a fully distributed P2P solution.
The communication schedule is organized into cycles and in each of them the news agency
collect exactly one news item from all the agents. At the same time it prepares for each agent a
randomly chosen set of news item from the previous cycle and delivers these set to the agents. In
the next subsection we present an example of averaging primitives implemented within the
model.
4.2 Basic Statistics
The ability of calculating the mean is central for implementing some basic data mining
algorithms in Newscast. As we said the Newscast communication schedule is organized into
cycles so what we want is an algorithm able to calculate in few cycles the average of the values
held by the nodes of the Newscast Network. In this section we present three averaging algorithms
developed on Newscast.
4.2.1 Basic Averaging (BA)
The Basic Average algorithm [4] is the simplest way to achieve this. During the first cycle each
agent publishes its value so that the news agency get a copy of all the values that must be
averaged. Next all agents whenever they receive news, they make the average of these values and
then publish it. An important observation must be done: in every cycle the news agency receives
a set of values that on average has the same mean of the original set, but the variance will be
getting smaller and smaller with the number of cycles. This is the most important result of this
algorithm. From the experiments performed, it can be shown that after k iterations of the
"averaging operations", the variance drops to of its initial value.
8. 172 Computer Science & Information Technology (CS & IT)
As already said this is probably the simplest averaging algorithm that we can think on the
Newscast model, but its simplicity pays the lack of adaptation. In fact if we think to a network
where nodes leave and join, change their values and where nodes can temporary or permanently
crash, the BA is not able to fit with these dynamical needs. To address this, the Systematic
Averaging algorithm [4] is proposed (next section).
4.2.2 Systematic Averaging (SA)
The SA algorithm [4] achieves adaptation by constantly propagating agents current values and
temporal averages through the news agency. Therefore, any change in the incoming data will
quickly affect the final result.
A small positive integer d is fixed and it is used to control the depth of the propagation process.
The news items are vectors of d + 1 elements: the first element of a news item X is x0 and
contains the agent value (called 0-order estimate of the mean); x1 is the means of two 0-order
estimates and it is called 1-order estimation mean; xd, which is the last element, is the average of
two estimates of order d -1 and is called an estimate of order d. In this way consecutive elements
of X will be balanced. The result of this propagation is represented by xd.
Even the SA algorithm has the ability to drop the variance: it decreases in an exponential way.
Moreover the system can react to changes in the input data within d iterations.
4.2.3 Cumulative Averaging (CA)
The two algorithm we have just seen have the ability of reducing the variance very fast, but, due
to the randomness characterizing the Newscast engine, the output value could be different from
the true mean.
This problem is solved by the CA algorithm [4]. It runs two processes in parallel: in the first one
agents updates their estimate of the mean of the incoming data, while in the second one the mean
of these estimates is collectively calculated by a BA procedure.
4.3 Experiments configurations and results
The experiments we are going to describe [4] relates to the three averaging algorithms we have
just mentioned. These are based on tests with different network sizes (from 10000 to 50000) and
different data set. For each configuration were executed 100 independent runs. Were also used
three different kind of data sets: Gaussian, where the value of each agent was taken independently
from a Gaussian distribution; half-half, where half of the agents had value 0 and the other half 1;
and peak where all but one agents hold the value 0.
With respect to the convergence rate the BA algorithm was fastest (20-30 iteration), the SA was
slower (50 iterations) and the CA was the slowest (100 iterations). With respect to accuracy, the
situation was inverted: BA was worst, SA better and CA best. The deviation from the true mean
depends on the used distribution. Peak distribution has the intent of showing the "true power of
the averaging algorithm", in fact we can note from the results with such distribution that the mean
and variance with BA are respectively 0.935 and 0.656, while with SA they are 0.98 and 0.265.
9. Computer Science & Information Technology (CS & IT) 173
4.4 Applications
As already seen with the primitives calculated with local algorithms, even here the averaging
primitives implemented through the Newscast model can be used in several data mining tasks as
for example for classification techniques. In [4] is reported an example in which the above
mentioned averaging algorithms, with a little modification, are used for finding the Naive Bayes
classifier for data that are arbitrarily distributed among the nodes in a P2P Network.
Kowalczyk et al. have implemented the Naive Bayes algorithm using BA as the base averaging
method. As they had to maintain estimates of several means, they had to represent news items by
vectors of the same length as the number of the means and to run BA on all coordinates at the
same time. They have then tested the performances of that algorithm performing several
experiments and then comparing the results with a classical Naive Bayes centralized algorithm. In
most cases, although the model parameters were slightly different, no difference in the
classification rate were found.
Most statistics that are used by other classification algorithms are defined in terms of ratios (or
probabilities) that have the same form as described above. Consequently they can be
implemented within the newscast framework.
5. CONCLUDING DISCUSSION
In this paper we have described in brief two interesting approaches to Data Mining in Peer-to-
Peer Systems. The first one from Datta et al. [1] is based on the concept of local algorithms and
the second one from Kowalczyk et al. [4] uses the Newscast model of computation.
Both the authors are intent to supply techniques for calculating primitives for P2P networks,
primitives that form the basis for more complicated Data Mining algorithms. Both the approaches
result very interesting even though they differs in some aspects.
Calculating primitives through the Newscast model of computation, resulted a winning strategy.
The main task the authors wanted to deal with, was seeking a model for data spread over a
number of agents; this was addressed through Newscast which is based on an epidemic protocol
for disseminating information and group membership. The fact that the model lies on such robust
and highly scalable protocol, states the goodness of the model itself, which inherits all these good
features.
One important thing the two approaches have in common is that in both cases peers
communicates only with their immediate neighbors but the second one (Newscast) does this in an
epidemic-style manner so that results can be spread very quickly and all the agents can hear about
the final solution in a short amount of time.
An important difference between the Newscast model and local algorithms which derives from
what we have just said, is that in the Newscast model the terminations is reached once all agents
have heard about the final results: although there is no signal that informs the agents that the
result is found, using the theory of epidemic algorithms (which works as a broadcasting
mechanism), all agents will hear about the final solution very quickly. Local algorithms instead,
they terminates once a certain threshold is met: when an agent has reached an user defined
10. 174 Computer Science & Information Technology (CS & IT)
threshold (in the P2P K-means clustering it related the change in position of the centroids), it
enters the terminated state; once all agents have reached such state, the whole process stops.
An other main dfference between the two approaches is that the News cast model requires
resources that scale directly with the size of the network. So the resources required by the
algorithm are dependent from the size of the system. In spite of this, the model results very
scalable and robust. Local algorithms instead, computes their results using informations from a
handful of nearby neighbors which leads to a good level of scalability too. It has also been proved
that they are very good at adjusting to failure and changes in the input locally (see section 3.1).
Even with the Newscast model it is possible to achieve these properties: we have seen the
Systematic Average algorithm which is able to adjust to changes in the value of the agents on-the-
fly and we have also seen that the tendency of the protocol to insert new informations in the
system, allowing an automatic elimination of old node descriptors, is particularly desirable to
remove crashed node descriptors and thus to repair the overlay with minor efforts.
In light of this, we can't say if one of the two approaches is better than the other one. We can
certainly state (basing on the provided results) that both are able to fit with the peer-to-peer
networks requirements: they are highly scalable, robust to node crashes and data set changes,
decentralized and asynchronous. Once applied to real P2P Data Mining algorithms as K-means
clustering and Naive Bayes classification, they have also shown a good level of accuracy and
convergence to the results obtained with traditional centralized techniques.
In spite of this, data analysis in P2P systems still offers lot of challenges for the researchers. The
experiments on the primitive and the distributed data mining algorithms we have described in this
report, have shown good results but they come from lots of simulations done on P2P networks
testbeds, hence we have “no mathematical proofs" of their absolute validity. As future work, it
would be interesting and challenging at the same time to test this approaches on a platform like
PlanetLab (a reliable testbed for overlay networks)5
, or even better on a real Peer-to-Peer Overlay
Network.
REFERENCES
[1] Datta, S., Bhaduri, K., Giannella, C., Wol, R. (2005) Distributed Data Mining in Peer-to-Peer
Networks. Invited submission to the IEEE Internet Computing special issue on Distributed Data
Mining.
[2] Datta, S., Giannella, C., Kargupta, H. (2006) K-Means Clustering Over a Large, Dynamic Network.
Accepted paper in SIAM2006 Data Mining Conference.
[3] Jelasity, M., van Steen, M. (2002) Large-Scale Newscast Computing on the Internet. Internal Report
IR-503, Vrije Universiteit Amsterdam, Department of Computer Science, Amsterdam, The
Netherlands.
[4] Kowalczyk, W., Jelasity, M., Eiben, A. (2003) Towards Data Mining in Large and Fully Distributed
Peer-to-Peer Overlay Networks. Technical Report IR-AI-003, Vrije Univeriteit Amsterdam,
Department of Computer Science, Amsterdam, The Netherland.
_____________
5
http://www.planet-lab.org, Last visited March 2016