56. Communication Networks II: Peer-to-Peer Networking. Note: many images were taken and adapted from contributions to the book “P2P Systems and Applications”, Ed. Steinmetz, Wehrle. [Figure: hash example H(“my data”) = 3107 with node IDs 611, 709, 1008, 1622, 2011, 2207, 2906, 3485 and node addresses 7.31.10.25, peer-to-peer.info, 12.5.7.31, 95.7.6.10, 86.8.10.18, planet-lab.org, berkeley.edu, 61.51.166.150]
Editor's Notes
Not a standard protocol, but the best known in teaching and the easiest to explain. It hardly ever occurs in the wild, yet it is the most frequently cited.
Many measurement studies (e.g. Tran-Gia, Saroiu, etc.) have observed that the rate at which peers join and leave P2P systems is very high. This raises additional concerns, especially where peers are assigned particular responsibilities and the connectivity of the overlay is not high enough to ensure that the system will not be partitioned.

Overlay network design should not ignore the heterogeneity in node capabilities and behavior. Schemes that require homogeneous components either reduce the system's capabilities to those achievable by the weakest components, or must expect faulty or inefficient operation from the least capable nodes. Moreover, the observed variation in node behavior (e.g. up-time patterns) should be taken into account in the design of the overlay to increase the efficiency of the system.

Load balance is the extent to which the load is evenly spread across nodes. The load accounted for consists of the effort required for the basic overlay operations, e.g. maintenance, routing, indexing, caching, etc. Designing an overlay that avoids communication hot spots can increase the performance and the fault tolerance of the overall system. On the other hand, in a heterogeneous environment not all nodes can offer the same amount of resources. A fair solution should provide incentives and a weighted balance between the resources a peer contributes and the overlay services it consumes.

Security is the ability of a system to manage, protect and distribute sensitive information. In the context of P2P overlays, security issues are mainly raised by the presence of malicious peers, which do not forward received search requests or forward them in the wrong direction. Selfish peers can behave in ways that have similar results.

Anonymity is the degree to which a system or component allows for (or supports) anonymous transactions. This is a special requirement for applications that need anti-censorship features or increased privacy for the participating users. Of course, misuse of such systems is always an issue (e.g. they can be used to share illegal or inappropriate content).

(Introduce next slide) Meeting the complete set of the aforementioned requirements is not a trivial task. A number of trade-offs appear while designing P2P search services.
A trade-off common to both distributed and local search (although different techniques apply in each case) is the one between the time required to find the requested information and the space required to store it. In distributed systems, communication among peers is the most costly operation with respect to time, so we will mainly consider the number of messages exchanged and not local search operations on data structures. An example of a technique that favors time over space is the complete replication of the indexing information on every peer: no communication is required (at least for searching). An example of a technique that favors space over time is to assign every peer indexing responsibility only for its local content: a peer must then contact every other peer to find all the information it searches for (the Gnutella approach).

Another interesting trade-off appears between the demands for security and privacy. Many security techniques require logging detailed information on the interactions among peers, which makes it easy to trace a peer's operations. Systems that provide anonymity for users' actions have the opposite effect.

More specifically for search, as already mentioned, two main operations have been proposed: looking up information that can be mapped to a simple key, and searching for a set of results based on a more complex description. While the first approach can be very efficient in terms of workload and latency, the latter offers richer functionality, but covering all matching items comes at a high communication cost. To address this cost, TTL-based solutions have been proposed to limit the network load. In that case the overlay is not searched exhaustively, and matching content beyond the TTL horizon will not be reported to the requester.
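The time/space and TTL trade-offs described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the five-peer chain overlay, the item name `song.mp3`, and the function names are hypothetical, and the flooding rule (forward to all neighbors except the sender, drop duplicates) is a simplification rather than the exact Gnutella protocol. Cost is measured only in messages, as in the notes.

```python
from collections import deque

# Hypothetical five-peer overlay (a simple chain); only peer "E" holds the item.
OVERLAY = {
    "A": {"neighbors": ["B"], "items": set()},
    "B": {"neighbors": ["A", "C"], "items": set()},
    "C": {"neighbors": ["B", "D"], "items": set()},
    "D": {"neighbors": ["C", "E"], "items": set()},
    "E": {"neighbors": ["D"], "items": {"song.mp3"}},
}


def flood_search(overlay, start, wanted, ttl):
    """TTL-limited flooding: return (peers holding `wanted`, messages sent)."""
    seen = {start}
    hits = set()
    messages = 0
    queue = deque([(start, None, ttl)])
    while queue:
        peer, sender, remaining = queue.popleft()
        if wanted in overlay[peer]["items"]:
            hits.add(peer)
        if remaining == 0:
            continue  # TTL exhausted: the query is not forwarded any further
        for neighbor in overlay[peer]["neighbors"]:
            if neighbor == sender:
                continue  # do not send the query back where it came from
            messages += 1  # every forwarded copy costs one message
            if neighbor not in seen:  # duplicates are received but dropped
                seen.add(neighbor)
                queue.append((neighbor, peer, remaining - 1))
    return hits, messages


def build_replicated_index(overlay):
    """Full replication: the complete item -> peers index every peer would store."""
    index = {}
    for peer, state in overlay.items():
        for item in state["items"]:
            index.setdefault(item, set()).add(peer)
    return index


def replicated_lookup(index, wanted):
    """Lookup in the locally stored full index: no messages are needed."""
    return index.get(wanted, set()), 0


if __name__ == "__main__":
    print(flood_search(OVERLAY, "A", "song.mp3", ttl=2))  # (set(), 2)
    print(flood_search(OVERLAY, "A", "song.mp3", ttl=4))  # ({'E'}, 4)
    print(replicated_lookup(build_replicated_index(OVERLAY), "song.mp3"))  # ({'E'}, 0)
```

With a TTL of 2 the query stops at peer C and the item on E is missed. A TTL of 4 finds it at the price of more messages. The replicated index answers with zero messages, but every peer has to store and maintain the whole index.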
More advanced techniques can provide alternative trade-offs among these factors.

P2P systems are supposed to be composed of autonomous entities. For large-scale systems, however, this comes at a very high cost, either for search operations or for the maintenance of the overlay. For this reason, hierarchical alternatives have been proposed that introduce the concept of “super-peers”. Super-peers are peers with higher responsibilities that serve normal peers in certain aspects of the inter-peer collaboration. This introduces dependencies among the peers, and misbehavior of the super-peers (accidental or not) can cause larger problems.

A different aspect of P2P algorithms is their degree of reliability, which determines the additional operational overhead. This overhead can be required, for example, for the maintenance of the overlay structure that provides the degree of reliability. Deterministic maintenance algorithms require a very specific structure for the overlay. Alternatively, probabilistic approaches do not provide the same level of reliability, but they can provide a certain level on average at a much lower cost since they tolerate more overlay changes.

(Introduce next slide) Until now we have described the environment in which the P2P search algorithms should operate, the general requirements and the resulting trade-offs. As a next step we will investigate the design dimensions of the most crucial component of a P2P system, the overlay.
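The super-peer idea from these notes, and the dependency it introduces, can be sketched roughly as follows. The class `SuperPeer`, the function `search`, the `alive` flag, and the peer and file names are all hypothetical, chosen only for illustration. No specific super-peer protocol is implied.

```python
class SuperPeer:
    """A peer that indexes the content of the normal peers attached to it."""

    def __init__(self, name):
        self.name = name
        self.index = {}    # item -> set of normal peers offering it
        self.alive = True  # hypothetical flag to simulate failure or misbehavior

    def register(self, peer, items):
        """A normal peer announces its local content to this super-peer."""
        for item in items:
            self.index.setdefault(item, set()).add(peer)

    def lookup(self, item):
        if not self.alive:
            raise ConnectionError(f"super-peer {self.name} is unreachable")
        return set(self.index.get(item, set()))


def search(super_peers, item):
    """Query every super-peer once; return (providers, messages, unreachable)."""
    providers, messages, unreachable = set(), 0, []
    for sp in super_peers:
        messages += 1  # one query message per super-peer, not per normal peer
        try:
            providers |= sp.lookup(item)
        except ConnectionError:
            unreachable.append(sp.name)
    return providers, messages, unreachable


if __name__ == "__main__":
    sp1, sp2 = SuperPeer("sp1"), SuperPeer("sp2")
    sp1.register("p1", ["a.txt"])
    sp1.register("p2", ["b.txt"])
    sp2.register("p3", ["a.txt"])

    print(search([sp1, sp2], "a.txt"))  # both providers found with 2 messages
    sp2.alive = False                   # a single super-peer fails
    print(search([sp1, sp2], "a.txt"))  # p3 is no longer discoverable
```

Searching costs one message per super-peer instead of one per peer, which is the benefit of the hierarchy. The second call shows the cost: once `sp2` is gone, the content of every normal peer registered only with it becomes invisible, even though those peers are still online.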