3. Peer to Peer
P2P computing is the sharing of computer resources and services by direct
exchange between systems.
• Peers are equally privileged
• Shared resources include:
– processing power
– disk storage
– network bandwidth
5. Peer to Peer (cont’d)
• All nodes are both clients and servers
– Provide and consume data
– Any node can initiate a connection
– Examples: Skype, social networking apps
6. Peer to Peer (cont’d)
• A distributed system architecture
– No centralized control
– Typically many nodes, but unreliable and heterogeneous
– Nodes are symmetric in function
– Take advantage of distributed, shared resources
(bandwidth, CPU, storage) on peer-nodes
– Fault-tolerant, self-organizing
– Operate in dynamic environment, frequent join and
leave is the norm
7. Types of P2P systems
Pure P2P system: a P2P system that has no central service
of any kind
• I.e., the entire communication occurs among connected peers without any
assistance from any server
Examples of pure P2P systems:
• Workgroups in Microsoft Windows Network
• Freenet
8. Types (cont’d)
Hybrid P2P system: a P2P system which depends partially
on central servers or allocates selected functions to a subset
of dedicated peers
Central servers act as central directories where either connected users or
indexed content can be mapped to the current location
Dedicated peers direct control information among other peers
9. P2P Networks
• P2P networks have historically generated more traffic than any
other Internet application
10. Category of P2P Systems
• Unstructured
– No restriction on overlay structures and data
placement
– Napster, Gnutella, KaZaA, Freenet, BitTorrent
• Structured
– Distributed hash tables (DHTs)
– Place restrictions on overlay structures and data
placement
– Chord, Pastry, Tapestry, CAN
12. BitTorrent
One of many P2P protocols for file sharing.
Created in 2001
Estimated to have accounted for 43% of all Internet traffic at its peak
Many clients implement the BitTorrent protocol
µTorrent, Vuze, BitTorrent
Most use an unstructured P2P network architecture with a centralized
tracker
Most clients have started to implement DHT functions
13. BitTorrent
Creates an application overlay network over the existing Internet infrastructure
Peers trying to download a file make a request to the network and attempt to
connect to as many peers as possible that can serve the file
Resources are not optimized and fairness is a concern
Clients have started to implement DHT as a better way to connect to peers in order
to download files more efficiently.
When new files are added to the network, small data requests are carried out over TCP
connections to different machines in order to share the load of the initial seeder.
Trackers assist in the communication between peers
A DHT removes the need for trackers
15. The .torrent File
Static file storing the necessary meta-information:
Name
Size
Checksums
The content is divided into many “chunks” (e.g., 1/4 megabyte each)
Each chunk is hashed to a checksum value
When a peer later gets the chunks (from other peers), it can check their
authenticity by comparing the checksums
IP address and port of the Tracker
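The per-chunk integrity check described above can be sketched in Python. This is a minimal illustration: the 1/4 MB chunk size and SHA-1 hash follow common BitTorrent practice, and the sample content is made up.

```python
import hashlib

CHUNK_SIZE = 256 * 1024  # e.g., 1/4 megabyte per chunk

def piece_hashes(data: bytes) -> list[bytes]:
    # Hash every chunk; the list of digests is stored in the .torrent file
    return [hashlib.sha1(data[i:i + CHUNK_SIZE]).digest()
            for i in range(0, len(data), CHUNK_SIZE)]

def verify_piece(piece: bytes, expected_digest: bytes) -> bool:
    # A downloader recomputes the checksum and compares it with the metadata
    return hashlib.sha1(piece).digest() == expected_digest

content = b"x" * (3 * CHUNK_SIZE)   # stand-in for real file content
hashes = piece_hashes(content)
print(verify_piece(content[:CHUNK_SIZE], hashes[0]))   # True
print(verify_piece(b"y" * CHUNK_SIZE, hashes[0]))      # False: corrupted chunk
```

Because each chunk is verified independently, a peer can safely download different chunks from different, untrusted peers.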
16. Tracker
Keeping track of peers
To allow peers to find one another
To return a random list of active peers
17. BitTorrent – joining a torrent
Peers divided into:
seeds: have the entire file
leechers: still downloading
1. Obtain the metadata file
2. Contact the tracker
3. Obtain a peer list (contains seeds & leechers)
4. Contact peers from that list for data
19. Choking
Ensures that nodes cooperate and prevents the free-riding problem.
Goal is to have several bidirectional connections running continuously.
Choking is a temporary refusal to upload; downloading continues as normal.
The connection is kept open so that setup costs are not borne again and again.
At any given time, only the 4 best peers are unchoked.
Evaluation of whom to choke/unchoke is performed every 10 seconds.
An optimistic unchoke is performed every 30 seconds
to give newly joined peers a chance to get data to download
20. Choking Algorithm
Goal is to have several bidirectional connections running continuously
Upload to peers who have uploaded to you recently
Unutilized connections are uploaded to on a trial basis to see if better
transfer rates can be found through them
23. Kademlia
Distributed Hash Table for decentralized peer to peer computer network designed
by Petar Maymounkov and David Mazières in 2002
Specifies the structure of the network and the exchange of information through
node lookups.
Kademlia nodes communicate among themselves using UDP.
Each node is identified by a number or node ID
The node ID serves not only as identification, but the Kademlia algorithm uses the
node ID to locate values (usually file hashes or keywords).
24. Distance Calculation
Kademlia uses a "distance" calculation between two nodes
Distance is computed as the exclusive or (XOR) of the two node IDs
Keys and node IDs have the same format and length
Exclusive or was chosen because it acts as a distance function between all the node IDs.
Specifically:
The distance between a node and itself is zero
It is symmetric: the "distances" calculated from A to B and from B to A are the same
It follows the triangle inequality: given A, B and C are vertices (points) of a triangle, the
distance from A to B is shorter than (or equal to) the sum of the distance from A to C and the
distance from C to B.
A basic Kademlia network with 2^n nodes will take only n steps (in the worst case)
to find a node.
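The three metric properties listed above can be checked directly; the 4-bit node IDs below are arbitrary examples.

```python
def xor_distance(a: int, b: int) -> int:
    # Kademlia's distance metric: bitwise XOR of the two node IDs
    return a ^ b

a, b, c = 0b1011, 0b0110, 0b0010   # arbitrary small node IDs

assert xor_distance(a, a) == 0                    # distance to itself is zero
assert xor_distance(a, b) == xor_distance(b, a)   # symmetric
# Triangle inequality holds because a^b == (a^c) ^ (c^b),
# and the XOR of two numbers never exceeds their sum:
assert xor_distance(a, b) <= xor_distance(a, c) + xor_distance(c, b)
print(xor_distance(a, b))   # 13, i.e. 0b1101
```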
25. Routing tables
Consist of a list for each bit of the node ID
e.g. if a node ID consists of 128 bits, a node will keep 128 such lists
Every entry in a list holds the necessary data to locate another node.
Data in list contains
IP address, port, and node ID of another node
The nth list must have a differing nth bit from the node's ID
The first n-1 bits of the candidate ID must match those of the node's ID
The first list can draw far-away candidates from 1/2 of the nodes in the network
The next list can use only 1/4 of the nodes in the network (one bit closer than the first), etc.
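Which of those lists a given contact belongs in is determined by the position of the first differing bit; a sketch, using small 4-bit IDs for readability (a real Kademlia deployment would use 128 or 160 bits).

```python
def bucket_index(my_id: int, other_id: int, id_bits: int) -> int:
    # Index of the list holding this contact: the shared-prefix length.
    # List 0 covers the far half of the ID space, list 1 the next quarter, etc.
    return id_bits - (my_id ^ other_id).bit_length()

me = 0b1010
print(bucket_index(me, 0b0011, id_bits=4))  # 0: top bit differs (1/2 of the network)
print(bucket_index(me, 0b1100, id_bits=4))  # 1: one bit closer (1/4 of the network)
print(bucket_index(me, 0b1001, id_bits=4))  # 2: two leading bits shared
```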
29. Protocol messages
Kademlia has four messages
• PING — used to verify that a node is still alive.
• STORE — Stores a (key, value) pair in one node.
• FIND_NODE — The recipient of the request will return the k nodes in its own
buckets that are the closest ones to the requested key.
• FIND_VALUE — Same as FIND_NODE, but if the recipient of the request has the
requested key in its store, it will return the corresponding value.
Each RPC message includes a random value from the initiator. This ensures that
when the response is received it corresponds to the request previously sent
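The request/response matching mentioned above can be sketched as follows; the message fields here are hypothetical and do not represent Kademlia's wire format.

```python
import os

pending = {}  # nonce -> outstanding request

def send_rpc(rpc_type: str, **fields) -> dict:
    # Attach a random value so the eventual response can be matched to this request
    msg = {"type": rpc_type, "nonce": os.urandom(8).hex(), **fields}
    pending[msg["nonce"]] = msg
    return msg

req = send_rpc("FIND_NODE", target="a1b2c3")
reply = {"nonce": req["nonce"], "nodes": []}   # a valid reply echoes the nonce
print(reply["nonce"] in pending)               # True: reply matches our request
```

Echoing the random value also makes it harder for a third party to forge responses to requests it never saw.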
30. Use in file sharing networks
Kademlia is used in file sharing networks.
If a node wants to share a file
it processes the contents of the file, calculating from it a number (hash) that
will identify this file within the file-sharing network
Hashes and the node IDs must be of the same length
It then searches for several nodes whose IDs are close to the hash and publishes
its own contact information at those nodes
A searching client will use Kademlia to search the network for the node
whose ID has the smallest distance to the file hash, then will retrieve the sources
list that is stored in that node
32. Hash Tables
Store arbitrary keys and satellite data (value)
put(key, value)
value = get(key)
Lookup must be fast
Calculate hash function h() on key that returns a storage cell
Chained hash table: store the key (and optional value) there
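A minimal chained hash table matching the put/get interface above:

```python
class ChainedHashTable:
    def __init__(self, n_cells: int = 64):
        # Each storage cell holds a chain (list) of (key, value) pairs
        self.cells = [[] for _ in range(n_cells)]

    def _cell(self, key) -> list:
        # h(key) selects a storage cell
        return self.cells[hash(key) % len(self.cells)]

    def put(self, key, value) -> None:
        cell = self._cell(key)
        for i, (k, _) in enumerate(cell):
            if k == key:
                cell[i] = (key, value)   # overwrite an existing key
                return
        cell.append((key, value))

    def get(self, key):
        for k, v in self._cell(key):
            if k == key:
                return v
        raise KeyError(key)

table = ChainedHashTable()
table.put("song.mp3", "peer-42")
print(table.get("song.mp3"))   # peer-42
```

A DHT distributes exactly this structure: instead of an array of cells on one machine, the key space is partitioned across nodes.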
33. Distributed hash table
• Employ a globally consistent protocol to ensure that any node can efficiently route a
search to some peer that has a desired file. This guarantee necessitates a more
structured pattern of overlay links. The most common form is the Distributed Hash
Table (DHT).
• A DHT is a lookup service that allows any participating node to efficiently retrieve the
value associated with a given key, whether the file is new or older/rarer.
• Maintaining the mappings from keys to values is distributed among the nodes in such
a way that any change in the set of participants causes a minimal amount of disruption
• Allows for continual node arrival and departure; fault tolerant
34. Chord (peer-to-peer)
Introduced in 2001 by Ion Stoica, Robert Morris, David Karger, Frans Kaashoek,
and Hari Balakrishnan
Protocol and algorithm for a peer-to-peer distributed hash table
Distributed hash table stores key-value pairs
By assigning keys to different computers (known as "nodes")
A node will store the values for all the keys for which it is responsible
Chord specifies how keys are assigned to nodes
How a node can discover the value for a given key by first locating the node
responsible for that key.
35. Chord (peer-to-peer)
Allows nodes to join and leave the network without disruption
The term node is used to refer to both a node itself and its identifier (ID)
without ambiguity
So is the term key
36. Chord (cont'd)
Each node has a successor and a predecessor
Since nodes may disappear from the network, each node records several nodes
preceding it and following it
Each node also maintains information about (at most) m other neighbors, called
fingers, in a finger table
The i-th entry, i = 1, 2, …, m, in the finger table of node N points to the node
whose ID is the smallest value bigger than or equal to N + 2^(i-1) (mod 2^m) in the
clockwise direction
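The rule above can be computed directly. The 12-node ring below is hypothetical, chosen to be consistent with the worked examples on later slides (m = 6, nodes such as N36, N43, N58).

```python
def finger_table(n: int, ring: list[int], m: int) -> list[int]:
    # Entry i (i = 1..m) points to the first node clockwise from
    # n + 2^(i-1) (mod 2^m) on the ID circle
    size = 2 ** m
    def successor(ident: int) -> int:
        return min(ring, key=lambda node: (node - ident) % size)
    return [successor((n + 2 ** (i - 1)) % size) for i in range(1, m + 1)]

ring = [2, 7, 12, 18, 24, 30, 33, 36, 38, 43, 58, 63]  # m = 6, so IDs mod 64
print(finger_table(2, ring, m=6))   # [7, 7, 7, 12, 18, 36]
```

Note how the finger targets 3, 4, 6, 10, 18, 34 double their spacing, so the last fingers reach halfway around the circle.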
37. Chord routing algorithm
The primary goal of the routing algorithm is to quickly locate the node responsible
for a particular key
Chord routing works as follows:
1. A key lookup query is routed along the ID circle
2. Upon receiving a lookup query, the node first checks if the lookup key ID falls
between this node ID + 1 and its successor ID
3. If it does, then the node returns the successor ID as the destination node and
terminates the lookup service
4. Otherwise, the node relays the lookup query to the node in its finger table with ID
closest to, but preceding, the lookup key ID
5. The relaying process proceeds iteratively until the destination node is found
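The five steps above, as an iterative sketch. The 12-node ring is hypothetical but consistent with the worked example on a later slide, where node 2 looking up key 45 routes via N36 and N43 to N58.

```python
def in_arc(x: int, a: int, b: int, size: int) -> bool:
    # True if x lies on the clockwise arc (a, b] of the ID circle
    return 0 < (x - a) % size <= (b - a) % size

def fingers(n: int, ring: list[int], m: int) -> list[int]:
    # Finger i = successor(n + 2^(i-1) mod 2^m); finger 1 is n's successor
    size = 2 ** m
    succ = lambda ident: min(ring, key=lambda node: (node - ident) % size)
    return [succ((n + 2 ** (i - 1)) % size) for i in range(1, m + 1)]

def chord_lookup(key: int, start: int, ring: list[int], m: int):
    size, node, path = 2 ** m, start, [start]
    while True:
        table = fingers(node, ring, m)
        if in_arc(key, node, table[0], size):   # key in (node, successor]: done
            path.append(table[0])
            return table[0], path
        # Otherwise relay to the finger closest to, but preceding, the key
        node = max((f for f in table if in_arc(f, node, key, size)),
                   key=lambda f: (f - node) % size)
        path.append(node)

ring = [2, 7, 12, 18, 24, 30, 33, 36, 38, 43, 58, 63]
print(chord_lookup(45, 2, ring, m=6))   # (58, [2, 36, 43, 58])
```

Each hop at least halves the remaining clockwise distance to the key, which is where the O(log N) bound comes from.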
38. Chord (cont'd)
As a finger table stores at most m entries, its size is independent of the
number of keys or nodes in the network
The Chord routing algorithm exploits the information stored in the finger
table of each node
• A node forwards queries for a key K to the closest predecessor of K on the ID
circle according to its finger table
• For distant keys K, queries are routed over large distances on the ID circle in a
single hop
• The closer the query gets to K, the more accurate the routing information of the
intermediate nodes on the location of K becomes
39. Chord (cont'd)
It has been shown that the number of routing steps in Chord is on the order
of O(log N), where N is the total number of nodes
According to the change-of-base rule, when we talk about logarithmic growth, the base
of the logarithm is not important:
log_a N = C · log_b N, where C = log_a b; a, b > 0; a, b ≠ 1
40. Chord join/leave mechanisms
Nodes join as follows:
1 The newly arrived node first uses consistent hashing to generate its ID
2 It then contacts the bootstrapping server to lookup the successor ID
3 This successor node becomes new node's successor node
4 The joining node is inserted into the overlay and takes on part of the successor
node's load
5 The new node uses a stabilization protocol to verify its finger table
To validate and update successor pointers as nodes join and leave
the system, the stabilization protocol is executed periodically in the
background on individual nodes
When a node detects the failure of a finger during a lookup, it
chooses the next best preceding node from its finger table
43. Chord (cont'd)
m = 6 (i.e., modulo 2^m = 64); 12 nodes; node 2 looks up key
45
(1) N36 is the closest to key 45; (2) N43 immediately precedes key 45;
(3) N58 is the first successor of key 45 on the circle
44. Chord (cont'd)
m = 6 (i.e., modulo 2^m = 64); 12 nodes; node 12 looks up key
45
(1) N30 immediately precedes key 45; (2) N38 immediately precedes key 45; (3)
N43 immediately precedes key 45; (4) N58 is the first successor of key 45 on the
circle
47. Pastry
Proposed in 2001 by Antony Rowstron and Peter Druschel
Was developed at Microsoft Research, Ltd., Rice University, Purdue
University, and University of Washington
Assigns IDs to nodes, just like Chord (using a virtual ring).
Structured P2P overlay in which objects can be efficiently located and
lookup queries efficiently routed
48. Pastry
Node IDs are 128-bit unsigned integers representing position in the circular
key-space
A routing overlay network is formed on top of the hash table
by each peer discovering and exchanging state information
Each node's state consists of a
• leaf set
• neighborhood set
• routing table
The leaf set consists of the L/2 closest peers by node ID in each direction
around the circle.
49. Pastry (cont'd)
Routing table contains ⌈log_{2^b} N⌉ rows with 2^b columns, where N is the
total number of Pastry nodes
• Contains the information needed to contact particular nodes, e.g., IP address
• The entries in row j refer to nodes whose IDs share the present node's ID in only the first j
digits
• Similar to Chord's finger table, it stores links into the ID space
Leaf set: each node knows its successor(s) and predecessor(s)
Like Chord's successor list
Neighborhood set maintains information about nodes that are close
together in terms of network locality
E.g., number of IP hops, Round-Trip Time (RTT) values
50. Pastry routing algorithm
The primary goal of the routing algorithm is to quickly locate the node
responsible for a particular key
Pastry routing works as follows:
1. Given a message with its key, the node first checks its leaf set
2. If there is a node whose ID is closest to the key, the message is forwarded directly
to that node
3. If the key is not covered by the leaf set, then the node checks the routing table and
the message is forwarded to a node that shares a common prefix with the key by
at least one more digit
4. This way, within ⌈log_{2^b} N⌉ steps, the message can reach its destination node
Thus, the number of routing steps in Pastry is on the order of O(log N)
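Step 3 above (prefix routing) can be sketched like this. The node IDs come from the worked example on the next slide; the sets of known nodes are made up.

```python
def shared_prefix_len(a: str, b: str) -> int:
    # Number of leading hex digits the two IDs have in common
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def pastry_hop(node: str, key: str, known: list[str]) -> str:
    # Forward to a known node sharing at least one more prefix digit with
    # the key than the current node does (else stay put - a simplification)
    here = shared_prefix_len(node, key)
    better = [n for n in known if shared_prefix_len(n, key) > here]
    return max(better, key=lambda n: shared_prefix_len(n, key)) if better else node

# 63AB routing toward key EB3E, with made-up sets of known nodes:
print(pastry_hop("63AB", "EB3E", ["E123", "9A2C", "5F00"]))   # E123
print(pastry_hop("E123", "EB3E", ["EB17", "E777"]))           # EB17
```

Because every hop fixes at least one more digit, a route needs at most one hop per digit of the ID, giving the ⌈log_{2^b} N⌉ bound.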
51. Pastry (cont'd)
b = 4, base 2^b = 16, N = 10,000 nodes, ⌈log_16 10,000⌉ = 4
rows, node 63AB looks up key EB3E
1. From its routing table, node 63AB gets node E123, which shares 1-digit common
prefix with the key
2. Node E123 checks its routing table and gets node EB17, which shares 2-digit
common prefix with the key
3. Node EB17 then checks its routing table and gets node EB39, which shares 3-digit
common prefix with the key
4. Finally, node EB39 checks its leaf set and forwards the message directly to node
EB3E
52. 63AB → E123 → EB17 → EB39 → EB3E
". . ." represents arbitrary suffixes in base 16
IP address and port number associated with each entry are not shown
53. Pastry join/leave mechanisms
Nodes join as follows:
1 The joining node must know of at least another node already in the system
2 It generates an ID for itself, and sends a join request to the known node
3 The request will be routed to the node whose ID is numerically closest to the new
node ID
4 All the nodes encountered on route to the destination will send their state tables
(routing table, leaf set, and neighborhood set) to the new node
5 The new node will initialize its own state tables, and it will inform appropriate
nodes of its presence
54. Pastry join/leave mechanisms
Nodes leave/failure as follows:
1 Nodes in Pastry may fail or depart without any notice
2 Routing table maintenance is handled by periodically exchanging keep-alive
messages among neighboring nodes
3 If a node is unresponsive for a certain period, it is presumed failed
4 All members of the failed node's leaf set are then notified and they update their
leaf sets
With concurrent node failures, eventual message delivery is guaranteed unless L/2
or more nodes with adjacent IDs fail simultaneously
Parameter L is an even integer with a typical value of 16