P2P systems fall into two major categories: centralized and decentralized. Napster is an example of a centralized system, in which one server keeps the information about all the other peers. Decentralized systems are further divided into structured and unstructured. Unstructured systems follow no fixed scheme for file placement and do not optimize the search algorithm; as a result, they flood queries through the network and increase congestion. Structured systems, in contrast, follow particular algorithms to locate a file in the network.
Napster was the start of P2P, and it could share only music files. Every node uploads the list of its shared files to the server, and whenever a peer searches for a file, the server replies with the list of nodes that hold it. The user then connects directly to the remote peer and starts the download. However, if the remote peer is behind a firewall, the requesting peer reports this to the server, the server forwards the request to the remote peer, and the requesting node then waits for the remote peer to connect to it so that the file can be transferred.
Issues with Napster: since a single server maintains the list, that server is a single point of failure and is therefore prone to denial of service. However, as long as the server is working properly it returns correct results, because the file lists are uploaded directly to it. Search is centralized, but the file transfer is peer to peer.
Gnutella, in contrast to Napster, could share any type of file. The search is decentralized.
Since the system is completely decentralized, there is no single point of failure, and it is less prone to denial of service. However, it cannot guarantee correct results: a node may hold the requested file, but if the query's TTL expires before the request reaches that node, the searching peer never learns about the file. It also increases network congestion, because every query is broadcast to all neighbors.
FastTrack connects different networks together: each network has a super node that keeps the information about all the files shared by the nodes in that network.
Peer-to-Peer Content Delivery Network
Seminar by: Anand Babu
int82657@stud.uni-stuttgart.de
Institute for Parallel and Distributed Systems (IPVS)
University of Stuttgart
05/01/13, Peer to Peer Content Delivery Networks
Outline
- Motivation
- Traditional Approaches
- P2P Architecture
- Types of P2P
  - Centralized
  - Decentralized
    - Unstructured
    - Structured
- Summary
- References
Motivation
- Millions of users want to download the same popular, huge files (for free)
- E.g.:
  - Film, video and music
  - Media content from broadcasters
  - Personal content
  - Software
  - Institutions
[Figure: Client-Server. Labels: Router, "Interested" End-host, Source.]
[Figure: Client-Server, source overloaded. Labels: Router, "Interested" End-host, Source, "Overloaded!".]
[Figure: IP multicast. Labels: Router, "Interested" End-host, Source.]
[Figure: End-host based multicast. Labels: Router, "Interested" End-host, Source.]
End-host based multicast
- "Single-uploader" -> "Multiple-uploaders"
- A node that has downloaded the file then uploads it to other nodes
- Uploading costs are amortized across all nodes
- Also called "Application-level Multicast"
- Many protocols were proposed in the early 2000s: Yoid (2000), Narada (2000), Overcast (2000), ALMI (2001)
- All use single trees
- Problem with single trees?
End-host multicast using single tree
[Figure, shown over three slides: single tree with the Source; the last frame is labelled "Slow data transfer".]
Why is P2P CDN important?
- P2P accounts for a significant share of Internet traffic today
- In 2004, P2P made up 60% of total traffic (source: CacheLogic)
- Slightly lower share in 2005 (possibly because of legal action), but still significant
- BT (BitTorrent) is the most popular P2P protocol (30% in 2004)
- Well-known BT users:
Peer-to-Peer System
- All nodes are both clients and servers
- No centralized data source
- Scalable
- Resistant to flash crowds
- Cost effective
Types of Peer-to-Peer Systems
- Centralized
  - Napster
- Decentralized
  - Gnutella
  - Fast-track
- Structured
  - Freenet
  - Chord
  - Pastry
Napster
- Only MP3 files
- Each peer uploads its file list, and the Napster database is updated periodically
- The user sends a search request to the server
- The server replies with information about the nodes that hold the file
- The user connects directly to the remote peer and starts the download
Napster -- continued
- Search is centralized and dynamic
- File transfer is direct (peer to peer)
- Pros and cons:
  - Fast, efficient and up to date (no stale links)
  - Single point of failure
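The centralized lookup described above can be sketched as a small in-memory index. This is only an illustration of the idea, not Napster's actual protocol; the class name `CentralIndex` and its methods are hypothetical.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical central server: maps each file name to the peers sharing it."""

    def __init__(self):
        self._holders = defaultdict(set)

    def publish(self, peer, filenames):
        # A peer uploads the list of files it shares.
        for name in filenames:
            self._holders[name].add(peer)

    def unpublish(self, peer):
        # Called when a peer leaves, so the index holds no stale links.
        for holders in self._holders.values():
            holders.discard(peer)

    def search(self, name):
        # The server only answers "who has it"; the transfer itself is peer to peer.
        return sorted(self._holders.get(name, ()))


index = CentralIndex()
index.publish("peer1", ["a.mp3", "b.mp3"])
index.publish("peer2", ["b.mp3"])
```

Every search goes through this one object, which is why the server is both the source of up-to-date answers and the single point of failure.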
Gnutella
- Shares any type of file
- Decentralized search
- A request is sent to the neighbors (flooding)
- Each neighbor forwards it to its own neighbors
- When the TTL expires, the request is dropped
- Users with a matching file reply
Gnutella -- continued
- Decentralized system
  - No single point of failure
  - Less prone to denial of service
- Flooding queries
  - Increases network congestion
  - A search reaches only a subset of peers because of the TTL
  - Privacy is compromised, as peers are able to see search queries
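A minimal sketch of TTL-limited flooding, assuming the overlay is given as a plain adjacency dictionary. The function name `flood_search` and the toy topology are illustrative and not part of the Gnutella protocol.

```python
from collections import deque


def flood_search(graph, files, start, wanted, ttl):
    """Return the peers holding `wanted` that a flood with the given TTL reaches."""
    hits = set()
    seen = {start}
    queue = deque([(start, ttl)])
    while queue:
        node, remaining = queue.popleft()
        if wanted in files.get(node, ()):
            hits.add(node)
        if remaining == 0:
            continue  # TTL used up: the query is not forwarded any further
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, remaining - 1))
    return hits


# Toy chain A - B - C - D, where only D holds the file.
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
files = {"D": {"song.mp3"}}
```

With a TTL of 2 the query from A stops at C and the file on D is never found; with a TTL of 3 it is. This is the "search only reaches a subset of peers" limitation in miniature.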
Fast-track
- Hybrid of centralized Napster and decentralized Gnutella
- Super nodes act as local search servers
  - Each super node acts as a Napster server for a small network
  - Super nodes are chosen according to their capacity and availability
- Users upload the list of their shared files to a super node
- Super nodes exchange the lists periodically
- A peer sends its query to a super node
BitTorrent
- "Pull-based"
  - Each file is split into smaller pieces
  - Nodes pull the pieces they want
  - Pieces are not downloaded in sequential order
  - Previous multicast schemes aimed to support "streaming"; BitTorrent does not
- "Swarming" approach
- Encourages contribution by all nodes
Basic Components
- Seed
  - Peer that has the entire file
- Leecher
  - Peer that has an incomplete copy of the file
- A torrent file
  - Passive component
  - Contains metadata about the file to be downloaded and the trackers
  - Typically hosted on a web server
- A tracker
  - Central component
  - Returns a random list of peers with state information (completed or downloading)
Data types
- All data used in BitTorrent communication is bencoded.
- Integer: 2011
  Bencoded: i2011e
- String: "Something"
  Bencoded: 9:Something
- List: [1337, "DEF", "CON"]
  Bencoded: li1337e3:DEF3:CONe
- Dictionary: Dictionary["uname"] = "hpcbabu", Dictionary["password"] = "default"
  Bencoded (keys in sorted order): d8:password7:default5:uname7:hpcbabue
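The four data types can be encoded with a few lines of Python. This is a minimal sketch under the standard bencoding rules (a string is its length, a colon, then the bytes; dictionary keys are written in sorted order); the function name `bencode` is an illustrative choice.

```python
def bencode(value):
    """Encode an int, str/bytes, list or dict as bencoded bytes."""
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Keys are byte strings and must appear in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        body = b"".join(bencode(k) + bencode(v) for k, v in items)
        return b"d" + body + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")
```

For example, `bencode([1337, "DEF", "CON"])` yields `b"li1337e3:DEF3:CONe"`, and a dictionary with the keys `uname` and `password` is emitted with `password` first.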
Contents of .torrent file
- Piece length: usually 256 KB
- Pieces: the SHA-1 hash of each piece in the file
  - For reliability
- Announce list: the URLs of all trackers
- The piece length and pieces information are fixed, while the announce list is dynamic.
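How the pieces field can be derived from the file content is sketched below. It assumes the whole file is already in memory and uses the 256 KB piece length mentioned above; the function name `piece_hashes` is hypothetical.

```python
import hashlib


def piece_hashes(data: bytes, piece_length: int = 256 * 1024) -> bytes:
    """Concatenate the 20-byte SHA-1 digest of every piece of `data`."""
    digests = []
    for offset in range(0, len(data), piece_length):
        piece = data[offset:offset + piece_length]  # the last piece may be shorter
        digests.append(hashlib.sha1(piece).digest())
    return b"".join(digests)
```

A downloader recomputes the SHA-1 of each piece it receives and compares it with the corresponding 20 bytes, which is what makes the hashes useful for reliability.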
The big picture
[Figure: Web Server, Tracker, Bob, Downloader A, Seeder B, Downloader C, and the file Harry Potter.torrent.]
Request and Response: Scrape
- Scrape request, e.g.:
  http://example.com/scrape.php?info_hash=aaaaaaaaaaaaaaaaaaaa&info_hash=bbbbbbbbbbbbbbbbbbbb&info_hash=cccccccccccccccccccc
- Scrape response, e.g.:
  d5:filesd20:....................d8:completei5e10:downloadedi50e10:incompletei10eeee
- Meaning: 5 seeders, 10 leechers, and 50 completed downloads
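To read such a scrape response, a client has to decode the bencoded dictionary. The decoder below is a minimal sketch without error handling for malformed input; the name `bdecode` and the placeholder info hash are assumptions made for the example.

```python
def bdecode(data: bytes, pos: int = 0):
    """Decode one bencoded value starting at `pos`; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":  # integer: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":  # list: l<items>e
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":  # dictionary: d<key><value>...e
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            value, pos = bdecode(data, pos)
            result[key] = value
        return result, pos + 1
    # Otherwise a byte string: <length>:<bytes>
    colon = data.index(b":", pos)
    length = int(data[pos:colon])
    start = colon + 1
    return data[start:start + length], start + length


# Placeholder 20-byte info hash standing in for the dots in the slide.
info_hash = b"a" * 20
response = (b"d5:filesd20:" + info_hash +
            b"d8:completei5e10:downloadedi50e10:incompletei10eeee")
decoded, end = bdecode(response)
stats = decoded[b"files"][info_hash]
```

Here `stats` holds the three counters for the torrent: `complete` (seeders), `incomplete` (leechers) and `downloaded` (completed downloads).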
Request and Response: Announce
- Announce request, e.g.:
  http://some.tracker.com:999/announce?info_hash=12345678901234567890&peer_id=ABCDEFGHIJKLMNOPQRST&ip=255.255.255.255&port=6881&downloaded=0&uploaded=0&left=98765&event=started
- Announce response: the tracker response is a bencoded dictionary that has two keys, interval and peers.
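An announce URL like the one above can be assembled with the standard library. This is a sketch only: the helper name `announce_url` is hypothetical, the optional `ip` parameter is left out, and no request is actually sent.

```python
from urllib.parse import urlencode


def announce_url(tracker, info_hash, peer_id, port, left,
                 uploaded=0, downloaded=0, event="started"):
    """Build the tracker announce URL; info_hash and peer_id are 20-byte values."""
    params = {
        "info_hash": info_hash,  # raw bytes, percent-encoded by urlencode
        "peer_id": peer_id,
        "port": port,
        "downloaded": downloaded,
        "uploaded": uploaded,
        "left": left,
        "event": event,
    }
    return tracker + "?" + urlencode(params)


url = announce_url("http://some.tracker.com:999/announce",
                   b"12345678901234567890", b"ABCDEFGHIJKLMNOPQRST",
                   port=6881, left=98765)
```

A real info hash is a binary SHA-1 digest, so most of its bytes end up percent-encoded in the URL; the printable hash in the slide is just a readable stand-in.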
Peer Wire Protocol (TCP): exchange of pieces
- The file is split into pieces and sub-pieces, which are downloaded from different peers.
- Each client needs to maintain state information for each peer. This state looks like:
  - am_choking: this client is choking the peer
  - am_interested: this client is interested in the peer
  - peer_choking: the peer is choking this client
  - peer_interested: the peer is interested in this client
Steps in PWP:
- Handshaking
- Message communication
- Pipelining
- Piece selection strategy
- Peer selection strategy
  - Choking and optimistic unchoking
  - Anti-snubbing
  - Upload-only mode
- Endgame mode
Messaging
- Initial handshake message: <pstrlen><pstr><reserved><info_hash><peer_id>
- A UDP ping request/response.
- All other messages are sent over TCP and are of the form: <length prefix><message ID><payload>
- Examples:
  - request: <len=0013><id=6><index><begin><length>
  - have: <len=0005><id=4><piece index>
  - choke: <len=0001><id=0>
  - bitfield: <len=0001+X><id=5><bitfield>
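The message layouts above map directly onto byte strings. The sketch below packs them with `struct`; the helper names are illustrative, and the protocol string `BitTorrent protocol` with 8 zeroed reserved bytes is taken from the BitTorrent protocol specification rather than from the slide.

```python
import struct

PSTR = b"BitTorrent protocol"  # 19-byte protocol identifier


def handshake(info_hash: bytes, peer_id: bytes) -> bytes:
    """<pstrlen><pstr><reserved><info_hash><peer_id>"""
    return struct.pack(">B", len(PSTR)) + PSTR + bytes(8) + info_hash + peer_id


def message(msg_id: int, payload: bytes = b"") -> bytes:
    """<length prefix><message ID><payload>; the length covers ID plus payload."""
    return struct.pack(">IB", 1 + len(payload), msg_id) + payload


def choke() -> bytes:
    return message(0)


def have(piece_index: int) -> bytes:
    return message(4, struct.pack(">I", piece_index))


def request(index: int, begin: int, length: int) -> bytes:
    return message(6, struct.pack(">III", index, begin, length))
```

All integers are big-endian and four bytes wide, which is why a request carries a length prefix of 13: one byte for the ID and twelve for the three fields.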
Pipelining
- Keep several unfulfilled requests pending on each connection
- This cuts down the round-trip delay between pieces
- This scheme has been found to saturate most connections in practice
- Extremely efficient over slow lines
- Default: 5 pending requests
Piece Selection
- Critical for performance
- If a bad algorithm is used, all the effort goes to waste
- Until a piece is assembled, only download sub-pieces of that piece
- This policy lets complete pieces assemble quickly
Rarest Piece First
- Policy: determine the pieces that are rarest among your peers and download those first
- This ensures that the most common pieces are left until the end of the download
- Rarest first also ensures that a large variety of pieces is downloaded from the seed
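The policy can be sketched as a count over what each peer advertises. This assumes each peer's holdings are available as a set of piece indices; the function name `rarest_first` is hypothetical, and ties are broken by the lowest index only to keep the example deterministic.

```python
from collections import Counter


def rarest_first(my_pieces, peer_pieces):
    """Pick the missing piece that the fewest peers hold, or None if none is left."""
    counts = Counter()
    for pieces in peer_pieces.values():
        counts.update(pieces)
    candidates = [p for p in counts if p not in my_pieces]
    if not candidates:
        return None
    return min(candidates, key=lambda p: (counts[p], p))


# Piece 0 is held by three peers, piece 1 by two, piece 2 by one.
peers = {"p1": {0, 1, 2}, "p2": {0, 1}, "p3": {0}}
```

Starting with nothing, the client picks piece 2 first; once it holds piece 2, the next choice is piece 1, so the most common piece is left until last.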
Random First Piece
- Initially, a peer has nothing to trade
- It is important to get a complete piece as soon as possible
- Rare pieces are typically available at fewer peers, so downloading a rare piece first is not a good idea
- Policy: select a random piece of the file and download it
Endgame Mode
- Policy: the last blocks generally trickle in slowly. To speed this up, send a request for all the missing blocks to every peer.
- Send a cancel message to all other peers whenever a block arrives.
- This ensures that a download is not kept from completing by a single peer with a slow transfer rate.
- Some bandwidth is wasted, but in practice not too much.
Choking
- Choking is a temporary refusal to upload; downloading continues as normal
- Tit-for-tat strategy
- Peer A is said to choke peer B if A decides not to upload to B
- Each peer (say A) unchokes a fixed number of peers at any time (default: 4)
  - The three with the largest upload rates to A
    - This is where the tit-for-tat comes in
  - Another, randomly chosen one (optimistic unchoke)
    - To periodically look for better choices
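The unchoke decision can be sketched as "top three by observed rate, plus one random extra". This is a simplified illustration, not a client's actual scheduler: the function name `choose_unchoked`, its parameters, and the sample rates are assumptions, and the periodic re-evaluation timers are left out.

```python
import random


def choose_unchoked(rates_to_us, interested, regular_slots=3, rng=random):
    """Return (regular unchokes, optimistic unchoke) for one choking round."""
    # Tit-for-tat: prefer the peers that currently upload to us the fastest.
    ranked = sorted(interested, key=lambda p: rates_to_us.get(p, 0.0), reverse=True)
    regular = ranked[:regular_slots]
    rest = ranked[regular_slots:]
    # Optimistic unchoke: one random peer from the rest, to probe for better partners.
    optimistic = rng.choice(rest) if rest else None
    return regular, optimistic


# Hypothetical upload rates of five interested peers towards us, in KB/s.
rates = {"a": 50, "b": 10, "c": 30, "d": 40, "e": 5}
regular, optimistic = choose_unchoked(rates, list(rates))
```

With these rates the regular slots go to a, d and c, and the optimistic slot goes to either b or e, which gives a slow or new peer a chance to prove itself.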
Anti-snubbing
- A peer is said to be snubbed if each of its peers chokes it
- It suffers poor download rates until the optimistic unchoke finds better peers
- If no data arrives from a peer for over a minute, assume that peer has snubbed us
- Do not upload to that peer except as an optimistic unchoke
- More than one concurrent optimistic unchoke allows fast recovery
Upload-Only Mode
- Once the download is complete, a peer has no download rates to use for comparison, nor any need to use them
- The question is: which nodes to upload to?
- Policy: upload to those with the best upload rate
- This ensures that pieces get replicated faster
Pros and cons of BitTorrent
- Pros
  - Proficient in utilizing partially downloaded files
  - Discourages "freeloading"
    - By rewarding the fastest uploaders
  - No infrastructure costs
  - Better resource utilization
  - Works well for "hot content"
Pros and cons of BitTorrent
- Cons
  - The long tail does not work
    - Even worse: no trackers for obscure content
  - Single point of failure: new nodes cannot enter the swarm if the tracker goes down
  - Lack of a search feature
    - Users need to resort to out-of-band search: well-known torrent-hosting sites or plain old web search
Analysis
- Random neighbor selection causes high cross-ISP traffic
- ISP perspective: different links have different costs
- P2P application perspective: no knowledge of the underlying ISP topology
- Performance is no longer optimal if nodes connect only to nodes in the same ISP
- End result: throttling
Challenges/Open questions
- Network-friendly BitTorrent: ISPs inform BitTorrent of their link preferences
- Biased neighbor selection
  - Rarest Piece First suffers
- Move from TCP to UDP: take control of the Internet?
- Legal complexity
Summary
- P2P CDNs can be cost-effective
- They provide better resource utilization
- Challenges:
  - Network congestion
  - Network cost: network-friendly protocols
  - Handling copyright issues
Thank You