P2p cdn


This presentation was given as a seminar to international master's students to introduce the P2P content distribution framework.

  • P2P systems fall into two major categories: centralized and decentralized. An example of a centralized system is Napster, in which one server keeps the information about all the other peers. Decentralized systems are further divided into structured and unstructured. Gnutella and FastTrack are categorized as unstructured because they do not follow any structured scheme for file placement and do not optimize the search algorithm. As a result, they flood queries through the network and increase network congestion, whereas structured systems follow particular algorithms to search for a file in the network.
  • Napster was the start of P2P, and it could share only music files between peers. Every node uploads its list of shared files to the server, and whenever a peer searches for a file, the server replies with the list of nodes containing the file. The user connects directly to the remote peer and starts the download. However, if the remote peer is behind a firewall, the requesting peer sends this information to the server, the server forwards the request to the remote peer, and the requesting node then waits for the remote peer to connect so that it can download the file.
  • Issues with Napster: since a single server maintains the list, the server is a single point of failure and is therefore prone to denial of service. However, it ensures correct results as long as the server is working properly, because the lists are uploaded directly to the server. Search is centralized, but the file transfer is peer to peer.
  • Gnutella can share any type of file, in contrast to Napster. The search is decentralized.
  • Since the system is completely decentralized, there is no single point of failure, and it is less prone to denial of service. However, it cannot ensure correct results: a node may have the requested file, but if the TTL expires before the request reaches that node, the requesting peer never learns about the file. It also increases network congestion, because each query is broadcast to all the neighbors.
  • FastTrack connects different networks together. Each network has a super node that keeps the information about all the files shared by the nodes in that network.

    1. Peer-to-Peer Content Delivery Network
        Seminar by: Anand Babu, int82657@stud.uni-stuttgart.de
        Institute for Parallel and Distributed Systems (IPVS), University of Stuttgart
        05/01/13, Peer to Peer Content Delivery Networks
    2. Outline
        • Motivation
        • Traditional Approaches
        • P2P Architecture
        • Types of P2P: Centralized; Decentralized (Unstructured, Structured)
        • Summary
        • References
    3. Motivation
        • Millions of users want to download the same popular huge files (for free)
        • E.g.: film, video and music; media content from broadcasters; personal content; software; institutions
    4. [Figure: Client-server delivery. Labels: Router, "Interested" End-host, Source]
    5. [Figure: Client-server delivery, source overloaded. Labels: Router, "Interested" End-host, Source]
    6. [Figure: IP multicast. Labels: Router, "Interested" End-host, Source]
    7. [Figure: End-host based multicast. Labels: Router, "Interested" End-host, Source]
    8. End-host based multicast
        • From "single-uploader" to "multiple-uploaders"
        • A node that has downloaded the file will then upload it to other nodes
        • Uploading costs are amortized across all nodes
        • Also called "Application-level Multicast"
        • Many protocols proposed early this decade: Yoid (2000), Narada (2000), Overcast (2000), ALMI (2001)
        • All use single trees
        • Problem with single trees?
    9. [Figure: End-host multicast using a single tree. Label: Source]
    10. [Figure: End-host multicast using a single tree. Label: Source]
    11. [Figure: End-host multicast using a single tree: slow data transfer. Label: Source]
    12. Why is P2P CDN important?
        • P2P consumes a significant amount of Internet traffic today
        • In 2004, total P2P traffic was 60% (Source: Cachelogic)
        • Slightly lower share in 2005 (possibly because of legal action), but still significant
        • BT is the most popular P2P protocol (30% in 2004)
        • Well-known BT users:
    13. Peer-to-Peer System
        • All nodes are both clients and servers
        • No centralized data source
        • Scalable
        • Resistant to flash crowds
        • Cost-effective
    14. Types of Peer-to-Peer Systems
        • Centralized: Napster
        • Decentralized: Gnutella, FastTrack
        • Structured: Freenet, Chord, Pastry
    15. Napster
        • Only MP3
        • Each peer updates its file list, and the Napster database is updated periodically
        • The user sends a search request to the server
        • The server replies with the information of the nodes containing the file
        • The user connects directly to the remote peer and starts the download
    16. Napster (continued)
        • Search is centralized and dynamic
        • File transfer is direct (peer to peer)
        • Pros: fast, efficient and up to date (no stale links)
        • Cons: single point of failure
    17. Gnutella
        • Share any type of file
        • Decentralized search
        • The request is sent to the neighbors (flooding)
        • Each neighbor forwards it to its own neighbors
        • When the TTL expires, the request is not forwarded any further
        • Users with a matching file reply
    18. Gnutella (continued)
        • Decentralized system
        • No single point of failure
        • Less prone to denial of service
        • Flooding queries increases network congestion
        • Search only reaches a subset of peers due to the TTL
        • Compromise in privacy, as peers are able to see search queries
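The TTL-limited flooding described for Gnutella can be illustrated with a small simulation. This is only a sketch under stated assumptions: the topology, the peer names, the `flood_search` helper and the message counting are hypothetical and do not reproduce the actual Gnutella wire protocol.

```python
from collections import deque

def flood_search(graph, files, start, wanted, ttl):
    """Flood a query from `start`, forwarding it at most `ttl` hops.

    graph: dict mapping peer -> list of neighbor peers (hypothetical overlay)
    files: dict mapping peer -> set of shared file names
    Returns (set of peers that have the file, number of query messages sent).
    """
    hits = set()
    seen = {start}
    messages = 0
    queue = deque([(start, ttl)])
    while queue:
        peer, remaining = queue.popleft()
        if peer != start and wanted in files.get(peer, set()):
            hits.add(peer)
        if remaining == 0:
            continue  # TTL expired: the query is not forwarded further
        for neighbor in graph.get(peer, []):
            messages += 1  # every forward costs a message, even a duplicate
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, remaining - 1))
    return hits, messages

# Hypothetical line topology A - B - C - D; only D shares the file.
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
files = {"D": {"song.mp3"}}

near_hits, near_msgs = flood_search(graph, files, "A", "song.mp3", ttl=2)
far_hits, far_msgs = flood_search(graph, files, "A", "song.mp3", ttl=3)
```

With a TTL of 2 the query stops at C and the file on D is never found, while a TTL of 3 finds it at the cost of more messages. This is the trade-off between search coverage and congestion noted on the slide.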
    19. FastTrack
        • Hybrid of centralized Napster and decentralized Gnutella
        • Super nodes act as local search servers
        • Each super node acts as a Napster server for a small network
        • Super nodes are chosen according to their capacity and availability
        • Users upload the list of shared files to a super node
        • Super nodes exchange the lists periodically
        • Peers send their queries to the super node
    20. BitTorrent
        • "Pull-based"
        • Each file is split into smaller pieces
        • Nodes pull the desired pieces
        • Pieces are not downloaded in sequential order
        • Previous multicast schemes aimed to support "streaming"; BitTorrent does not
        • "Swarming" approach
        • Encourages contribution by all nodes
    21. Basic Components
        • Seed: peer that has the entire file
        • Leecher: peer that has an incomplete copy of the file
        • A torrent file: passive component; contains metadata about the file to be downloaded and the peers; typically hosted on a web server
        • A tracker: central component; returns a random list of peers with state information (completed or downloading)
    22. Data types
        • All the data used in BitTorrent communication is bencoded
        • Integer: 2011. Bencoded: i2011e
        • String: "Something". Bencoded: 9:Something
        • List: List[0]=1337, List[1]="DEF", List[2]="CON". Bencoded: li1337e3:DEF3:CONe
        • Dictionary: Dictionary["uname"]="hpcbabu", Dictionary["password"]="default". Bencoded form (keys in sorted order): d8:password7:default5:uname7:hpcbabue
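The four bencoding data types can be produced with a short recursive encoder. The following is a minimal sketch, not a reference implementation; the function name `bencode` is an assumption. Note that a length-prefixed string has no space after the colon, and that dictionary keys are emitted in sorted order, so `password` precedes `uname`.

```python
def bencode(value):
    """Encode ints, strings/bytes, lists and dicts in bencoding."""
    if isinstance(value, bool):
        raise TypeError("bencoding has no boolean type")
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        # Byte strings are length-prefixed: <length>:<bytes>
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Keys must be byte strings and appear in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        body = b"".join(bencode(k) + bencode(v) for k, v in items)
        return b"d" + body + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")

print(bencode(2011))                    # b'i2011e'
print(bencode("Something"))             # b'9:Something'
print(bencode([1337, "DEF", "CON"]))    # b'li1337e3:DEF3:CONe'
print(bencode({"uname": "hpcbabu", "password": "default"}))
# b'd8:password7:default5:uname7:hpcbabue'
```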
    23. Contents of .torrent file
        • Piece length: usually 256 KB
        • Pieces: SHA-1 hashes of all pieces (one SHA-1 hash for each piece in the file), for reliability
        • Announce lists: list of the URLs of all trackers
        • The piece length and pieces information are fixed, while announce lists are dynamic
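To show how the per-piece SHA-1 hashes provide reliability, the sketch below splits a byte string into 256 KB pieces, concatenates the 20-byte digests, and checks a downloaded piece against them. The helper names `piece_hashes` and `verify_piece` are hypothetical and the data is made up for illustration.

```python
import hashlib

PIECE_LENGTH = 256 * 1024  # the "usually 256 KB" piece length from the slide
SHA1_SIZE = 20             # a SHA-1 digest is 20 bytes long

def piece_hashes(data: bytes, piece_length: int = PIECE_LENGTH) -> bytes:
    """Return the concatenated SHA-1 digests of every piece of `data`."""
    digests = []
    for offset in range(0, len(data), piece_length):
        piece = data[offset:offset + piece_length]
        digests.append(hashlib.sha1(piece).digest())
    return b"".join(digests)

def verify_piece(piece: bytes, index: int, pieces: bytes) -> bool:
    """Check a downloaded piece against the digest stored at `index`."""
    expected = pieces[index * SHA1_SIZE:(index + 1) * SHA1_SIZE]
    return hashlib.sha1(piece).digest() == expected

# Made-up 600 KB payload: two full pieces plus one shorter final piece.
payload = b"x" * (600 * 1024)
pieces = piece_hashes(payload)
```

A peer that receives a corrupted or forged piece sees a digest mismatch and can discard it and request the piece again from another peer.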
    24. [Figure: The big picture. Labels: Web Server, Bob, Tracker, Downloader: A, Seeder: B, Downloader: C, "Harry Potter.torrent"]
    25. Request and Response: Scrape
        • Scrape request, e.g.: http://example.com/scrape.php?info_hash=aaaaaaaaaaaaaaaaaaaa&info_hash=bbbbbbbbbbbbbbbbbbbb&info_hash=cccccccccccccccccccc
        • Scrape response, e.g.: d5:filesd20:....................d8:completei5e10:downloadedi50e10:incompletei10eeee
        • Meaning: 5 seeders, 10 leechers, and 50 complete downloads
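A scrape response like the one on the slide can be read with a small recursive decoder. This is a minimal sketch that assumes well-formed input; the `bdecode` helper and the placeholder info hash of twenty `a` bytes are assumptions made for the example.

```python
def bdecode(data: bytes, pos: int = 0):
    """Decode one bencoded value at `pos`; return (value, next position)."""
    lead = data[pos:pos + 1]
    if lead == b"i":                      # integer: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":                      # list: l<items>e
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":                      # dictionary: d<key><value>...e
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            value, pos = bdecode(data, pos)
            result[key] = value
        return result, pos + 1
    colon = data.index(b":", pos)         # byte string: <length>:<bytes>
    length = int(data[pos:colon])
    start = colon + 1
    return data[start:start + length], start + length

info_hash = b"a" * 20  # placeholder 20-byte info hash
response = (b"d5:filesd20:" + info_hash +
            b"d8:completei5e10:downloadedi50e10:incompletei10eeee")
decoded, _ = bdecode(response)
stats = decoded[b"files"][info_hash]
print(stats[b"complete"], stats[b"incomplete"], stats[b"downloaded"])  # 5 10 50
```

Here `complete` is the number of seeders, `incomplete` the number of leechers, and `downloaded` the number of completed downloads, matching the interpretation given on the slide.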
    26. Request and Response: Announce
        • Announce request, e.g.: http://some.tracker.com:999/announce?info_hash=12345678901234567890&peer_id=ABCDEFGHIJKLMNOPQRST&ip=255.255.255.255&port=6881&downloaded=0&uploaded=0&left=98765&event=started
        • Announce response: the tracker response is a bencoded dictionary that has two keys: interval and peers
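An announce URL of this shape can be assembled with the standard library. The sketch below is an illustration only: `build_announce_url` is a hypothetical helper, the optional `ip` parameter is omitted, and no request is actually sent. `urllib.parse.urlencode` percent-encodes raw bytes, which matters because a real info hash is 20 arbitrary bytes.

```python
from urllib.parse import urlencode

def build_announce_url(tracker, info_hash, peer_id, port,
                       uploaded, downloaded, left, event=None):
    """Build the GET URL for a tracker announce (hypothetical helper)."""
    params = {
        "info_hash": info_hash,   # 20 raw bytes, percent-encoded as needed
        "peer_id": peer_id,       # 20 bytes identifying this client
        "port": port,
        "uploaded": uploaded,
        "downloaded": downloaded,
        "left": left,             # bytes still missing
    }
    if event is not None:         # e.g. "started" on the first announce
        params["event"] = event
    return tracker + "?" + urlencode(params)

url = build_announce_url(
    "http://some.tracker.com:999/announce",
    b"12345678901234567890", b"ABCDEFGHIJKLMNOPQRST",
    port=6881, uploaded=0, downloaded=0, left=98765, event="started",
)
print(url)
```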
    27. Peer Wire Protocol (TCP)
        • Handles the exchange of pieces
        • The file is split into several pieces and sub-pieces, which are downloaded from different peers
        • Each client needs to maintain state information for each of its peers:
        • am_choking: this client is choking the peer
        • am_interested: this client is interested in the peer
        • peer_choking: the peer is choking this client
        • peer_interested: the peer is interested in this client
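The four per-peer flags map naturally onto a small record type. The sketch below is an assumption-laden illustration: the class name `PeerState` and the two helper methods are hypothetical, and the initial values reflect the BitTorrent convention that a connection starts out choked and not interested on both sides.

```python
from dataclasses import dataclass

@dataclass
class PeerState:
    """State kept by a client for one connected peer."""
    am_choking: bool = True        # this client is choking the peer
    am_interested: bool = False    # this client is interested in the peer
    peer_choking: bool = True      # the peer is choking this client
    peer_interested: bool = False  # the peer is interested in this client

    def can_download(self) -> bool:
        # We may request blocks only if we want something and are not choked.
        return self.am_interested and not self.peer_choking

    def can_upload(self) -> bool:
        # We serve blocks only to interested peers that we have unchoked.
        return self.peer_interested and not self.am_choking

state = PeerState()
state.am_interested = True   # we sent "interested"
state.peer_choking = False   # the peer sent "unchoke"
print(state.can_download(), state.can_upload())  # True False
```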
    28. Steps in PWP
        • Handshaking
        • Message communication
        • Pipelining
        • Piece selection strategy
        • Peer selection strategy
        • Choking and optimistic unchoking
        • Anti-snubbing
        • Upload-only mode
        • Endgame mode
    29. Messaging
        • Initial handshake message: <pstrlen><pstr><reserved><info_hash><peer_id>
        • A UDP ping request/response
        • All other messages are sent over TCP and are of the form: <length prefix><message ID><payload>
        • request: <len=0013><id=6><index><begin><length>
        • have: <len=0005><id=4><piece index>
        • choke: <len=0001><id=0>
        • bitfield: <len=0001+X><id=5><bitfield>
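The message layouts above can be packed with `struct`. This sketch assumes the conventional field sizes of the peer wire protocol (a 4-byte big-endian length prefix, a 1-byte message ID, 4-byte integer fields, and the protocol string `BitTorrent protocol`); the helper names are hypothetical.

```python
import struct

PSTR = b"BitTorrent protocol"  # protocol identifier used in the handshake

def handshake(info_hash: bytes, peer_id: bytes) -> bytes:
    """<pstrlen><pstr><reserved><info_hash><peer_id>, 68 bytes in total."""
    reserved = bytes(8)  # eight zero bytes
    return struct.pack(">B", len(PSTR)) + PSTR + reserved + info_hash + peer_id

def message(msg_id: int, payload: bytes = b"") -> bytes:
    """<length prefix><message ID><payload>; length counts ID plus payload."""
    return struct.pack(">IB", 1 + len(payload), msg_id) + payload

def choke() -> bytes:
    return message(0)

def have(piece_index: int) -> bytes:
    return message(4, struct.pack(">I", piece_index))

def request(index: int, begin: int, length: int) -> bytes:
    return message(6, struct.pack(">III", index, begin, length))

print(choke().hex())                 # 0000000100
print(have(7).hex())                 # 000000050400000007
print(request(0, 0, 16384).hex())    # 0000000d06000000000000000000004000
```

The length prefix of `request` is 13 because it covers one ID byte plus three 4-byte integers.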
    30. Pipelining
        • Keep several unfulfilled requests pending on each connection
        • Cuts down the round-trip delay between pieces
        • This scheme has been found to saturate most connections in practice
        • Extremely efficient over slow lines
        • Default: 5 pending requests
    31. Piece Selection
        • Critical for performance
        • If a bad algorithm is used, all the effort goes to waste
        • Until a piece is assembled, only download sub-pieces for that piece
        • This policy lets complete pieces assemble quickly
    32. Rarest Piece First
        • Policy: determine the pieces that are rarest among your peers and download those first
        • This ensures that the most common pieces are left until the end of the download
        • Rarest first also ensures that a large variety of pieces is downloaded from the seed
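The rarest-first policy can be sketched by counting how many peers advertise each piece and picking a missing piece with the lowest count. The data layout (a dict from peer name to a set of piece indices) and the function name `rarest_first` are assumptions for illustration; real clients track bitfields and break ties in their own ways.

```python
import random
from collections import Counter

def rarest_first(have, peer_pieces, rng=random):
    """Pick a missing piece that the fewest peers hold.

    have: set of piece indices this client already has
    peer_pieces: dict mapping peer -> set of piece indices it advertises
    Returns a piece index, or None if no peer offers a missing piece.
    """
    counts = Counter()
    for pieces in peer_pieces.values():
        counts.update(pieces)          # one count per peer holding the piece
    candidates = {p: n for p, n in counts.items() if p not in have}
    if not candidates:
        return None
    rarest = min(candidates.values())
    # Ties are broken at random so that peers do not all pick the same piece.
    return rng.choice(sorted(p for p, n in candidates.items() if n == rarest))

peers = {"p1": {0, 1, 2}, "p2": {0, 1}, "p3": {0}}
print(rarest_first(set(), peers))   # 2 (held by one peer only)
print(rarest_first({2}, peers))     # 1 (held by two peers)
```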
    33. Random First Piece
        • Initially, a peer has nothing to trade
        • It is important to get a complete piece as soon as possible
        • Rare pieces are typically available at fewer peers, so downloading a rare piece initially is not a good idea
        • Policy: select a random piece of the file and download it
    34. Endgame Mode
        • Policy: the last blocks generally trickle in slowly. To speed this up, send a request for all the missing blocks to every peer
        • Send a cancel message to all peers whenever a block arrives
        • This ensures that a download is not prevented from completing by a single peer with a slow transfer rate
        • Some bandwidth is wasted, but in practice this is not too much
    35. Choking
        • Choking is a temporary refusal to upload; downloading continues as normal
        • Tit-for-tat strategy
        • Peer A is said to choke peer B if it (A) decides not to upload to B
        • Each peer (say A) unchokes a certain number of peers at any time (default: 4)
        • The three with the largest upload rates to A: this is where the tit-for-tat comes in
        • Another randomly chosen peer (optimistic unchoke), to periodically look for better choices
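The unchoke decision (three best uploaders plus one random optimistic unchoke) can be sketched as follows. The function name `choose_unchoked`, the rate table and the peer names are hypothetical, and the periodic re-evaluation timers that a real client uses are left out.

```python
import random

def choose_unchoked(upload_rates, interested, slots=4, rng=random):
    """Select peers to unchoke: top uploaders plus one optimistic unchoke.

    upload_rates: dict mapping peer -> observed rate at which it uploads to us
    interested: iterable of peers that want data from us
    Returns (list of regular unchokes, optimistic unchoke or None).
    """
    ranked = sorted(interested,
                    key=lambda peer: upload_rates.get(peer, 0),
                    reverse=True)
    regular = ranked[:slots - 1]      # tit-for-tat: reward the best uploaders
    remaining = ranked[slots - 1:]
    # One slot goes to a random other peer, to discover better partners.
    optimistic = rng.choice(remaining) if remaining else None
    return regular, optimistic

rates = {"a": 50, "b": 40, "c": 30, "d": 20, "e": 10}
regular, optimistic = choose_unchoked(rates, list(rates))
print(regular)      # ['a', 'b', 'c']
print(optimistic)   # 'd' or 'e', chosen at random
```

A newly joined peer with nothing to trade has an upload rate of zero, so the optimistic slot is its only way to receive its first pieces.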
    36. Anti-snubbing
        • A peer is said to be snubbed if each of its peers chokes it
        • Download rates stay poor until the optimistic unchoke finds better peers
        • If no data has been downloaded from a peer for over a minute, assume you are snubbed by that peer
        • Do not upload to that peer except as an optimistic unchoke
        • Allowing more than one concurrent optimistic unchoke gives fast recovery
    37. Upload-Only Mode
        • Once its download is complete, a peer has no download rates to use for comparison, nor any need to use them
        • The question is: which nodes should it upload to?
        • Policy: upload to those with the best upload rate
        • This ensures that pieces get replicated faster
    38. Pros and cons of BitTorrent: Pros
        • Proficient in utilizing partially downloaded files
        • Discourages "freeloading" by rewarding the fastest uploaders
        • No infrastructure costs
        • Better resource utilization
        • Works well for "hot content"
    39. Pros and cons of BitTorrent: Cons
        • The long tail does not work
        • Even worse: no trackers for obscure content
        • Single point of failure: new nodes cannot enter the swarm if the tracker goes down
        • Lack of a search feature: users need to resort to out-of-band search (well-known torrent-hosting sites or plain old web search)
    40. Analysis
        • Random neighbor selection leads to high cross-traffic
        • ISP perspective: different links have different costs
        • P2P application perspective: no knowledge of the underlying ISP topology
        • No longer optimal if nodes should connect only to nodes in the same ISP
        • End result: throttling
    41. Challenges/Open questions
        • Network-friendly BitTorrent: ISPs inform BitTorrent of their link preferences
        • Biased neighbor selection
        • Rarest Piece First suffers
        • Move from TCP to UDP: take control of the Internet?
        • Legal complexity
    42. Summary
        • P2P CDNs can be cost-effective
        • They provide better resource utilization
        • Challenges: network congestion; network-cost-friendly protocols; handling copyright issues
    43. Thank You
