  1. Optimal Scheduling in Peer-to-Peer Networks
     Lee Center Workshop, 5/19/06
     Mortada Mehyar (with Prof. Steven Low, Netlab)
  2. Outline
     • Brief description of p2p file sharing and the Bittorrent protocol
     • Our model for Bittorrent-like file sharing
     • Efficiency of scheduling algorithms with respect to different optimality criteria
  3. About Bittorrent
     • A p2p protocol started ~2002
     • The most popular p2p system. It accounts for 35% of all Internet traffic (according to British Web analysis firm CacheLogic)
     • Warner Brothers to distribute films through Bittorrent (May 2006)
  4. Bittorrent Basics
     • Problem: a large file (~GB) and large demand (tens to thousands of clients, or more). It is not feasible to set up infrastructure for traditional client-server download.
     • Divide the file into small pieces (256KB).
     • Utilize all peers’ upload capacities.
     [Figure: a server and three clients]
  5. Bittorrent schematic
     [Figure labels: seed (peer with entire file), peers, new peer (with torrent file), tracker]
  6. Bittorrent Algorithms: Who to upload to?
     • Tit-for-tat: upload to the peers from which the most data was downloaded in the last 30 seconds (4 peers by default).
     • Therefore: peers have an incentive to upload in order to be chosen by other peers!
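The tit-for-tat rule on this slide can be illustrated with a minimal sketch. The function name `choose_upload_targets`, the dictionary of per-peer byte counts, and the tie-breaking by peer name are assumptions made for illustration; this is not the actual client code.

```python
from typing import Dict, List


def choose_upload_targets(downloaded_last_30s: Dict[str, int], slots: int = 4) -> List[str]:
    """Pick the peers we received the most data from in the last window.

    downloaded_last_30s maps a (hypothetical) peer id to the number of bytes
    downloaded from that peer during the last 30 seconds.
    """
    # Sort by bytes received, largest first; break ties by peer id so the
    # result is deterministic.
    ranked = sorted(downloaded_last_30s.items(), key=lambda kv: (-kv[1], kv[0]))
    return [peer for peer, _ in ranked[:slots]]


if __name__ == "__main__":
    received = {"a": 500, "b": 0, "c": 1200, "d": 300, "e": 800, "f": 100}
    print(choose_upload_targets(received))
```

A peer that uploads nothing (such as `b` here) is ranked last, which is the incentive the slide describes.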
  7. Bittorrent Algorithms: What piece to send?
     • Rarest-first: upload the piece that is rarest among your neighbors first
     [Figure: piece labels 1, 2, 1, 1, 2, 3]
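A minimal sketch of the rarest-first rule, assuming each neighbor's holdings are known as a set of piece indices. The function name `rarest_piece` and the tie-break by lowest piece index are illustrative assumptions, not part of the protocol description on the slide.

```python
from collections import Counter
from typing import List, Optional, Set


def rarest_piece(my_pieces: Set[int], neighbor_pieces: List[Set[int]]) -> Optional[int]:
    """Return the piece we hold that the fewest neighbors hold."""
    if not my_pieces:
        return None
    # Count how many neighbors hold each piece; missing pieces count as 0.
    counts = Counter(p for held in neighbor_pieces for p in held)
    # min() keeps the first minimum, so sorting gives a lowest-index tie-break.
    return min(sorted(my_pieces), key=lambda p: counts[p])


if __name__ == "__main__":
    # Piece 1 is held by three neighbors, piece 2 by two, piece 3 by none.
    print(rarest_piece({1, 2, 3}, [{1, 2}, {1}, {1, 2}]))
```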
  8. The ‘Broadcasting Model’
     M = 1, N = 7, all upload capacities are 1 piece per unit time
     [Figure: spread of piece 1 through the network at t = 0, 1, 2, 3]
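For the single-piece case (M = 1), the number of nodes holding the piece can at most double each time step, since every holder uploads one piece per unit time. The sketch below counts the steps under that assumption; the function name `broadcast_steps` is hypothetical.

```python
def broadcast_steps(n_peers: int) -> int:
    """Time steps needed for one piece to reach n_peers peers from one server.

    Assumes every node that holds the piece uploads it to one new node per
    unit time, so the number of holders doubles until everyone has it.
    """
    total_nodes = n_peers + 1  # the peers plus the server
    holders = 1                # only the server holds the piece at t = 0
    steps = 0
    while holders < total_nodes:
        holders = min(total_nodes, 2 * holders)
        steps += 1
    return steps


if __name__ == "__main__":
    # N = 7 as on the slide: holders go 1, 2, 4, 8, so the last peer finishes at t = 3.
    print(broadcast_steps(7))
```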
  9. Example: M = 2, N = 3
     [Figure: spread of pieces 1 and 2 through the network at t = 0 through t = 4; annotation: rarest first!]
  10. Equal capacities, general M, N
     • Theorem 1: There exists a schedule for a server to broadcast M messages to N nodes in M + log N time [Bar-Noy et al., 2000]
     • However, it is very difficult to extend the result to networks with different capacities
  11. ‘Uplink Sharing Model’
     • 1 server, N peers with possibly different capacities.
     • Suppose upload capacities are the only bottleneck.
     • Suppose M >> 1
     [Figure: file F]
  12. Optimal Last Finish Time
     • Theorem 2: the minimal time for all N peers to obtain a file F from a server (the optimal last finish time) is
       [equation not preserved in this transcript]
       where F is the file size and Cs, C1, …, CN are the upload capacities.
     • There always exists a schedule S0 such that the finish time vector is
       [equation not preserved in this transcript]
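The formula on this slide was an image and does not survive in the transcript. The proof slides refer to "two terms" that are both lower bounds. A plausible reconstruction, offered as an assumption rather than the slide's exact statement, takes those to be the two natural lower bounds of the uplink sharing model (the symbol T_L for the optimal last finish time is a label chosen here):

```latex
T_L \;=\; \max\left\{ \frac{F}{C_s},\; \frac{N F}{C_s + \sum_{i=1}^{N} C_i} \right\}
```

The first term says the server must upload at least one full copy of the file. The second says that N copies, N F in total, must be delivered using the combined upload capacity of the server and all peers. With all peer capacities equal to 0, the expression reduces to N F / C_s.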
  13. Example (Zero Peer Capacities)
     • Suppose all peers have 0 upload capacity, and consider the following two strategies for the server:
        • Divide capacity equally among peers
        • Upload to peers one by one
     The last finish time is the same, but the latter is clearly better: every peer except the last finishes earlier. In fact, the latter can be shown to be ‘average finish time’ optimal.
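The two strategies can be compared numerically with a small sketch. It assumes a file of size F, a server upload capacity Cs, and N peers that cannot relay data (zero upload capacity); the function names are hypothetical.

```python
from typing import List


def equal_split_finish_times(F: float, Cs: float, N: int) -> List[float]:
    """Server splits its capacity equally: each peer downloads at Cs / N."""
    return [N * F / Cs for _ in range(N)]


def one_by_one_finish_times(F: float, Cs: float, N: int) -> List[float]:
    """Server serves peers in turn at full rate: peer k finishes at k * F / Cs."""
    return [k * F / Cs for k in range(1, N + 1)]


def average(times: List[float]) -> float:
    return sum(times) / len(times)


if __name__ == "__main__":
    F, Cs, N = 1.0, 1.0, 3
    for name, times in [
        ("equal split", equal_split_finish_times(F, Cs, N)),
        ("one by one", one_by_one_finish_times(F, Cs, N)),
    ]:
        print(name, times, "last:", max(times), "average:", average(times))
```

Both strategies give the same last finish time N F / Cs, but the one-by-one average is (N + 1) F / (2 Cs), which is smaller than N F / Cs whenever N > 1.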
  14. Optimal Average Finish Time (N=3)
     [Figure: three panels, each showing finish times t1, t2, t3; axis: finish time]
  15. Conclusion and Ongoing Work
     • Simple model with rich structure for understanding the efficiency of p2p file sharing
     • It captures many issues Bittorrent addresses (e.g. favoring fast peers, rarest-first policy)
     • Many questions remain open:
        • understanding the fairness-efficiency tradeoff
        • other kinds of optimality criteria
  16. • Netlab’s other research projects
        • http://netlab.caltech.edu
     • More details about this work
        • [email_address]
  17. Thank You!
  18. Backup slides start here
  19. Another way to look at Ts
  20. Previous Bittorrent Modeling Work
     • Qiu & Srikant [Sigcomm’04]
        • Predator-prey-like fluid models
        • Assumes equal capacities among peers
        • Assumes rates of peer joins/leaves and studies equilibrium and stability
  21. Proof of Theorem 2
     First notice that the two terms have to be lower bounds on the optimal last finish time. So it remains to show that equality is achievable. Here’s a strategy for that:
     [strategy not preserved in this transcript]
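  22. Proof of Theorem 2
     When the server allocates to peer i: [equation not preserved in this transcript]
     Each peer therefore receives: [equation not preserved in this transcript]
  23. Proof of Theorem 2
     When the server allocates to peer i: [equation not preserved in this transcript]
     Each peer therefore receives: [equation not preserved in this transcript]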
  24. Bittorrent Basics
     • Torrent file:
        • Metadata about the file: filename, size, author, etc.
        • Hash info for each file piece to verify integrity
        • Link to centralized tracker
        • Published on the Web
     • Tracker:
        • Keeps track of the IPs of peers
        • ‘Bootstraps’ new peers
        • Centralized, but does not coordinate data transfer among peers
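The per-piece hash check listed under the torrent file can be sketched as follows. The sketch assumes 256KB pieces, as stated earlier in the deck, and SHA-1 digests, as in the original BitTorrent protocol; the function names and the in-memory list of hashes are illustrative and do not reproduce the actual torrent file encoding.

```python
import hashlib
from typing import List

PIECE_SIZE = 256 * 1024  # 256KB pieces, as on the "Bittorrent Basics" slide


def piece_hashes(data: bytes, piece_size: int = PIECE_SIZE) -> List[bytes]:
    """Split the file into fixed-size pieces and hash each one."""
    return [
        hashlib.sha1(data[offset:offset + piece_size]).digest()
        for offset in range(0, len(data), piece_size)
    ]


def verify_piece(index: int, piece: bytes, hashes: List[bytes]) -> bool:
    """Check a downloaded piece against the hash published for that index."""
    return hashlib.sha1(piece).digest() == hashes[index]


if __name__ == "__main__":
    data = bytes(600 * 1024)      # a 600KB dummy file, giving 3 pieces
    hashes = piece_hashes(data)   # what the publisher would put in the torrent file
    print(len(hashes), verify_piece(0, data[:PIECE_SIZE], hashes))
```

Because every piece is verified independently, a peer can safely accept pieces from untrusted peers and discard any piece that fails the check.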
  25. P2P systems
     • Napster (centralized directory)
     • Kazaa (semi-decentralized system with super peers)
     • Gnutella (decentralized; clients include Limewire and Bearshare)
     • Bittorrent (most popular and successful for distribution of large files)
  26. Another way to look at T L
  27. Non-Zero Peer Capacities
     If the peer capacities are not all 0, then the “upload one by one” strategy can be shown to result in these finish times:
     [equation not preserved in this transcript]
     However, this is not average finish time optimal!
  28. Comparing finish time vectors
     • Definition: a finish time vector v1 is strictly better than another finish time vector v2 if no component of v1 is larger than the corresponding component of v2, and some component of v1 is smaller
     • (2, 3, 3) is strictly better than (3, 3, 3)
     • (1, 2, 3) and (2, 2, 2) cannot be compared under this definition
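The definition translates directly into a componentwise check. The function name `strictly_better` is chosen here for illustration.

```python
from typing import Sequence


def strictly_better(v1: Sequence[float], v2: Sequence[float]) -> bool:
    """True if v1 is no larger than v2 in every component and smaller in at least one."""
    if len(v1) != len(v2):
        raise ValueError("finish time vectors must have the same length")
    no_component_larger = all(a <= b for a, b in zip(v1, v2))
    some_component_smaller = any(a < b for a, b in zip(v1, v2))
    return no_component_larger and some_component_smaller


if __name__ == "__main__":
    print(strictly_better((2, 3, 3), (3, 3, 3)))  # the comparable pair from the slide
    # Neither direction holds for the second pair, so the vectors are incomparable.
    print(strictly_better((1, 2, 3), (2, 2, 2)), strictly_better((2, 2, 2), (1, 2, 3)))
```

This is a partial order: two vectors can fail the test in both directions, which is why a scalar criterion such as last or average finish time is needed to rank schedules.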
  29. The ‘Broadcasting Model’
     • Assume a discrete-time, synchronous system where N nodes have equal upload capacity of 1 “message” per unit time
     • Objective: find a schedule such that every node receives all M messages in minimal time
  30. Assumptions reasonable for p2p
     • The size of a file piece (256KB for BT) is usually much smaller than the total size of the file (~GB). Namely, the number of pieces M >> 1.
     • Upload links are usually much slower than download links (e.g. DSL lines), so assume upload capacities are the only bottleneck.