Peer-to-Peer Networking


  • Upload your index to the central server when you come online. To search, consult the central server. Request the doc directly from the peer that holds it.
  • Everyone knows about some small number of nodes. To find a file, ask everyone you know. When you find out who has the doc, ask that node directly.
  • A more systematic approach: IDs for docs and nodes. Store each doc at the node with the closest ID. Keep track of a small number of nodes with IDs close to yours. Route requests toward the document.
  • Most projects address the same goal with slightly different models. Some specifics? Main goals: minimize search time and routing state.

    1. An Overview of Peer-to-Peer Networking. CPSC 441 (with thanks to Sami Rollins, UCSB)
    2. Outline
         • P2P Overview
             • What is a peer?
             • Example applications
             • Benefits of P2P
         • P2P Content Sharing
             • Challenges
             • Group management/data placement approaches
             • Measurement studies
    3. What is Peer-to-Peer (P2P)?
         • Napster?
         • Gnutella?
         • Kazaa?
         • Most people think of P2P as music sharing (but it can also be used for good purposes! :)
    4. What is a peer?
         • Contrast to Client-Server model
             • Servers are typically well-resourced, and centrally maintained and administered
             • Client has fewer resources than a server
         • P2P: nodes are “equals”
    5. What is a peer? (cont’d)
         • A peer’s resources are similar to those of the other participants
         • P2P peers communicate directly with each other and share resources
         • Typically at the application layer (ignorant of physical network topology)
    6. P2P Goals/Benefits
         • Cost sharing
         • Resource aggregation
         • Improved scalability/reliability
         • Increased autonomy
         • Anonymity/privacy
         • Ad-hoc communication
    7. P2P Application Taxonomy
         • P2P Systems
             • Distributed Computing: [email_address]
             • File Sharing: Gnutella
             • Collaboration: Jabber
             • Platforms: JXTA
    8. P2P File Sharing Approaches
         • Centralized
         • Flooding
         • Document Routing
    9. Centralized
         • Napster model
         • Benefits:
             • Efficient search
             • Limited bandwidth usage
             • No per-node state
         • Drawbacks:
             • Central point of failure
             • Limited scale
             • Copyright/legal issues
         • (Diagram: peers Bob, Alice, Jane, Judy)
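The centralized model can be sketched in a few lines: peers upload their file lists to one index, searches go to that index, and the transfer itself happens peer to peer. This is only an illustration; the `CentralIndex` class, its method names, and the file names are hypothetical and not taken from Napster.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical central index server: maps document names to peers."""

    def __init__(self):
        self._holders = defaultdict(set)

    def register(self, peer, docs):
        # A peer uploads its index when it comes online.
        for doc in docs:
            self._holders[doc].add(peer)

    def unregister(self, peer):
        # Remove a peer that has gone offline.
        for holders in self._holders.values():
            holders.discard(peer)

    def search(self, doc):
        # One lookup at the server; the download then goes peer to peer.
        return sorted(self._holders.get(doc, ()))


index = CentralIndex()
index.register("Alice", ["song.mp3", "notes.txt"])  # made-up file names
index.register("Judy", ["song.mp3"])
print(index.search("song.mp3"))  # ['Alice', 'Judy']
```

Peers keep no routing state of their own, which matches the "no per-node state" benefit, while the single `CentralIndex` object is exactly the central point of failure listed under drawbacks.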
    10. Flooding
         • Gnutella model
         • Benefits:
             • No central point of failure
             • Limited per-node state
         • Drawbacks:
             • Slow searches
             • Bandwidth intensive
         • (Diagram: peers Bob, Alice, Jane, Judy, Carl)
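A minimal sketch of flooding search, assuming a TTL-limited query that each peer forwards to all of its neighbors. The peer names come from the slide's diagram, but the links between them, the file name, and the `flood_search` function are assumptions made for illustration; this is not the Gnutella wire protocol.

```python
from collections import deque


def flood_search(neighbors, files, start, wanted, ttl):
    """Return (holders, messages_sent) for a TTL-limited flood from `start`."""
    seen = {start}
    queue = deque([(start, ttl)])
    holders, messages = [], 0
    while queue:
        node, hops_left = queue.popleft()
        if wanted in files.get(node, ()):
            holders.append(node)
        if hops_left == 0:
            continue  # TTL exhausted: do not forward any further
        for peer in neighbors.get(node, ()):
            messages += 1  # every forwarded query costs bandwidth
            if peer not in seen:  # a peer processes each query only once
                seen.add(peer)
                queue.append((peer, hops_left - 1))
    return holders, messages


# Assumed overlay links (the slide does not give the topology).
neighbors = {
    "Bob": ["Alice", "Jane"],
    "Alice": ["Bob", "Judy"],
    "Jane": ["Bob", "Judy"],
    "Judy": ["Alice", "Jane", "Carl"],
    "Carl": ["Judy"],
}
files = {"Carl": {"doc.txt"}}

print(flood_search(neighbors, files, "Bob", "doc.txt", ttl=2))  # ([], 6)
print(flood_search(neighbors, files, "Bob", "doc.txt", ttl=3))  # (['Carl'], 9)
```

With a TTL of 2 the query never reaches Carl, and raising the TTL finds the file only at the cost of more messages, which illustrates both drawbacks on the slide.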
    11. Document Routing
         • FreeNet, Chord, CAN, Tapestry, Pastry model
         • Benefits:
             • More efficient searching
             • Limited per-node state
         • Drawbacks:
             • Limited fault-tolerance vs redundancy
         • (Diagram: nodes with IDs 001, 012, 212, 305, 332; queries labeled “212 ?”)
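The document-routing idea (give docs and nodes IDs, keep a little state about nodes with nearby IDs, and forward each request toward the document's ID) can be sketched as greedy routing. Everything specific here is an assumption: the slide's node labels are read as plain integers, "closest" is taken to mean smallest numeric difference, and each node is assumed to know only its immediate neighbors in ID order. Real systems such as Chord or Pastry use different metrics and routing tables.

```python
def closest(ids, target):
    """Pick the id nearest to target; ties go to the smaller id."""
    return min(ids, key=lambda n: (abs(n - target), n))


def route(known, start, doc_id):
    """Greedy routing: hop to the known node closest to doc_id until none is closer."""
    path = [start]
    current = start
    while True:
        nxt = closest(known[current] | {current}, doc_id)
        if nxt == current:
            return path  # current node is the best match it knows of
        current = nxt
        path.append(current)


# Node labels from the slide, read as integers (001 -> 1, 012 -> 12).
node_ids = [1, 12, 212, 305, 332]

# Assumed per-node state: only the immediate neighbors in id order.
known = {}
for i, n in enumerate(node_ids):
    known[n] = set(node_ids[max(0, i - 1):i] + node_ids[i + 1:i + 2])

print(route(known, 332, 212))  # [332, 305, 212]
```

Each node stores at most two neighbor IDs in this toy version, so per-node state stays small, but a request may need many hops; the cost table later in the deck shows how real designs trade state against search time.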
    12. Current Research
         • Peer discovery, group management, data location and placement
             • Chord, CAN, Tapestry, Pastry
         • Security, privacy, anonymity, trust
             • Publius, FreeNet
         • Reliable, efficient file exchange
         • Performance studies
             • Gnutella measurement study
    13. Management/Placement Challenges
         • Per-node state
         • Bandwidth usage
         • Search time
         • Fault tolerance/resiliency
    14. Document Routing – Chord
         • MIT project
         • Uni-dimensional ID space
         • Keep track of log N nodes
         • Search through log N nodes to find desired key
         • (Diagram: ring of nodes N5, N10, N20, N32, N60, N80, N99, N110, with key K19)
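A simplified, in-memory sketch of a Chord-style lookup over the node IDs shown in the slide's diagram. The 7-bit identifier space, the helper names, and the use of a global sorted node list to build finger tables are assumptions for illustration; a real Chord node learns its fingers through the join and stabilization protocol rather than from global knowledge.

```python
import bisect

M = 7                      # assumed identifier bits: ids live in [0, 128)
RING = 2 ** M
NODES = sorted([5, 10, 20, 32, 60, 80, 99, 110])  # node ids from the slide


def successor(ident):
    """First node at or clockwise after ident on the ring."""
    i = bisect.bisect_left(NODES, ident % RING)
    return NODES[i % len(NODES)]


def fingers(n):
    """Finger i of node n points at successor(n + 2**i)."""
    return [successor(n + 2 ** i) for i in range(M)]


def in_open(x, a, b):
    """True if x lies strictly inside the clockwise interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b


def lookup(start, key):
    """Return (node responsible for key, list of nodes visited)."""
    n, path = start, [start]
    while True:
        succ = successor(n + 1)
        # If key falls in (n, succ], then succ is responsible for it.
        if key == succ or in_open(key, n, succ):
            return succ, path
        # Otherwise jump to the farthest finger that still precedes the key.
        for f in reversed(fingers(n)):
            if in_open(f, n, key):
                n = f
                break
        path.append(n)


print(lookup(80, 19))  # (20, [80, 5, 10])
```

Starting at N80, the query for K19 visits N5 and N10 before N10 reports that its successor N20 holds the key. Each finger jump covers a large part of the remaining distance on the ring, which is where the logarithmic search bound on the slide comes from.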
    15. Cost Comparisons

        System     Model               Search        State
        Chord      Uni-dimensional     log N         log N
        CAN        Multi-dimensional   d N^(1/d)     2d
        Tapestry   Global Mesh         log_b N       b log_b N
        Pastry     Neighbor map        log_b N       b log_b N + b
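To get a feel for the asymptotic costs in the table, the formulas can be evaluated for a concrete network size. The values below are assumptions chosen for illustration (N = 4096 nodes, base b = 4, d = 2 dimensions, and base-2 logarithm for Chord); the results ignore constant factors and are not measurements.

```python
import math


def costs(n, b=4, d=2):
    """Evaluate the table's search/state formulas for n nodes."""
    log2 = math.log2(n)      # assumed base 2 for Chord's "log N"
    logb = math.log(n, b)
    return {
        "Chord": {"search": log2, "state": log2},
        "CAN": {"search": d * n ** (1 / d), "state": 2 * d},
        "Tapestry": {"search": logb, "state": b * logb},
        "Pastry": {"search": logb, "state": b * logb + b},
    }


for system, c in costs(4096).items():
    print(f"{system:<9} search={c['search']:.1f} state={c['state']:.1f}")
```

With these inputs CAN keeps the least state but has by far the largest search cost, while the other three trade a logarithmic number of routing entries for logarithmic search.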
    16. Remaining Problems?
         • Hard to handle highly dynamic environments
         • Usable services
         • Methods don’t consider peer characteristics
    17. Measurement Studies
         • “Free riding” on Gnutella
         • Most studies focus on Gnutella
         • Want to determine how users behave
         • Low success rates for transfers (30%?)
         • Recommendations for the best way to design systems
    18. Free Riding Results
         • Who is sharing what?
         • August 2000

        The top              Share        As percent of whole
        333 hosts (1%)       1,142,645    37%
        1,667 hosts (5%)     2,182,087    70%
        3,334 hosts (10%)    2,692,082    87%
        5,000 hosts (15%)    2,928,905    94%
        6,667 hosts (20%)    3,037,232    98%
        8,333 hosts (25%)    3,082,572    99%
    19. Saroiu et al. Study
         • May 2001
         • Napster crawl
             • Query the index server and keep track of results
             • Query about returned peers
             • Doesn’t capture users sharing unpopular content
         • Gnutella crawl
             • Send out ping messages with large TTL
    20. Results Overview
         • Lots of heterogeneity between peers
             • Systems should consider peer capabilities
         • Peers don’t always tell the truth!
             • Systems must be able to verify reported peer capabilities or measure true capabilities
    21. Reported Bandwidth
    22. Measured Bandwidth
    23. Measured Latency
    24. Connectivity
    25. Conclusion
         • P2P is an interesting and useful model
         • Soon will be the dominant part of Internet traffic volume (if it isn’t already!!)
         • There are lots of technical challenges to be solved (scalability, security, caching, …)