0
An Overview of Peer-to-Peer         Sami Rollins           11/14/02
Outline• P2P Overview  – What is a peer?  – Example applications  – Benefits of P2P• P2P Content Sharing  – Challenges  – ...
What is Peer-to-Peer (P2P)?• Napster?• Gnutella?• Most people think of P2P as music sharing
What is a peer?• Contrasted with  Client-Server model• Servers are centrally  maintained and  administered• Client has few...
What is a peer?• A peer’s resources are  similar to the  resources of the other  participants• P2P – peers  communicating ...
Levels of P2P-ness• P2P as a mindset  – Slashdot• P2P as a model  – Gnutella• P2P as an implementation choice  – Applicati...
P2P Application Taxonomy                          P2P SystemsDistributed Computing File Sharing   Collaboration   Platform...
P2P Goals/Benefits•   Cost sharing•   Resource aggregation•   Improved scalability/reliability•   Increased autonomy•   An...
P2P File Sharing• Content exchange  – Gnutella• File systems  – Oceanstore• Filtering/mining  – Opencola
P2P File Sharing Benefits•   Cost sharing•   Resource aggregation•   Improved scalability/reliability•   Anonymity/privacy...
Research Areas•   Peer discovery and group management•   Data location and placement•   Reliable and efficient file exchan...
Current Research• Group management and data placement  – Chord, CAN, Tapestry, Pastry• Anonymity  – Publius• Performance s...
Management/Placement Challenges•   Per-node state•   Bandwidth usage•   Search time•   Fault tolerance/resiliency
Approaches• Centralized• Flooding• Document Routing
Centralized                               Bob    Alice• Napster model• Benefits:  – Efficient search  – Limited bandwidth ...
Flooding                                  Carl         Jane• Gnutella model• Benefits:  – No central point of failure  – L...
Document Routing                                   001                 012• FreeNet, Chord, CAN,  Tapestry, Pastry model  ...
Document Routing – CAN• Associate to each node and item a unique id in an  d-dimensional space• Goals   – Scales to hundre...
CAN Example: Two              Dimensional Space• Space divided between nodes                                  7• All nodes...
CAN Example: Two            Dimensional Space• Node n2:(4, 2) joins  space                                  7  is divided...
CAN Example: Two            Dimensional Space• Node n2:(4, 2) joins  space                                  7  is divided...
CAN Example: Two            Dimensional Space• Nodes n4:(5, 5) and n5:(6,6)                                  7  join      ...
CAN Example: Two             Dimensional Space• Nodes: n1:(1, 2); n2:(4,2); n3:                                  7  (3, 5)...
CAN Example: Two            Dimensional Space• Each item is stored by the                                 7  node who owns...
CAN: Query Example• Each node knows its  neighbors in the d-space          7• Forward query to the              6         ...
CAN: Query Example• Each node knows its  neighbors in the d-space          7• Forward query to the              6         ...
CAN: Query Example• Each node knows its  neighbors in the d-space          7• Forward query to the              6         ...
CAN: Query Example• Each node knows its  neighbors in the d-space          7• Forward query to the              6         ...
Node Failure Recovery• Simple failures  – know your neighbor’s neighbors  – when a node fails, one of its neighbors takes ...
Document Routing – Chord                                  N5                                        N10                   ...
Doc Routing – Tapestry/Pastry                                   43F                                                 993   ...
Comparing Guarantees           Model         Search   StateChord         Uni-       log N    log N           dimensional  ...
Remaining Problems?• Hard to handle highly dynamic  environments• Usable services• Methods don’t consider peer characteris...
Measurement Studies•   “Free Riding on Gnutella”•   Most studies focus on Gnutella•   Want to determine how users behave• ...
Free Riding Results• Who is sharing what?• August 2000   The top             Share       As percent of whole   333 hosts (...
Saroiu et al Study• How many peers are server-like…client-  like?  – Bandwidth, latency• Connectivity• Who is sharing what?
Saroiu et al Study• May 2001• Napster crawl  – query index server and keep track of results  – query about returned peers ...
Results Overview• Lots of heterogeneity between peers  – Systems should consider peer capabilities• Peers lie  – Systems m...
Measured Bandwidth
Reported Bandwidth
Measured Latency
Measured Uptime
Number of Shared Files
Connectivity
Points of Discussion• Is it all hype?• Should P2P be a research area?• Do P2P applications/systems have common  research q...
Conclusion• P2P is an interesting and useful model• There are lots of technical challenges to be  solved
Upcoming SlideShare
Loading in...5
×

Peer to-peer

891

Published on

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
891
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
48
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide
  • Upload index to central server when you come online To search, consult central server Request doc directly
  • Everyone knows about some small number of nodes To find a file, ask everyone you know When you find out who has the doc, ask directly
  • More systematic approach Ids for docs and nodes Store doc at node with closest id Keep track of small number of nodes with ids close to yours Route requests toward the document
  • Most projects address the same goal Slightly different models Some specifics? Main goals, minimize search time and routing state
  • Transcript of "Peer to-peer"

    1. 1. An Overview of Peer-to-Peer Sami Rollins 11/14/02
    2. 2. Outline• P2P Overview – What is a peer? – Example applications – Benefits of P2P• P2P Content Sharing – Challenges – Group management/data placement approaches – Measurement studies
    3. 3. What is Peer-to-Peer (P2P)?• Napster?• Gnutella?• Most people think of P2P as music sharing
    4. 4. What is a peer?• Contrasted with Client-Server model• Servers are centrally maintained and administered• Client has fewer resources than a server
    5. 5. What is a peer?• A peer’s resources are similar to the resources of the other participants• P2P – peers communicating directly with other peers and sharing resources
    6. 6. Levels of P2P-ness• P2P as a mindset – Slashdot• P2P as a model – Gnutella• P2P as an implementation choice – Application-layer multicast• P2P as an inherent property – Ad-hoc networks
    7. 7. P2P Application Taxonomy P2P SystemsDistributed Computing File Sharing Collaboration Platforms SETI@home Gnutella Jabber JXTA
    8. 8. P2P Goals/Benefits• Cost sharing• Resource aggregation• Improved scalability/reliability• Increased autonomy• Anonymity/privacy• Dynamism• Ad-hoc communication
    9. 9. P2P File Sharing• Content exchange – Gnutella• File systems – Oceanstore• Filtering/mining – Opencola
    10. 10. P2P File Sharing Benefits• Cost sharing• Resource aggregation• Improved scalability/reliability• Anonymity/privacy• Dynamism
    11. 11. Research Areas• Peer discovery and group management• Data location and placement• Reliable and efficient file exchange• Security/privacy/anonymity/trust
    12. 12. Current Research• Group management and data placement – Chord, CAN, Tapestry, Pastry• Anonymity – Publius• Performance studies – Gnutella measurement study
    13. 13. Management/Placement Challenges• Per-node state• Bandwidth usage• Search time• Fault tolerance/resiliency
    14. 14. Approaches• Centralized• Flooding• Document Routing
    15. 15. Centralized Bob Alice• Napster model• Benefits: – Efficient search – Limited bandwidth usage – No per-node state• Drawbacks: – Central point of failure Judy Jane – Limited scale
    16. 16. Flooding Carl Jane• Gnutella model• Benefits: – No central point of failure – Limited per-node state• Drawbacks: Bob – Slow searches – Bandwidth intensive Alice Judy
    17. 17. Document Routing 001 012• FreeNet, Chord, CAN, Tapestry, Pastry model 212 ? 212 ?• Benefits: 332 – More efficient searching 212 – Limited per-node state 305• Drawbacks: – Limited fault-tolerance vs redundancy
    18. 18. Document Routing – CAN• Associate to each node and item a unique id in an d-dimensional space• Goals – Scales to hundreds of thousands of nodes – Handles rapid arrival and failure of nodes• Properties – Routing table size O(d) – Guarantees that a file is found in at most d*n1/d steps, where n is the total number of nodes Slide modified from another presentation
    19. 19. CAN Example: Two Dimensional Space• Space divided between nodes 7• All nodes cover the entire 6 space 5• Each node covers either a 4 square or a rectangular area of ratios 1:2 or 2:1 3 n1• Example: 2 – Node n1:(1, 2) first node that 1 joins  cover the entire space 0 0 1 2 3 4 5 6 7 Slide modified from another presentation
    20. 20. CAN Example: Two Dimensional Space• Node n2:(4, 2) joins  space 7 is divided between n1 and n2 6 5 4 3 n1 n2 2 1 0 0 1 2 3 4 5 6 7 Slide modified from another presentation
    21. 21. CAN Example: Two Dimensional Space• Node n2:(4, 2) joins  space 7 is divided between n1 and n2 6 n3 5 4 3 n1 n2 2 1 0 0 1 2 3 4 5 6 7 Slide modified from another presentation
    22. 22. CAN Example: Two Dimensional Space• Nodes n4:(5, 5) and n5:(6,6) 7 join 6 n5 n3 n4 5 4 3 n1 n2 2 1 0 0 1 2 3 4 5 6 7 Slide modified from another presentation
    23. 23. CAN Example: Two Dimensional Space• Nodes: n1:(1, 2); n2:(4,2); n3: 7 (3, 5); n4:(5,5);n5:(6,6) 6 n5• Items: f1:(2,3); f2:(5,1); f3: n3 n4 5 f4 (2,1); f4:(7,5); 4 f1 3 n1 n2 2 f3 1 0 f2 0 1 2 3 4 5 6 7 Slide modified from another presentation
    24. 24. CAN Example: Two Dimensional Space• Each item is stored by the 7 node who owns its mapping in the space 6 n3 n4 n5 5 f4 4 f1 3 n1 n2 2 f3 1 0 f2 0 1 2 3 4 5 6 7 Slide modified from another presentation
    25. 25. CAN: Query Example• Each node knows its neighbors in the d-space 7• Forward query to the 6 n5 n4 neighbor that is closest to the 5 n3 f4 query id 4• Example: assume n1 queries 3 f1 f4 n1 n2 2• Can route around some f3 1 failures f2 0 – some failures require local 0 1 2 3 4 5 6 7 flooding Slide modified from another presentation
    26. 26. CAN: Query Example• Each node knows its neighbors in the d-space 7• Forward query to the 6 n5 n4 neighbor that is closest to the 5 n3 f4 query id 4• Example: assume n1 queries 3 f1 f4 n1 n2 2• Can route around some f3 1 failures f2 0 – some failures require local 0 1 2 3 4 5 6 7 flooding Slide modified from another presentation
    27. 27. CAN: Query Example• Each node knows its neighbors in the d-space 7• Forward query to the 6 n5 n4 neighbor that is closest to the 5 n3 f4 query id 4• Example: assume n1 queries 3 f1 f4 n1 n2 2• Can route around some f3 1 failures f2 0 – some failures require local 0 1 2 3 4 5 6 7 flooding Slide modified from another presentation
    28. 28. CAN: Query Example• Each node knows its neighbors in the d-space 7• Forward query to the 6 n5 n4 neighbor that is closest to the 5 n3 f4 query id 4• Example: assume n1 queries 3 f1 f4 n1 n2 2• Can route around some f3 1 failures f2 0 – some failures require local 0 1 2 3 4 5 6 7 flooding Slide modified from another presentation
    29. 29. Node Failure Recovery• Simple failures – know your neighbor’s neighbors – when a node fails, one of its neighbors takes over its zone• More complex failure modes – simultaneous failure of multiple adjacent nodes – scoped flooding to discover neighbors – hopefully, a rare event Slide modified from another presentation
    30. 30. Document Routing – Chord N5 N10 N110 K19• MIT project N20• Uni-dimensional ID N99 space N32• Keep track of log N nodes N80• Search through log N nodes to find desired key N60
    31. 31. Doc Routing – Tapestry/Pastry 43F 993 13F E E E• Global mesh• Suffix-based routing 73F F99• Uses underlying network E 0 distance in constructing 04F E mesh 999 ABF 0 E 239 E 129 0
    32. 32. Comparing Guarantees Model Search StateChord Uni- log N log N dimensional Multi-CAN dN1/d 2d dimensionalTapestry Global Mesh logbN b logbNPastry Neighbor logbN b logbN + b map
    33. 33. Remaining Problems?• Hard to handle highly dynamic environments• Usable services• Methods don’t consider peer characteristics
    34. 34. Measurement Studies• “Free Riding on Gnutella”• Most studies focus on Gnutella• Want to determine how users behave• Recommendations for the best way to design systems
    35. 35. Free Riding Results• Who is sharing what?• August 2000 The top Share As percent of whole 333 hosts (1%) 1,142,645 37% 1,667 hosts (5%) 2,182,087 70% 3,334 hosts (10%) 2,692,082 87% 5,000 hosts (15%) 2,928,905 94% 6,667 hosts (20%) 3,037,232 98% 8,333 hosts (25%) 3,082,572 99%
    36. 36. Saroiu et al Study• How many peers are server-like…client- like? – Bandwidth, latency• Connectivity• Who is sharing what?
    37. 37. Saroiu et al Study• May 2001• Napster crawl – query index server and keep track of results – query about returned peers – don’t capture users sharing unpopular content• Gnutella crawl – send out ping messages with large TTL
    38. 38. Results Overview• Lots of heterogeneity between peers – Systems should consider peer capabilities• Peers lie – Systems must be able to verify reported peer capabilities or measure true capabilities
    39. 39. Measured Bandwidth
    40. 40. Reported Bandwidth
    41. 41. Measured Latency
    42. 42. Measured Uptime
    43. 43. Number of Shared Files
    44. 44. Connectivity
    45. 45. Points of Discussion• Is it all hype?• Should P2P be a research area?• Do P2P applications/systems have common research questions?• What are the “killer apps” for P2P systems?
    46. 46. Conclusion• P2P is an interesting and useful model• There are lots of technical challenges to be solved
    1. A particular slide catching your eye?

      Clipping is a handy way to collect important slides you want to go back to later.

    ×