

  1. Implicit group messaging in peer-to-peer networks. Daniel Cutting, 28th April 2006, Advanced Networks Research Group
  2. Outline. <ul><li>Motivation and problem </li></ul><ul><li>Implicit groups </li></ul><ul><li>Implicit group messaging (IGM) </li></ul><ul><li>P2P model </li></ul><ul><li>Evaluation </li></ul>
  3. Motivation. <ul><li>It’s now very easy to publish content on the Internet: blogs, podcasts, forums, iPhoto “photocasting”, … </li></ul><ul><li>More and more publishers of niche content </li></ul><ul><li>Social websites like Flickr, YouTube, MySpace, etc. are gateways for connecting publishers and consumers </li></ul><ul><li>A similar capability would also be desirable in P2P </li></ul><ul><ul><li>Collaboration and sharing without a central authority </li></ul></ul><ul><ul><li>No reliance on dedicated infrastructure </li></ul></ul><ul><ul><li>No upfront costs or requirements </li></ul></ul>
  4. Problem. <ul><li>As more new niches are created, consumers need to search/filter more to find and collate varied content </li></ul><ul><li>How can we connect many publishers and consumers? </li></ul><ul><li>The publisher already knows the intended audience </li></ul><ul><ul><li>Can often describe the audience in terms of interests </li></ul></ul><ul><ul><li>Does not know the names of individual audience members </li></ul></ul><ul><ul><li>So, address them as an implicit group </li></ul></ul>
  5. Implicit groups. <ul><li>Explicit groups </li></ul><ul><ul><li>Members named </li></ul></ul><ul><ul><li>Pre-defined by publisher, or consumers need to join </li></ul></ul><ul><ul><li>Wolfgang, Julie </li></ul></ul><ul><li>Implicit groups </li></ul><ul><ul><li>Members described </li></ul></ul><ul><ul><li>Publisher defines “on the fly”, consumers don’t need to join </li></ul></ul><ul><ul><li>Soccer & Brazil </li></ul></ul>
  6. Implicit group messaging. <ul><li>CAST messages from any source to any implicit group at any time in a P2P network </li></ul><ul><ul><li>Each peer is described by attributes (capabilities, interests, services, …), e.g. “Soccer”, “Brazil” </li></ul></ul><ul><ul><li>Implicit groups are specified as logical expressions of attributes, e.g. “(Soccer OR Football) AND Brazil” </li></ul></ul><ul><ul><li>The system delivers messages from sources to all peers matching the target expressions </li></ul></ul>
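The matching rule above can be sketched in a few lines. A tiny nested-tuple representation of the expression is assumed here for illustration (the deck does not specify a grammar): a peer belongs to the group when its attribute set satisfies the expression.

```python
def matches(expr, attrs):
    """Evaluate a target expression against a peer's attribute set.

    `expr` is either a plain attribute string or a nested tuple
    ("AND", a, b) / ("OR", a, b) -- an assumed representation.
    """
    if isinstance(expr, str):
        return expr in attrs
    op, left, right = expr
    if op == "AND":
        return matches(left, attrs) and matches(right, attrs)
    if op == "OR":
        return matches(left, attrs) or matches(right, attrs)
    raise ValueError(f"unknown operator: {op}")

# "(Soccer OR Football) AND Brazil"
target = ("AND", ("OR", "Soccer", "Football"), "Brazil")
print(matches(target, {"Soccer", "Brazil"}))  # True
print(matches(target, {"Football"}))          # False
```

Note that the peer never joins anything: membership is decided at delivery time by evaluating the expression against its current attributes.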
  7. P2P model. <ul><li>A fully distributed, structured overlay network </li></ul><ul><ul><li>Peers maintain a logical Cartesian surface (like CAN) </li></ul></ul><ul><ul><li>Each peer owns part of the surface and knows its neighbours </li></ul></ul><ul><ul><li>Peers store data hashed to their part of the surface </li></ul></ul><ul><ul><li>Peers geometrically ROUTE to locations by passing from neighbour to neighbour </li></ul></ul><ul><ul><li>Quadtree-based surface addressing </li></ul></ul><ul><li>Smoothly combines two major techniques for efficient CAST delivery to groups of any size </li></ul>
  8. P2P model. <ul><li>Attribute partitioning: an “attribute → peer” index for small groups </li></ul><ul><li>Summary hashing: for reaching BIG groups </li></ul><ul><li>Hybrid CAST algorithm: a reactive multicast algorithm combining the above </li></ul>
  9. Quadtree-based addressing. <ul><li>Surfaces can be of any dimensionality d </li></ul><ul><li>An address is a string of digits of base 2^d </li></ul><ul><li>Map from an address to the surface using a quadtree decomposition </li></ul><ul><li>Quadrants are called extents </li></ul>
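The decomposition can be sketched for d = 2 (base-4 digits): each successive digit names one quadrant of the previous extent, so longer addresses denote smaller extents. The digit-to-quadrant convention below is an assumption for illustration; the deck does not fix one.

```python
def extent(address, d=2):
    """Return (origin, size) of the extent named by `address` on the
    unit surface. Each base-2^d digit selects one of 2^d sub-quadrants;
    bit k of the digit picks the upper half of dimension k (an assumed
    convention).
    """
    base = 2 ** d
    origin = [0.0] * d
    size = 1.0
    for digit in address:
        q = int(digit, base)
        size /= 2.0                    # each digit halves every dimension
        for dim in range(d):
            if (q >> dim) & 1:
                origin[dim] += size
    return tuple(origin), size

print(extent("122"))  # ((0.5, 0.375), 0.125): deeper addresses, smaller extents
```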
  10. Attribute partitioning. <ul><li>A distributed index from each attribute to all peers </li></ul><ul><li>Indices are stored at rendezvous points (RPs) on the surface by hashing the attribute to an address </li></ul>
  11. Attribute partitioning (registration). <ul><li>Every peer registers at each of its attributes’ RPs </li></ul><ul><li>Every registration includes the peer’s IP address and all its attributes </li></ul>
  12. Attribute partitioning (CASTing). <ul><li>To CAST, select one term from the target </li></ul><ul><li>Route the CAST to its RP </li></ul><ul><li>The RP finds all matches and unicasts to each </li></ul>
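The registration and CAST steps on slides 11–12 can be sketched with an in-memory dict standing in for the distributed RPs. The hash-to-address scheme (`rp_address`) is an illustrative assumption, not the paper's scheme.

```python
import hashlib

def rp_address(attribute, digits=3, base=4):
    """Hash an attribute to a quadtree address: its rendezvous point.
    (Toy scheme; the real system hashes onto the Cartesian surface.)"""
    h = int(hashlib.sha1(attribute.encode()).hexdigest(), 16)
    addr = ""
    for _ in range(digits):
        addr += str(h % base)
        h //= base
    return addr

# RP address -> list of (peer, all attributes) registrations
index = {}

def register(peer, attrs):
    """A peer registers at the RP of each of its attributes,
    including its full attribute set in every registration."""
    for a in attrs:
        index.setdefault(rp_address(a), []).append((peer, attrs))

def cast(term, predicate):
    """Route to the RP of one term of the target, then unicast to every
    registered peer whose full attribute set matches the whole target."""
    return [peer for peer, attrs in index.get(rp_address(term), [])
            if predicate(attrs)]

register("Wolfgang", {"Soccer", "Brazil"})
register("Kim", {"Brazil"})
print(cast("Brazil", lambda a: "Soccer" in a and "Brazil" in a))  # ['Wolfgang']
```

Because every registration carries the peer's full attribute set, the RP for one term can evaluate the whole expression locally before unicasting.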
  13. Attribute partitioning. <ul><li>Simple, works well for small groups and rare attributes </li></ul><ul><ul><li>Fast: just one overlay route followed by unicasts </li></ul></ul><ul><ul><li>Fair: each peer is responsible for a similar number of attributes </li></ul></ul><ul><li>BUT a common attribute → lots of registrations at one RP </li></ul><ul><ul><li>Heavy registration load on some unlucky peers </li></ul></ul><ul><li>ALSO big groups → many identical unicasts required </li></ul><ul><ul><li>Heavy link stress around RPs </li></ul></ul><ul><li>SO, in these cases share the load with your peers! </li></ul>
  14. Summary hashing. <ul><li>Spreads registration and delivery load over many peers </li></ul><ul><li>In addition to attribute registrations, each peer stores a back-pointer and a summary of its attributes at one other location on the surface </li></ul><ul><li>The location of a summary encodes its attributes </li></ul><ul><li>Given a target expression, any peer can calculate all possible locations of matching summaries (and thus find pointers to all group members) </li></ul><ul><li>Summaries are distributed over the surface; a few at each peer </li></ul>
  15. Summary hashing (registration). <ul><li>Each peer creates a Bloom Filter of its attributes </li></ul><ul><ul><li>{Soccer, Brazil} → 01100 | 01001 = 01101 </li></ul></ul><ul><li>Treat the bits as an address </li></ul><ul><ul><li>01101(0) → 122 (2D) </li></ul></ul><ul><li>Store the summary at that address on the surface </li></ul> Example peers: Wolfgang {Soccer, Brazil}, Benoit {Argentina, Soccer}, Kim {Brazil}, Julie {Soccer, Argentina, Brazil}
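The summary construction can be sketched as follows, assuming a 5-bit filter with a single hash per attribute (the deck does not give the real filter parameters): OR the attributes' bit patterns together, pad to a whole number of digits, and read the bits as a base-2^d address.

```python
import hashlib

BITS = 5  # assumed filter width for illustration

def bloom(attrs):
    """OR together one hashed bit position per attribute (toy filter)."""
    bits = 0
    for a in attrs:
        pos = int(hashlib.sha1(a.encode()).hexdigest(), 16) % BITS
        bits |= 1 << (BITS - 1 - pos)  # most-significant bit first
    return bits

def summary_address(attrs, d=2):
    """Read the filter bits d at a time as base-2^d address digits,
    zero-padding as on the slide (01101 -> 011010 -> 122)."""
    s = format(bloom(attrs), f"0{BITS}b") + "0" * (-BITS % d)
    return "".join(str(int(s[i:i + d], 2)) for i in range(0, len(s), d))

print(summary_address({"Soccer", "Brazil"}))  # some 3-digit base-4 address
```

With this toy hash the specific digits differ from the slide's 122, but the mechanism is the same: peers sharing attributes land in predictable extents.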
  16. Summary hashing (CASTing). <ul><li>Can find all summaries matching a CAST by calculating all possible extents where they must be stored </li></ul><ul><li>Convert the CAST to a Bloom Filter, replace 0s with wildcards </li></ul><ul><ul><li>Soccer & Brazil → {Soccer, Brazil} → *11*1 </li></ul></ul><ul><ul><li>(from 01100 | 01001 = 01101) </li></ul></ul><ul><li>Any peer with both attributes must have (at least) the 2nd, 3rd and 5th bits set in its summary address </li></ul><ul><ul><li>The wildcards may match 1s or 0s depending on what other attributes the peer has </li></ul></ul>
  17. Summary hashing (CASTing). <ul><li>Find extents whose 2nd, 3rd and 5th bits are set </li></ul><ul><li>{Soccer, Brazil} → *11*1(*) = { 122, 123, 132, 133, 322, 323, 332, 333 } </li></ul>
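The wildcard expansion can be sketched directly: substitute both bit values for each wildcard, then regroup the bits into base-2^d digits. For *11*1 with d = 2 this reproduces the eight extents listed above.

```python
from itertools import product

def matching_extents(pattern, d=2):
    """Expand each '*' to 0 and 1, pad to a whole number of digits with
    wildcard bits, and emit the resulting base-2^d extent addresses."""
    pattern += "*" * (-len(pattern) % d)          # e.g. *11*1 -> *11*1*
    stars = [i for i, c in enumerate(pattern) if c == "*"]
    out = []
    for combo in product("01", repeat=len(stars)):
        bits = list(pattern)
        for i, b in zip(stars, combo):
            bits[i] = b
        s = "".join(bits)
        out.append("".join(str(int(s[i:i + d], 2))
                           for i in range(0, len(s), d)))
    return sorted(out)

print(matching_extents("*11*1"))
# ['122', '123', '132', '133', '322', '323', '332', '333']
```

The number of candidate extents grows with the number of wildcard bits, which is why the DAR on the next slide clusters and amortises the visits rather than routing to each extent separately.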
  18. Summary hashing (CASTing). <ul><li>Start anywhere and intersect unvisited extents with the target expression </li></ul><ul><li>Cluster the remainder and forward towards each cluster until none remain </li></ul><ul><li>When summaries are found, unicast to the matching peers </li></ul><ul><li>Called Directed Amortised Routing (DAR) </li></ul>
  19. IGM on P2P summary. <ul><li>Peers store their summary on the surface and register at the RP for each of their attributes </li></ul><ul><li>If an RP receives too many registrations for a common attribute, it simply drops them </li></ul><ul><li>To CAST, a source peer picks any term from the target expression and tries a Partition CAST (through an RP) </li></ul><ul><li>If the RP doesn’t know all matching members (because the attribute is common) or the group is too large to unicast to each one, it resorts to a DAR </li></ul>
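The RP-side hybrid decision might be sketched as below; the state fields, thresholds, and helper names (`RPState`, `MAX_GROUP`, `dar`) are illustrative assumptions rather than values from the paper.

```python
from dataclasses import dataclass, field

@dataclass
class RPState:
    """What one rendezvous point knows (assumed structure)."""
    entries: list = field(default_factory=list)  # (peer, attrs) pairs
    overflowed: bool = False  # True once registrations were dropped

MAX_GROUP = 50  # assumed unicast threshold; the real policy may differ

def handle_cast(rp, predicate, unicast, dar):
    """Hybrid decision: complete the Partition CAST if the RP's index is
    complete and the group is small enough, else fall back to a DAR."""
    if rp.overflowed:
        return dar()  # common attribute: the RP's member list is incomplete
    members = [p for p, attrs in rp.entries if predicate(attrs)]
    if len(members) > MAX_GROUP:
        return dar()  # too many identical unicasts from one point
    for p in members:
        unicast(p)
    return members

sent = []
rp = RPState(entries=[("Wolfgang", {"Soccer", "Brazil"}), ("Kim", {"Brazil"})])
print(handle_cast(rp, lambda a: "Brazil" in a, sent.append, lambda: "DAR"))
# ['Wolfgang', 'Kim']
```

The key design point is that the fallback is reactive: the cheap RP path is always tried first, and the DAR is only paid for when the index is incomplete or the group is large.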
  20. Evaluation. <ul><li>2,000-peer OMNeT++/INET simulation of campus-scale physical networks, 10 attributes per peer (Zipf-distributed) </li></ul><ul><li>8,000 random CASTs of various sizes (0 to ~900 members) </li></ul><ul><li>Comparison to a Centralised server model </li></ul><ul><li>Metrics </li></ul><ul><ul><li>Delay penalty </li></ul></ul><ul><ul><li>Peer stress (traffic and storage) </li></ul></ul>
  21. Evaluation (delay penalty). <ul><li>Ratio of Average Delay (RAD) and Ratio of Maximum Delay (RMD) compared to the Centralised model </li></ul><ul><li>80% of CASTs have an average delay less than 6 times that of the Centralised model </li></ul><ul><li>95% have a maximum delay less than 6 times the Centralised model’s </li></ul>
  22. Evaluation (peer stress). <ul><li>An order of magnitude fewer maximum packets handled by any one peer than with the Centralised server </li></ul><ul><li>Higher average stress, since more peers are involved in delivering CASTs </li></ul><ul><li>Even spread of registrations over peers </li></ul>
  23. Conclusion. <ul><li>Implicit groups are a useful way of addressing a group when you know what its members have in common but not who they are </li></ul><ul><li>IGM is also applicable to other applications </li></ul><ul><ul><li>Software updates to those who need them </li></ul></ul><ul><ul><li>Distributed search engines </li></ul></ul><ul><li>P2P implicit group messaging is fast and efficient </li></ul><ul><ul><li>Does not unfairly stress any peers or network links </li></ul></ul><ul><ul><li>Can deliver to arbitrary implicit groups with large size variation </li></ul></ul>
  24. Questions?