

  • 1. Implicit group messaging in peer-to-peer networks Daniel Cutting, 28th April 2006 Advanced Networks Research Group
  • 2. Outline.
    • Motivation and problem
    • Implicit groups
    • Implicit group messaging (IGM)
    • P2P model
    • Evaluation
  • 3. Motivation.
    • It’s now very easy to publish content on the Internet: blogs, podcasts, forums, iPhoto “photocasting”, …
    • More and more publishers of niche content
    • Social websites like Flickr, YouTube, MySpace, etc. are gateways for connecting publishers and consumers
    • Similar capability would also be desirable in P2P
      • Collaboration and sharing without central authority
      • No reliance on dedicated infrastructure
      • No upfront costs, requirements
  • 4. Problem.
    • As more new niches are created, consumers need to search/filter more to find and collate varied content
    • How can we connect many publishers and consumers?
    • The publisher already knows the intended audience
      • Can often describe the audience in terms of interests
      • Does not know the names of individual audience members
      • So, address them as an implicit group
  • 5. Implicit groups.
    • Explicit groups
      • Members named
      • Pre-defined by publisher or consumers need to join
      • Wolfgang, Julie
    • Implicit groups
      • Members described
      • Publisher defines “on the fly”, consumers don’t need to join
      • Soccer & Brazil
  • 6. Implicit group messaging.
    • CAST messages from any source to any implicit group at any time in a P2P network
      • Each peer described by attributes (capabilities, interests, services, …), e.g. “Soccer”, “Brazil”
      • Implicit groups are specified as logical expressions of attributes, e.g. “(Soccer OR Football) AND Brazil”
      • System delivers messages from sources to all peers matching target expressions
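As a rough illustration of the matching step, a minimal evaluator for target expressions over a peer's attribute set might look like the sketch below. The parser structure, function names, and operator precedence (AND binds tighter than OR) are assumptions for illustration, not details from the slides.

```python
def matches(expr: str, attrs: set) -> bool:
    """Evaluate a target expression like '(Soccer OR Football) AND Brazil'
    against a peer's attribute set. AND binds tighter than OR (assumed)."""
    tokens = expr.replace("(", " ( ").replace(")", " ) ").split()

    def parse_or(pos):
        val, pos = parse_and(pos)
        while pos < len(tokens) and tokens[pos].upper() == "OR":
            rhs, pos = parse_and(pos + 1)
            val = val or rhs
        return val, pos

    def parse_and(pos):
        val, pos = parse_atom(pos)
        while pos < len(tokens) and tokens[pos].upper() == "AND":
            rhs, pos = parse_atom(pos + 1)
            val = val and rhs
        return val, pos

    def parse_atom(pos):
        if tokens[pos] == "(":
            val, pos = parse_or(pos + 1)
            return val, pos + 1  # skip the closing ')'
        # a bare token is an attribute test against the peer's set
        return tokens[pos] in attrs, pos + 1

    return parse_or(0)[0]
```

With the slide's example, a peer holding {Soccer, Brazil} matches "(Soccer OR Football) AND Brazil", while a peer holding only {Football} does not.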
  • 7. P2P model.
    • A fully distributed, structured overlay network
      • Peers maintain a logical Cartesian surface (like CAN)
      • Each peer owns part of the surface and knows neighbours
      • Peers store data hashed to their part of the surface
      • Peers geometrically ROUTE to locations by passing from neighbour to neighbour
      • Quadtree-based surface addressing
    • Smoothly combine two major techniques for efficient CAST delivery to groups of any size
  • 8. P2P model.
    • Attribute partitioning: an “attribute → peer” index for small groups
    • Summary hashing: for reaching BIG groups
    • Hybrid CAST: a reactive multicast algorithm combining the above
  • 9. Quadtree-based addressing.
    • Surfaces can be any dimensionality d
    • An address is a string of digits in base 2^d
    • Map from an address to the surface using a quadtree decomposition
    • Quadrants called extents
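The address-to-surface mapping can be sketched for the 2D case (d = 2), where each base-4 digit selects one quadrant of the current extent. The digit-to-quadrant bit encoding here is an assumption for illustration:

```python
def extent(address: str):
    """Map a quadtree address like '122' to (x, y, side) on the unit
    square, assuming d = 2. Each digit halves the extent: the low bit
    picks the x-half, the high bit the y-half (assumed encoding)."""
    x, y, side = 0.0, 0.0, 1.0
    for digit in address:
        side /= 2
        q = int(digit)           # 0..3 in 2D
        x += side * (q & 1)      # low bit: left/right half
        y += side * (q >> 1)     # high bit: bottom/top half
    return x, y, side
```

Longer addresses name smaller extents, so a peer owning a quadrant also owns every address prefixed by that quadrant's digits.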
  • 10. Attribute partitioning.
    • A distributed index from each attribute to all peers
    • Indices are stored at rendezvous points (RPs) on the surface by hashing the attribute to an address
  • 11. Attribute partitioning (registration).
    • Every peer registers at each of its attributes’ RPs
    • Every registration includes IP address and all attributes
  • 12. Attribute partitioning (CASTing).
    • To CAST, select one term from target
    • Route CAST to its RP
    • RP finds all matches and unicasts to each
  • 13. Attribute partitioning.
    • Simple, works well for small groups and rare attributes
      • Fast: just one overlay route followed by unicasts
      • Fair: each peer responsible for similar number of attributes
    • BUT a common attribute → lots of registrations at one RP
      • Heavy registration load on some unlucky peers
    • ALSO big groups → many identical unicasts required
      • Heavy link stress around RPs
    • SO, in these cases, share the load with your peers!
  • 14. Summary hashing.
    • Spreads registration and delivery load over many peers
    • In addition to attribute registrations, each peer stores a back-pointer and a summary of their attributes at one other location on the surface
    • Location of summary encodes its attributes
    • Given a target expression, any peer can calculate all possible locations of matching summaries (and thus find pointers to all group members)
    • Summaries distributed over surface; a few at each peer
  • 15. Summary hashing (registration).
    • Each peer creates a Bloom Filter
      • {Soccer, Brazil} → 01101 (= 01100 | 01001)
    • Treat bits as an address
      • 01101(0) → 122 (in 2D)
    • Store summary at that address on the surface
    • Example peers: Wolfgang {Soccer, Brazil}; Benoit {Argentina, Soccer}; Kim {Brazil}; Julie {Soccer, Argentina, Brazil}
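The registration step can be sketched directly from the slide's example: OR the per-attribute bit patterns into a summary, then read the bits off d at a time as quadtree digits. The bit patterns are the ones shown on the slide; the helper names and the right-padding convention are assumptions.

```python
def or_bits(patterns):
    """OR together equal-length bit strings, e.g. 01100 | 01001 -> 01101."""
    width = len(patterns[0])
    val = 0
    for p in patterns:
        val |= int(p, 2)
    return format(val, f"0{width}b")

def summary_address(bitstr: str, d: int = 2) -> str:
    """Group the summary bits d at a time into base-2^d quadtree digits,
    zero-padding on the right as the slide does: 01101(0) -> 122."""
    bitstr += "0" * ((-len(bitstr)) % d)
    return "".join(str(int(bitstr[i:i + d], 2))
                   for i in range(0, len(bitstr), d))
```

For Wolfgang's {Soccer, Brazil}, or_bits(["01100", "01001"]) gives 01101, and summary_address("01101") gives the extent 122 from the slide.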
  • 16. Summary hashing (CASTing).
    • Can find all summaries matching a CAST by calculating all possible extents where they must be stored
    • Convert CAST to Bloom Filter, replace 0s with wildcards
      • Soccer & Brazil → {Soccer, Brazil} → *11*1
      • (01100 | 01001 = 01101; the 0s become wildcards)
    • Any peer with both attributes must have (at least) the 2nd, 3rd and 5th bits set in their summary address
      • The wildcards may match 1s or 0s depending on what other attributes the peer has
  • 17. Summary hashing (CASTing).
    • Find extents where the 2nd, 3rd and 5th bits are set
    • {Soccer, Brazil} → *11*1(*) = { 122, 123, 132, 133, 322, 323, 332, 333 }
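The extent enumeration above can be sketched by expanding each wildcard bit both ways and regrouping the bits into base-2^d digits. The function name and padding convention are assumptions; the example pattern and resulting extent set are from the slides.

```python
from itertools import product

def matching_extents(pattern: str, d: int = 2):
    """Expand a wildcard bit pattern like '*11*1' into every quadtree
    extent whose summary addresses could match, assuming d-bit digits
    and wildcard right-padding (slide 17: *11*1(*) -> 122, 123, ...)."""
    pattern += "*" * ((-len(pattern)) % d)  # pad to a whole digit
    digit_choices = []
    for i in range(0, len(pattern), d):
        chunk = pattern[i:i + d]
        opts = [""]
        for c in chunk:
            bits = "01" if c == "*" else c  # '*' matches either bit
            opts = [o + b for o in opts for b in bits]
        digit_choices.append([str(int(o, 2)) for o in opts])
    return sorted("".join(p) for p in product(*digit_choices))
```

Each wildcard doubles the candidate set, so rarer attribute combinations (more 1-bits) pin down fewer extents to visit.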
  • 18. Summary hashing (CASTing).
    • Start anywhere and intersect unvisited extents with target expression
    • Cluster remainder and forward towards each one until none remain
    • When summaries are found, unicast to peers
    • Called Directed Amortised Routing (DAR)
  • 19. IGM on P2P summary.
    • Peers store their summary on the surface and register at the RP for each of their attributes
    • If an RP receives too many registrations for a common attribute, it simply drops them
    • To CAST, a source peer picks any term from target expression and tries a Partition CAST (through an RP)
    • If RP doesn’t know all matching members (because it’s a common attribute) or the group is too large to unicast to each one, it resorts to a DAR
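The fallback decision at the RP can be sketched as a simple rule. The threshold constant and function names are illustrative assumptions; the slides specify only the two conditions, not concrete values.

```python
MAX_UNICASTS = 50  # illustrative cap on direct delivery, not from the slides

def choose_delivery(registrations_complete: bool, group_size: int) -> str:
    """Decide between a Partition CAST (unicast from the RP) and a DAR.
    The RP resorts to DAR when it has dropped registrations for a common
    attribute, or when the matching group is too large to unicast to."""
    if registrations_complete and group_size <= MAX_UNICASTS:
        return "partition-cast"
    return "dar"
```

So small groups on rare attributes take the fast one-route path, while common attributes and big groups spill over into the summary-hashing machinery.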
  • 20. Evaluation.
    • 2,000 peer OMNeT++/INET simulation of campus-scale physical networks, 10 attributes per peer (Zipf)
    • 8,000 random CASTs of various sizes (0 to ~900 members)
    • Comparison to a Centralised server model
    • Metrics
      • Delay penalty
      • Peer stress (traffic and storage)
  • 21. Evaluation (delay penalty).
    • Ratio of Average Delay (RAD) and Ratio of Maximum Delay (RMD) compared to Centralised model
    • 80% of CASTs have average delay less than 6 times Centralised model
    • 95% have maximum delay less than 6 times Centralised
  • 22. Evaluation (peer stress).
    • Order of magnitude fewer maximum packets handled by any one peer than with the Centralised server
    • Higher average stress, since more peers are involved in delivering CASTs
    • Even spread of registrations over peers
  • 23. Conclusion.
    • Implicit groups are a useful way of addressing a group when you know what they have in common but not who they are
    • IGM is also applicable to other applications
      • Software updates to those who need them
      • Distributed search engines
    • P2P implicit group messaging is fast and efficient
      • Does not unfairly stress any peers or network links
      • Can deliver to arbitrary implicit groups with large size variation
  • 24. Questions?