  • Transcript

    • 1. Structuring Unstructured Peer-to-Peer Networks. Stefan Schmid, Roger Wattenhofer. Distributed Computing Group. HiPC 2007, Goa, India
    • 2. Networks… Internet graph, Web graph, neuron networks, social graphs, public transportation networks
      • Different properties:
      • Natural vs. Man-made
      • Robustness
      • Diameter
      • Routability
      • ...
    • 3. An Interesting Network: Peer-to-Peer Network
      • Popular Examples:
        • File sharing: BitTorrent, eMule, Kazaa, ...
        • Streaming: Zattoo, Joost, ...
        • Internet telephony: Skype, ...
        • etc.
      • Important: p2p accounts for much Internet traffic today! (source: cachelogic.com)
      • Network of peers, e.g., to share files
      • Desirable properties:
        • Scalability
        • Low degree, low network diameter
        • Fast routing
        • etc.
    • 4. Some of Our Own Applications
      • Wuala online storage system
      • - Student project, start-up, http://wua.la
      • Pulsar streaming
      • - tilllate.com, DJ events, ...; pstreams.com
      • - cheap infrastructure at content provider
      • BitThief BitTorrent downloads
      • Distributed Computations
      • - BOINC client for ECC discrete logarithm challenge
      Successful paradigm & technology, but still important research challenges!
    • 5. Structured vs. Unstructured Topologies
      • Old „p2p“ systems such as Napster were based on a central server
      • - Server stores index: search for contents is simple
      • - Problem: single point of failure
      • - Legacy issues...
      • Unstructured systems, e.g., Gnutella, allow arbitrary topologies and arbitrary data placement
      • - Peers just connect to an arbitrary set of other peers
      • - No single point of failure
      • - But often inefficient: routing based on flooding or random walk
      • Structured systems, e.g., eMule's Kad network, give guarantees
      • - Proactive maintenance of topology
      • - Provable network diameter and peer degree
      • - Routing is possible: a lookup takes, e.g., log(n) hops (maybe also with low stretch)
    • 6. What is „better“?
      • Unstructured systems have less maintenance overhead
      • - Peers can join and leave wherever they want
      • Unstructured systems allow for a richer set of queries
      • - e.g., range queries, Boolean queries
      • Most importantly: despite the interesting properties (and large body of research) of structured networks, today's predominant networks are still unstructured (e.g., Gnutella, BitTorrent, etc.)
      Really? Really? Flooding always possible!
      • But unstructured systems often have scalability problems
        • When Napster was unplugged, Gnutella went down.
      Discussion needs to be continued...!
    • 7. Routing in Arbitrary Topologies?
      • How to find a file in an arbitrary network?
      • Option 1: Flooding (up to a certain hop radius r)
        • Robust, but does not scale.
        • Does not find the „needles“, but does a good job finding popular files.
      • Option 2: Random Walks
        • Fewer messages, but no lookup performance guarantee.
        • Potentially large delay (solution: many parallel „walkers“)
        • Walkers can be lost...
        • Analysis difficult.
        • Again: Good to find popular contents, bad to find needles.
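A minimal sketch of Option 2, assuming the network is given as an adjacency dictionary and that a hypothetical `has_file` predicate tells whether a peer holds the requested file. The walkers are simulated one after another here; in a real system they would run in parallel. Names and parameters are illustrative and not taken from the talk.

```python
import random

def random_walk_search(graph, start, has_file, walkers=4, max_steps=20, seed=0):
    """Search with `walkers` random walks of at most `max_steps` hops each.

    graph: dict mapping peer -> list of neighbor peers (assumed representation).
    Returns (peer holding the file or None, number of messages sent).
    """
    rng = random.Random(seed)
    messages = 0
    if has_file(start):
        return start, messages
    for _walker in range(walkers):
        current = start
        for _step in range(max_steps):
            neighbors = graph[current]
            if not neighbors:
                break  # this walker has nowhere to go and is lost
            current = rng.choice(neighbors)
            messages += 1
            if has_file(current):
                return current, messages
    # No guarantee: the file may exist but no walker happened to pass by it.
    return None, messages
```

The message count is bounded by `walkers * max_steps`, which is the attraction compared to flooding; the price is that a rare file can be missed even when it is only a few hops away.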
    • 8. Flooding
      • This talk considers search operations by flooding.
      • Efficiency of flooding?
      Very efficient on trees! Many redundant transmissions... Flooding efficiency depends on network topology!
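The dependence on topology can be illustrated with a small simulation. The sketch below assumes a simple flooding model that the slides do not spell out: each peer forwards a query once, to all neighbors except the one it first received the query from, until the hop radius is exhausted. It counts both the peers reached and the transmissions used.

```python
from collections import deque

def flood(graph, source, radius):
    """Flood from `source` up to `radius` hops.

    graph: dict mapping peer -> list of neighbor peers (assumed representation).
    Returns (set of reached peers, number of transmissions).
    """
    hops = {source: 0}
    parent = {source: None}
    transmissions = 0
    queue = deque([source])
    while queue:
        peer = queue.popleft()
        if hops[peer] == radius:
            continue  # hop budget used up, do not forward
        for nb in graph[peer]:
            if nb == parent[peer]:
                continue  # never send the query straight back
            transmissions += 1
            if nb not in hops:  # first copy: nb will forward it later
                hops[nb] = hops[peer] + 1
                parent[nb] = peer
                queue.append(nb)
            # otherwise the transmission was redundant
    return set(hops), transmissions

# A binary tree with 7 peers and a clique with 4 peers.
tree = {0: [1, 2], 1: [0, 3, 4], 2: [0, 5, 6], 3: [1], 4: [1], 5: [2], 6: [2]}
clique = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}
print(flood(tree, 0, 2))    # 6 transmissions reach the 6 other peers
print(flood(clique, 0, 2))  # 9 transmissions reach only 3 other peers
```

On the tree every transmission reaches a new peer, while in the clique two out of three transmissions hit a peer that already has the query.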
    • 9. Clustella
      • We propose Clustella
      • - a new P2P client for unstructured peer-to-peer systems
      • - based on flooding, but with „smart neighbor selection“
      • - allows for more efficient flooding!
    • 10. Vision
      • Clustella Vision:
      Figure: an unstructured p2p network with normal clients and Clustella clients. By connecting to peers in far-away parts of the network, small cycles in the topology are avoided, and flooding is more efficient. Not only Clustella clients benefit, but also all other clients in the network.
    • 11. Flood Coverage
      • Main open question: How to connect to remote peers?
      • Given a set of potential neighbors, it would be useful to know the hop distance to each of those!
      • Then, we could connect to the one furthest away...
      • Goal: Maximize flood coverage, i.e., maximize the minimum number of nodes reached by an r-hop flooding, locally and despite dynamics
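The flood coverage defined above can be computed directly on a known graph. The following sketch assumes an adjacency-dictionary representation and counts the source itself among the reached nodes; both choices are assumptions made for illustration. Note that this is a global computation, whereas the talk asks for a local mechanism.

```python
from collections import deque

def ball_size(graph, source, radius):
    """Number of peers within `radius` hops of `source`, including `source`."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        peer = queue.popleft()
        if hops[peer] == radius:
            continue
        for nb in graph[peer]:
            if nb not in hops:
                hops[nb] = hops[peer] + 1
                queue.append(nb)
    return len(hops)

def flood_coverage(graph, radius):
    """Minimum, over all peers, of the number of peers an r-hop flood reaches."""
    return min(ball_size(graph, peer, radius) for peer in graph)

# Six peers on a ring versus six peers forming two triangles joined by one edge.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
print(flood_coverage(ring, 2), flood_coverage(triangles, 2))  # 5 4
```

The ring has fewer edges than the two-triangle graph, yet its coverage at radius 2 is higher, because it contains no small cycles.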
    • 12. Hop-Estimation With Clustering
      • Main idea: Use clustering!
      • - Divide network into different clusters.
      • - Peers in different clusters belong to different network regions and can safely be connected without creating small cycles.
      • How to achieve such a clustering? Introduction of beacons!
      • - Two parameters: radius R_d and radius R_b (R_d < R_b)
      • - If a peer has no beacon in its R_d neighborhood, it becomes a beacon itself.
      • - A peer knows all beacons in its R_b neighborhood.
      • - R_b roughly equals the flooding radius R
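The two beacon rules can be sketched as a centralized simulation. This is only an illustration under strong assumptions: peers are processed one at a time in sorted order, the graph is static, and each peer can inspect its whole neighborhood, whereas the real protocol is distributed and learns about beacons from piggy-backed packets. The function names are hypothetical.

```python
from collections import deque

def within(graph, source, radius):
    """Set of peers within `radius` hops of `source`, including `source`."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        peer = queue.popleft()
        if hops[peer] == radius:
            continue
        for nb in graph[peer]:
            if nb not in hops:
                hops[nb] = hops[peer] + 1
                queue.append(nb)
    return set(hops)

def elect_beacons(graph, r_d):
    """A peer with no beacon in its R_d neighborhood becomes a beacon."""
    beacons = set()
    for peer in sorted(graph):  # sequential order is an assumption of this sketch
        if not (within(graph, peer, r_d) & beacons):
            beacons.add(peer)
    return beacons

def known_beacons(graph, beacons, r_b):
    """Each peer knows all beacons in its R_b neighborhood."""
    return {peer: within(graph, peer, r_b) & beacons for peer in graph}

# Seven peers on a path 0 - 1 - ... - 6, with R_d = 2 and R_b = 3.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 6] for i in range(7)}
beacons = elect_beacons(path, r_d=2)
print(beacons)                             # {0, 3, 6}
print(known_beacons(path, beacons, r_b=3))
```

In this example peers 0 and 6 know the beacon sets {0, 3} and {3, 6}: they still share beacon 3, so neither would count the other as lying in a different region.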
    • 13. Clustella Mechanism (1)
      • One beacon in radius R_d
      • Beacon known in radius R_b
      • Flooding radius R
      • Beacons append their ID to all packets (piggy-back)
      • If a packet expires earlier, other peers (here: π'') forward the beacon information
      • The entire R_b neighborhood will know beacon π'
      • Peers try to connect to peers which have no beacons in common!
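The selection rule in the last bullet can be sketched as a comparison of beacon sets. The sketch assumes each candidate advertises the set of beacon IDs it knows; ranking the remaining candidates by the size of the overlap is an added assumption for illustration, not something stated in the talk.

```python
def rank_candidates(own_beacons, candidate_beacons):
    """Order candidates by how many beacons they share with us (fewest first).

    candidate_beacons: dict mapping peer ID -> set of beacon IDs it knows.
    Ties are broken by peer ID to keep the result deterministic.
    """
    def overlap(peer):
        return len(own_beacons & candidate_beacons[peer])
    return sorted(candidate_beacons, key=lambda peer: (overlap(peer), peer))

def remote_candidates(own_beacons, candidate_beacons):
    """Candidates with no beacon in common, i.e., presumably in another region."""
    return [peer for peer in rank_candidates(own_beacons, candidate_beacons)
            if own_beacons.isdisjoint(candidate_beacons[peer])]

own = {"b1", "b2"}
candidates = {"x": {"b1"}, "y": {"b7"}, "z": {"b1", "b2"}}
print(rank_candidates(own, candidates))    # ['y', 'x', 'z']
print(remote_candidates(own, candidates))  # ['y']
```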
    • 14. Clustella Mechanism (2)
      • Edges are undirected
      • All peers have degree d or d+1
      • A connection is accepted if the peer's own degree is d or smaller; otherwise, a neighbor may have an open slot, or an existing connection is torn down
      • Invariant quickly reestablished!
      • Neighbors of an existing neighbor are also good candidates, as they are located in the same network region.
    • 15. Two Challenges
      • Evaluation of current neighbors
        • Existing neighbors are always in the same network region
        • Evaluating their quality and comparing them to alternative neighbors is difficult
        • Include routes in packets! Exclude beacons that are known only via a neighbor
      • Dynamics
        • Clustella must be robust to churn, i.e., frequent joins and leaves
        • E.g., node crash: Clustella peer p stores some neighbors for each of its neighbors q; these neighbors are good candidates as they are in the same network region as q
    • 16. Evaluation
      • Simulation of three different neighbor selection strategies
        • Gnutella-like (unfair?): Peers join at some well-known entry point and ask for their neighbors' neighbors until they reach full degree
        • Random walk (more interesting?): Peers find new peers by a random walk of length L
        • Clustella: Peers find new neighbors by exploring the network using a walk of length L and by taking beacon information into account
      • Results
        • Gnutella-like topologies result in very inefficient flooding operations
        • Clustella yields higher flood coverage than random walk
    • 17. Future Work
      • Hierarchical clustering (beacons with different radii)
      • - Already a small hierarchy can yield better flood coverage
      • - However, maintenance of hierarchy can be expensive under churn!
      • - Moreover, fairness must be guaranteed: High-level beacon peers should not have more work to do!
      • Smaller messages
        • Reducing the message sizes for large radii is important!
        • Idea: Use of Bloom filters instead of sending beacon IDs directly
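The Bloom filter idea could look roughly as follows. This is a sketch under assumptions: the filter size, the number of hash functions, and the SHA-256-based hashing are arbitrary illustrative choices, and `may_share_beacon` is a hypothetical helper. A peer would send the fixed-size bit vector instead of a growing list of beacon IDs.

```python
import hashlib

class BloomFilter:
    """Fixed-size set summary: no false negatives, some false positives."""

    def __init__(self, num_bits=256, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit vector

    def _positions(self, item):
        # Derive `num_hashes` bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))

    def may_share_beacon(self, other):
        """False means certainly no common beacon; True means possibly one."""
        return (self.bits & other.bits) != 0

mine, theirs = BloomFilter(), BloomFilter()
mine.add("beacon-17")
theirs.add("beacon-17")
print("beacon-17" in mine, mine.may_share_beacon(theirs))  # True True
```

The message size stays constant at `num_bits` bits regardless of the radius. The cost is that a false positive can make a remote peer look as if it shared a beacon, so a good candidate may occasionally be rejected; a peer in the same region is never mistaken for a remote one.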
    • 18. Conclusion
      • We believe that structuring topologies can be beneficial to peer-to-peer systems!
      • Clustering with beacons is simple and probably also useful in other applications, e.g., in a music graph
      • Implementation must ensure fairness and use small message sizes.
      • A good choice of parameters is important for both efficiency and stability.
      • Incorporation into Gnutella??
    • 19. Thank you. Thank you for your interest.