IEEE ICPADS 2008 - Kalman Graffi - SkyEye.KOM: An Information Management Over-Overlay for Getting the Oracle View on Structured P2P Systems

In order to ease the development and maintenance of more complex P2P applications, which combine multiple P2P functionalities (e.g. streaming and dependable storage), we suggest extending structured P2P systems with a dedicated information management layer. This layer is meant to generate statistics on the whole P2P system and to enable capacity-based peer search, which helps the individual functionality layers in the P2P application to find suitable peers for layer-specific role assignment. In this paper we present SkyEye.KOM, an information management layer applicable on top of DHTs, which fulfills this desired functionality. SkyEye.KOM builds an over-overlay that is scalable by leveraging the underlying DHT, easy to deploy as a simple add-on to existing DHTs, and efficient, as it needs O(log N) hops per query and to make peer-specific information accessible network-wide. The evaluation shows that SkyEye.KOM has good query performance and that the costs for maintaining the over-overlay are very low.

Slide Notes
  • Storyline: What is P2P? Which QoS requirements have been posed so far? Possibly unclear details:
  • Storyline: Now the transition to our solution: an additional layer on top of structured P2P overlays (which can be addressed uniformly via the Common API). The query form offered by our architecture is presented. Possibly unclear details: Common API, the paper by "F. Dabek and B. Zhao and P. Druschel and I. Stoica" on unifying the services of DHTs; important here: route(ID, Msg), by which a message (Msg) can be routed to the peer responsible for an ID (ID).
  • Storyline: Where the EM architecture (= information collection service) fits in: above the Common API (which sits on top of the structured overlays). Update flows: upwards in a virtual tree; peers know their parent nodes (how: next slide). Possibly unclear details:
  • Storyline: The parent node is the peer responsible for the middle ID of a domain (= an interval of the ID space); recursive domains; Supporting Peers for load balancing (since otherwise weak nodes could sit at important positions). Possibly unclear details: the peer responsible for a specific ID (e.g. the middle one) is responsible for the ID domain: a deterministic assignment function, so every peer can determine to which peer it has to send an update (namely the one responsible for a certain ObjectID). Thanks to this deterministic function, the maintenance costs are saved. Storyline for the animation: first we see an ID space onto which the overlay IDs are mapped. The peers are distributed across this ID space. We now look at the protocol steps from the perspective of a single peer (red arrow): it first determines to whom it has to send its updates. To do so, it repeatedly halves the ID space until it has identified its Coordinator (the next halving would assign the peer itself as Coordinator). The update is then sent to this Coordinator. The Coordinator in turn has a Coordinator one level higher, to which the updates are propagated further. Over time the tree builds up from the bottom and grows together. Coordinators can select Supporting Peers for assistance; these are chosen based on their capacities.
  • Storyline: Is meant to show that queries are really useful for a complex application. Query processing: bottom-up, until the request can be fully answered. Possibly unclear details:
  • Storyline: Some quality aspects of the solution and a figure visualizing the tree. Possibly unclear details:

Transcript

  • 1. SkyEye.KOM: An Information Management Over-Overlay for Getting the Oracle View on Structured P2P Systems. Kalman Graffi, Aleksandra Kovacevic, Song Xiao and Ralf Steinmetz. (Figure: Underlay: The Internet; Structured Overlay: DHT; Information Management Over-Overlay; "Pick good peers?"; "Does my P2P system work?")
  • 2. Outline
    • Motivation
      • Information Management in P2P Systems
      • Example
    • SkyEye.KOM
      • Requirements
      • Architecture
    • Evaluation
      • Performance
      • Costs
    • Conclusion
  • 3. The Peer-to-Peer Paradigm
    • Peer-to-Peer Systems:
      • Users of a system provide the infrastructure
      • Service is provided from users/peers to users/peers
      • Peer-to-Peer overlays:
        • virtual networks, providing new functionality
        • E.g. Distributed Hash Tables, Keyword-based Search
    • Evolution of applications
      • File sharing:
        • No QoS requirements
      • Voice over IP
        • Real-time requirements
      • Video-on-demand
        • Real-time and bandwidth requirements
      • Online community platforms
        • Potential for high user interaction
  • 4. Open Issues in P2P Research
    • Current P2P applications do not sufficiently consider
      • the heterogeneity of peers
      • the state of the whole P2P-system
    • P2P systems and applications would benefit from
      • monitoring the system
      • generating system statistics
      • determining the state of the P2P system
      • accounting for the capacities of peers
    • Possible results of the aforementioned actions
      • Detection of limitations or failures in the running P2P system
      • Adaptation of the peers to the current state of the P2P system
  • 5. Example: Let’s Build a new P2P Application!
    • Get ideas, make design, implement it…
    • A cool, hot product with lots of features!
      • A lot of modules involved
      • All of them have some requirements, need specific peers
      • See K. Graffi, "A Distributed Platform for Multimedia Communities", in Proc. of IEEE Int. Symposium on Multimedia '08, Dec. 15-17, 2008
    • Each functional layer requires specific peers
      • Distributed Computing: Give me 4 peers with Java > 6.0 and processor power > 10 MFLOPS
      • Streaming: Give me 2 peers with sufficient processing power for media file recoding
      • Replication: Give me 1 peer with a lot of bandwidth and 10 peers with a specific lifetime and storage space
      • Overlay: Give me 3 peers with a node degree > 50
      • Network wrapper: Give me 10 peers from my ISP
      • Security: Find 5 trustworthy peers
    • Challenge No.1: Capacity-based Peer Search
    •  For identifying peer capacities in the network, we can use an information management layer such as SkyEye.KOM
  • 6. Hooray, it Runs! Publish and Advertise! (Figure: "P2P Company", "Give us money", "The new P2P product!")
  • 7. Deployment and Running a P2P System
    • Challenge No.1: Capacity-based Peer Search
    • More challenges?
    • Is everything running fine?
    • How to debug and gain insight?
    • Challenge No. 2:
    • Live statistics on P2P systems
    (Figure: Underlay: The Internet; Structured Overlay: DHT; P2P App.; "Does my P2P system work?")
  • 8. Optimizing the RUNNING P2P System
    • Assume we have statistics on our P2P application
      • We see that something is wrong / can be improved
      • We know which parameters to change
      • We change the parameter somehow (by hand?)
      • How to change parameters in a running system?
    • Challenge No.3: Parameter optimization based on statistics
    (Figure: our P2P application)
  • 9. Outline
    • Motivation
      • Information Management in P2P Systems
      • Example
    • SkyEye.KOM
      • Requirements
      • Architecture
    • Evaluation
      • Performance
      • Costs
    • Conclusion
  • 10. Requirements on an Information Mgmt. System
    • For all structured P2P overlays
      • Covered by DHT-function: route(msg, key), lookup(key)
      • Usable by all functional layers/modules in the P2P system
    • Generating system statistics
      • System-wide average and absolute values, standard deviation, confidence intervals
      • Num. of peers, online times, hops per lookup, hit rate, relative delay penalty, node degree
      • Load (CPU, memory, storage space, bandwidth), additionally weighted with individual peer capabilities
    • Capacity-based peer search
      • Search for a specified number of peers with specific capacities
      • E.g. give me 3 peers with:
        • Storage space > 20Mb
        • Bandwidth > 100kb/s
    • Remote consistent parameter setting
      • Reconfigure all peers during runtime
  • 11. SkyEye: Information Management Over-overlay
    • For all structured P2P overlays
      • Covered by DHT-function: route(msg, key), lookup(key)
      • Usable by all functional layers in the P2P system
    • Features:
      • Overlay independence
      • Robustness (double-churn)
      • Load-balancing
      • Supporting peer heterogeneity
      • Scalability (# of info and peers)
      • Lightweight
      • Low overhead
    • Function:
      • Generating system statistics
      • Capacity-based peer search
      • Enable remote parameter reconfiguration
    (Figure: layer stack: the Internet; a DHT overlay offering route(key, msg, nextHop) and resp(key); an API for the DHT's ID space and functions; the unified ID space and abstracted functions; legend: Coordinator, Peer, Support Peer)
  • 12. SkyEye.KOM Architecture
    • Concept of Over-overlay
      • Built on underlying structured overlay
      • Communicates via common API (KBR)
        • route(msg, key), lookup(key)
      • Unified ID space [0,1] decouples from specific DHT implementation: e.g. divide ID by maxID
    • Information Domains:
      • ID space separated in intervals (domains)
      • Peer responsible for a specific ID (e.g. middle) is responsible for ID domain
      • Peers in the domain send updates to this Coordinator
      • Coordinators may choose Support Peers to share the load/responsibility
        • Coordinators define maximum load to carry
      • Updates propagated upwards the tree
    (Figure: over-overlay architecture: API for the DHT's ID space and functions; unified ID space and abstracted functions; legend: Coordinator, Peer, Support Peer)
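The deterministic Coordinator assignment behind these domains can be sketched in a few lines. The Java sketch below is illustrative and not the paper's code: DhtNode, ownId(), isResponsibleFor() and the two-argument route() are hypothetical stand-ins for the common-API primitives route(msg, key)/lookup(key) and for the peer's ID already normalized to the unified [0,1] space.

```java
// Sketch (hypothetical interfaces): a peer halves the unified ID space [0,1]
// towards its own ID; the midpoint of the current half is the Coordinator key
// of that level. It stops as soon as the next midpoint would fall into its own
// responsibility range, i.e. the next halving would make the peer its own Coordinator.
public final class CoordinatorLookup {

    /** Minimal, assumed view of the underlying DHT (common API). */
    public interface DhtNode {
        double ownId();                        // peer ID normalized to [0,1] (id / maxId)
        boolean isResponsibleFor(double key);  // is this peer responsible for the key?
        void route(double key, byte[] msg);    // KBR: deliver msg to the peer responsible for key
    }

    /** Deterministically compute the key of the Coordinator this peer reports to. */
    public static double coordinatorKey(DhtNode node) {
        double lo = 0.0, hi = 1.0;
        double mid = 0.5;                      // root Coordinator key
        if (node.isResponsibleFor(mid)) {
            return mid;                        // this peer is the root Coordinator itself
        }
        while (true) {
            // descend into the half of the domain that contains our own ID
            if (node.ownId() < mid) { hi = mid; } else { lo = mid; }
            double next = (lo + hi) / 2.0;
            if (node.isResponsibleFor(next)) {
                return mid;                    // the next halving would point at ourselves: stop
            }
            mid = next;
        }
    }

    /** Send a serialized SkyEye update to the responsible Coordinator via the DHT. */
    public static void sendUpdate(DhtNode node, byte[] update) {
        node.route(coordinatorKey(node), update);
    }
}
```

Because the Coordinator key is computed purely from the peer's own ID, no parent pointers have to be stored or repaired; this is what saves the maintenance costs mentioned in the speaker notes.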
  • 13. Example Figures (Figure: two example ID spaces over [0,1]: peers at positions such as 0.09, 0.2, 0.3, 0.4, 0.51, 0.6, 0.75 and 0.9 report to the Coordinators responsible for the keys 0.125, 0.25, 0.375, 0.5 and 0.75; a second example shows peers at 0.18, 0.24, 0.4, 0.55 and 0.8 under CoordinatorID 0.5.)
  • 14. SkyEye.KOM Architecture Details
    • Update strategy
      • Each node
        • sends information up
        • knows where to send updates to
      • Update consists of:
        • Non-aggregatable peer capacity information
        • Aggregatable systems statistics part
      • Updates are processed before forwarding
      • Update information has a limited lifetime
    • Supporting Peers for Load Balancing
      • Coordinators may choose Supporting Peers
      • Good peers are chosen by, e.g., a 50/50 ratio
        • Pick e.g. 20 best peers in the domain
        • Best 10 peers in domain advertised one level up
        • Second best 10 peers can be used as support
      • Workload can be delegated to supporting peers
      • Tree depth / peer load adjustable
    (Figure: Coordinator, Peer, Support Peer; each Coordinator advertises the best 1-10 peers of its domain one level up and keeps the best 11-20 for its Support Peer.)
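How the aggregatable part of an update could be merged before it is forwarded one level up, and how the 50/50 split between upstream advertisement and Support Peers could be applied, is sketched below. The concrete statistic fields (peer count, summed online time, min/max free storage) and the sort key are assumptions made for this example; the slides only state that an update carries an aggregatable statistics part and a non-aggregatable, time-limited capacity part, and that the best 1-10 peers of a domain go upwards while the next 10 may serve as Support Peers.

```java
// Sketch with assumed field names: aggregatable statistics are kept as
// count/sum/min/max so that system-wide averages stay exact when merged.
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class SystemStats {                    // aggregatable part of an update
    long peerCount;                          // peers covered by this aggregate
    double sumOnlineTime;                    // seconds, summed so the average can be derived
    double minFreeStorage, maxFreeStorage;   // MB

    /** Combine two aggregates; the result covers the union of both peer sets. */
    SystemStats mergeWith(SystemStats other) {
        SystemStats s = new SystemStats();
        s.peerCount      = this.peerCount + other.peerCount;
        s.sumOnlineTime  = this.sumOnlineTime + other.sumOnlineTime;
        s.minFreeStorage = Math.min(this.minFreeStorage, other.minFreeStorage);
        s.maxFreeStorage = Math.max(this.maxFreeStorage, other.maxFreeStorage);
        return s;
    }

    double avgOnlineTime() { return peerCount == 0 ? 0 : sumOnlineTime / peerCount; }
}

final class PeerCapacity {                   // non-aggregatable, kept per peer
    String peerId;
    double bandwidthKbps;
    double freeStorageMb;
    long   expiresAtMillis;                  // reported information has a limited lifetime
}

final class SupportPeerSelection {
    /** 50/50 rule from the slide: e.g. pick the 20 best, advertise 1-10 upwards, use 11-20 as support. */
    static List<PeerCapacity> bestPeers(List<PeerCapacity> domainPeers, int from, int to) {
        List<PeerCapacity> sorted = new ArrayList<>(domainPeers);
        sorted.sort(Comparator.comparingDouble((PeerCapacity p) -> p.bandwidthKbps).reversed());
        return sorted.subList(Math.min(from, sorted.size()), Math.min(to, sorted.size()));
    }
}
```

A Coordinator would then call, e.g., bestPeers(domain, 0, 10) for the upstream advertisement and bestPeers(domain, 10, 20) for its Support Peer candidates.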
  • 15. Setting the Parameters
    • Capturing the state of every peer
      • Periodically measuring and aggregating statistics
      • Forwarding of aggregated statistics
    • Evaluation of the collected information
    • Calculation of system statistics
      • Enables the integration of data mining tools for implementing self-organization
      • Transmission of the new information
    • Adaptation to the propagated information
      • Enabling the adaptive behavior of a peer
    (Figure: control cycle: peers measure and aggregate metrics; the collected data is evaluated and counteractions are identified.)
  • 16. Queries in the Information Mgmt. System
    • Query Type: Capacity-based Peer Search
      • Give me M peers
      • Fulfilling specific requirements on
        • Bandwidth, storage space, computational capabilities,
        • Online time, peer load, reputation
        • … (wide set of requirements definable)
    • Query processing
      • First sent to coordinator
      • The query traverses the tree bottom-up until M matching peers are found
      • The result is then sent to the requesting peer
      • Tradeoff:
        • Upper peers in tree know more
        • Load should be kept on lower levels of the tree
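A possible shape of this bottom-up query processing is sketched below, reusing the hypothetical PeerCapacity type from the previous sketch. The class and method names are assumptions, and the local recursion merely stands in for forwarding the query message to the parent Coordinator via route(msg, key) in the deployed system.

```java
// Sketch (assumed types): resolve "give me M peers fulfilling the requirements"
// by collecting matches at the current Coordinator and climbing the tree until
// enough peers are found or the root is reached.
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class CapacityQuery {
    final int wantedPeers;                        // "give me M peers ..."
    final Predicate<PeerCapacity> requirements;   // e.g. bandwidth and storage thresholds
    final List<PeerCapacity> matches = new ArrayList<>();
    final String requesterId;                     // where the final result is sent

    CapacityQuery(int wantedPeers, Predicate<PeerCapacity> requirements, String requesterId) {
        this.wantedPeers = wantedPeers;
        this.requirements = requirements;
        this.requesterId = requesterId;
    }
}

interface SkyEyeTreeNode {
    List<PeerCapacity> knownCapacities();   // capacity entries reported by this node's domain
    SkyEyeTreeNode parentCoordinator();     // null at the root of the SkyEye tree
    void replyTo(String requesterId, List<PeerCapacity> result);
}

final class QueryProcessor {
    /** Collect local matches; if not enough, hand the query one level up. */
    static void process(SkyEyeTreeNode node, CapacityQuery q) {
        for (PeerCapacity p : node.knownCapacities()) {
            if (q.matches.size() >= q.wantedPeers) break;
            if (q.requirements.test(p)) q.matches.add(p);
        }
        if (q.matches.size() >= q.wantedPeers || node.parentCoordinator() == null) {
            node.replyTo(q.requesterId, q.matches);  // answer complete, or best effort at the root
        } else {
            process(node.parentCoordinator(), q);    // in reality: a message hop, O(log N) levels
        }
    }
}
```

The trade-off from the slide is visible here: starting the query low in the tree keeps load off the upper Coordinators, at the price of possibly needing more upward hops.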
  • 17. Very Simple Tree Maintenance
    • Join, Leave, Repair, Maintenance: Not needed
      • Join: Just send an update to Coordinator
      • After failure of a "simple peer": its peer-specific information times out
      • After failure of a Coordinator / Support Peer:
      • route(msg, key) identifies the new Coordinator
        • Coordinator picks a new Support Peer
        • New responsible peer is updated in the next update interval
      • No additional maintenance needed (done by structured overlay)
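The "maintenance-free" claim can be illustrated with the update loop itself: since the Coordinator key is recomputed deterministically and route(msg, key) always delivers to the peer currently responsible for that key, a failed Coordinator or Support Peer is replaced implicitly within one update interval. The sketch below reuses the hypothetical CoordinatorLookup.DhtNode interface from the earlier sketch; the scheduler and the empty buildUpdate() are placeholders, not part of the paper.

```java
// Sketch: periodic updates double as the repair mechanism, because the DHT
// resolves the (unchanged) Coordinator key to whichever peer is responsible now.
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public final class SkyEyeUpdater {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    /** Start periodic updates; the interval also bounds how stale information can become. */
    public void start(CoordinatorLookup.DhtNode node, long updateIntervalSeconds) {
        scheduler.scheduleAtFixedRate(() -> {
            byte[] update = buildUpdate();                        // own capacity + aggregated statistics
            double key = CoordinatorLookup.coordinatorKey(node);  // deterministic, nothing to repair
            node.route(key, update);                              // delivered to the currently responsible peer
        }, 0, updateIntervalSeconds, TimeUnit.SECONDS);
    }

    private byte[] buildUpdate() {
        // Placeholder: serialize the peer's capacity information and, if it acts
        // as Coordinator, the aggregated statistics of its domain.
        return new byte[0];
    }
}
```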
  • 18. Outline
    • Motivation
      • Information Management in P2P Systems
      • Example
    • SkyEye.KOM
      • Requirements
      • Architecture
    • Evaluation
      • Performance
      • Costs
    • Conclusion
  • 19. Evaluation of SkyEye: Scalability and Information Freshness
    • Scalability
      • The tree structure of Coordinators forms the information architecture
      • Support Peers: strong peers take the load
      • Query performance: O(log N) hops
    • Freshness
      • Freshness tightly related to tree depth
      • Information freshness: O(log N * updateInterval)
    • Conclusion:
      • Information is fresh even under churn
        • Failing peers are quickly replaced
        • It takes one update interval to recover
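The freshness bound can be made concrete with a rough back-of-the-envelope calculation. Assumptions: a binary domain split (as in the ID-space halving) and a fixed update interval T; the concrete numbers below are purely illustrative and not taken from the evaluation.

```latex
% With a binary split, the Coordinator tree has depth roughly log2(N),
% and an update needs one interval per level to reach the root:
\[
  \text{depth} \approx \log_2 N,
  \qquad
  \text{age of root-level statistics} \;\le\; \log_2 N \cdot T .
\]
% Illustrative example: N = 1000 peers, T = 60 s
% => depth ~ log2(1000) ~ 10, so root statistics are at most ~10 min old,
%    i.e. O(log N * updateInterval) as stated above.
```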
  • 20. Evaluation of SkyEye: Quality of Information
    • Completeness of Knowledge
      • Peers may define maximum load
        • Some information may be dropped
      • Completeness of knowledge: the ratio of considered peers in the corresponding domain
      • All >90%, some decline at level 9
    • Quality of Information:
      • Peers in the results: >98% are online
      • Less useful peer information is dropped
    • Load on the Peers
      • Constant update load (1 or 3 messages per interval)
      • Either just a leaf, or a Coordinator as well
      • Most of the peers are at level 8-9
      • Information density is highest at level 8-9
        • Some information is dropped there due to load limit of peers
  • 21. Evaluation of SkyEye: Query Performance
    • Position of query resolving:
      • Remember: most peers in level 8-9
      • Query takes 2-4 hops to be resolved
      • Most of the queries resolved in level 4-5
      • Leaves room for optimization:
        • Injecting queries lower in the tree
        • Adapting tree height to query load
    • Impact of query complexity
      • Query complexity ranging from 5 to 15
      • Small impact on position of resolving peer
      • Leaves room for optimization:
        • Resolving easier queries at less loaded peers
  • 22. Outline
    • Motivation
      • Information Management in P2P Systems
      • Example
    • SkyEye.KOM
      • Requirements
      • Architecture
    • Evaluation
      • Performance
      • Costs
    • Conclusion
  • 23. Conclusion and Future Work on SkyEye.KOM
    • Some functional requirements and features addressed:
      • Gathering system statistics and capacity-based peer search solved
      • Overlay independence
        • Unified ID space
        • Usage of DHT-function route(msg, key), lookup(key)
      • Supporting peer heterogeneity
        • Load dispatching to Support Peers
      • Low overhead
        • Relying on the robust underlying DHT, no extra maintenance
    • Room for optimization
      • Load-balancing
        • Several Support Peers per Coordinator: synchronization issues
      • Scalability (# of info and peers)
        • Limitations on the information size
      • Adapting to usage patterns
        • Injecting queries and updates at various levels of the tree
  • 24. Questions?