Kalman Graffi - Disputation Talk - Monitoring and Management of P2P Systems - 2010


This is the disputation talk of Dr.-Ing. Kalman Graffi - Monitoring and Management of P2P Systems

Published in: Technology
  • Create a new overlay/mechanism for every case? → Reuse existing overlays/mechanisms.
  • “The well-defined and controllable behavior of a system with respect to quantitative parameters.” Does the system meet the expectations? How can we know? Download counter? Application GUI?
  • FIGURE: capacity reservation. Challenges: Use cases: various applications, large-scale → general. Scale: up to millions of peers → decentralized. Environment: peer heterogeneity → fair, efficient. Dynamism, unreliability: user behavior, churn → self-organizing.
  • Tree topology characteristics: all peers are part of the spanning tree; peer positions are calculated; churn-resistant protocols based on IDs.
  • | | November 19, 2007
  • Model: behavior synthesis. Simulation: large scale, detailed. Testbed: accurate prototype.
  • MODEL! log_β(N) · UI = ???
  • Model for the costs? Why 3 KB/s?
  • Common thread: now the transition to our solution, an additional layer on top of structured P2P overlays (which can be addressed uniformly through the Common API). The query form offered by our architecture is presented. Possibly unclear details: the Common API, from the paper by F. Dabek, B. Zhao, P. Druschel, and I. Stoica on unifying the services of DHTs. Important here: Route(ID, Msg), which routes a message (Msg) to the peer responsible for an ID (ID).
  • Both figures at the top show the number of solved queries in relation to the level of the initiator and the level of the solver. For the initiators, the number depends on how many peers are located at that level (many peers at a level result in many queries originating at that level). This does not hold for the resolution of queries: 87.3% of the solved queries on the centralized DHT overlay (80.6% on the Chord overlay) are solved by 1.3% of the 5001 nodes (the peers between the root and the 5th level). Some queries are nevertheless solved below this region. To characterize the complexity of solved queries in relation to the level, we show the figures at the bottom. In the simulations behind the bottom figures, every query uses the same conditions but a varying number of requested peers, ranging from 10 to 200 in steps of 10. The figures show the average complexity of the queries solved at each level. One can conclude that the average complexity of solved queries is inversely proportional to the depth of a peer.

    1. 1. Monitoring and Management of Peer-to-Peer Systems
    2. 2. Agenda <ul><li>Introduction </li></ul><ul><li>Overview on my System Management Approach </li></ul><ul><li>SkyEye.KOM: System Monitoring </li></ul><ul><li>SkyNet.KOM: System Management Cycle </li></ul><ul><li>Conclusions </li></ul>
    3. 3. Communication Paradigms in the Internet <ul><li>Internet application trends </li></ul><ul><ul><li>Large-scale applications </li></ul></ul><ul><ul><li>User-generated content </li></ul></ul><ul><ul><li>User-to-user communication </li></ul></ul><ul><li>Example </li></ul><ul><ul><li>Online social networks </li></ul></ul><ul><ul><li>Access to user profiles and pictures </li></ul></ul><ul><li>Client / server paradigm </li></ul><ul><ul><li>One central service provider </li></ul></ul><ul><ul><li>Good: controlled quality </li></ul></ul><ul><ul><li>Bad: costs </li></ul></ul><ul><li>Peer-to-peer (p2p) paradigm </li></ul><ul><ul><li>Infrastructure hosted by peers </li></ul></ul><ul><ul><li>Good: shared costs </li></ul></ul><ul><ul><li>Bad: quality? </li></ul></ul>
    4. 4. Challenges for Quality of P2P Systems <ul><li>Distributed character </li></ul><ul><ul><li>Distributed solutions and overlays needed </li></ul></ul><ul><li>Undefined scale </li></ul><ul><ul><li>From tens to several millions </li></ul></ul><ul><li>Peer fluctuation (churn) </li></ul><ul><ul><li>Peers join / leave the system autonomously </li></ul></ul><ul><li>Peer heterogeneity </li></ul><ul><ul><li>Varying capacities and connectivity </li></ul></ul><ul><li>Static configurations are insufficient </li></ul><ul><ul><li>Requirements known when system is deployed </li></ul></ul><ul><ul><li>Updates are difficult </li></ul></ul>
    5. 5. Approaches to Address Quality Challenges <ul><li>Building a quality-aware p2p application </li></ul><ul><ul><li>Identify: application scope, scenario, expected dynamics </li></ul></ul><ul><ul><li>Build software </li></ul></ul><ul><ul><ul><li>relying on peer capacities </li></ul></ul></ul><ul><ul><ul><li>to meet (quality) expectations </li></ul></ul></ul><ul><li>1. Simple access to peer capacities in network </li></ul><ul><ul><li>Reliable usage of the capacities </li></ul></ul><ul><ul><li>Support for mechanism design </li></ul></ul><ul><li>2. Observe and control system quality </li></ul><ul><ul><li>Maintain high service quality </li></ul></ul><ul><ul><li>Interest of users and application provider </li></ul></ul>
    6. 6. My Goal and Contributions <ul><li>Goal: General mechanisms for monitoring the system and peer information in p2p systems and managing the quality of the p2p system and capacity reservations </li></ul><ul><li>My contributions </li></ul><ul><ul><li>SkyEye.KOM: Monitoring </li></ul></ul><ul><ul><ul><li>Statistics on system behavior </li></ul></ul></ul><ul><ul><ul><li>Capacity-based peer search </li></ul></ul></ul><ul><ul><li>SkyNet.KOM: Autonomic management cycle </li></ul></ul><ul><ul><li>P3R3O.KOM: Reliable capacity reservation </li></ul></ul><ul><ul><li>LifeSocial.KOM: Platform for online social networks </li></ul></ul>
    7. 7. Agenda <ul><li>Introduction </li></ul><ul><li>Overview on my System Management Approach </li></ul><ul><li>SkyEye.KOM: System Monitoring </li></ul><ul><li>SkyNet.KOM: System Management Cycle </li></ul><ul><li>Conclusions </li></ul>
    8. 8. System Management Overview <ul><li>Goal </li></ul><ul><ul><li>System automatically adapts to meet predefined target metric intervals </li></ul></ul><ul><li>Traditional approach </li></ul><ul><ul><li>Build mechanism / overlay for </li></ul></ul><ul><ul><ul><li>Specific scenario </li></ul></ul></ul><ul><ul><ul><li>Expected dynamics </li></ul></ul></ul><ul><ul><li>Limited reusability </li></ul></ul><ul><li>Approach </li></ul><ul><ul><li>Research on viable concepts </li></ul></ul><ul><ul><li> SkyNet.KOM : Automated self-configuration framework </li></ul></ul><ul><li>Challenges </li></ul><ul><ul><li>Precise monitoring of system behavior </li></ul></ul><ul><ul><li>Quick information transport </li></ul></ul><ul><ul><li>Closing the management cycle </li></ul></ul>[Figure: metric and metric goal]
    9. 9. Running Example: Improving the Lookup Delay <ul><li>P2P overlay functions </li></ul><ul><ul><li>Overlay provides ID-based routing </li></ul></ul><ul><ul><li>Peers are responsible for ID ranges </li></ul></ul><ul><ul><li>Used for </li></ul></ul><ul><ul><ul><li>ID-based document storage </li></ul></ul></ul><ul><ul><ul><li>ID-based role assignment </li></ul></ul></ul><ul><li>Classic p2p overlay “Chord” </li></ul><ul><ul><li>Low routing delay desired </li></ul></ul><ul><ul><li>Routing delay influenced by hop count </li></ul></ul><ul><ul><li> Target metric for hop count: [7, 10] </li></ul></ul><ul><ul><li>Influencing parameter: finger table size </li></ul></ul><ul><li> Approach for adapting hop count to [7, 10] </li></ul><ul><ul><li>Monitor metric status </li></ul></ul><ul><ul><li>Analyze deviation from target metric </li></ul></ul><ul><ul><li>Re-configure system if needed </li></ul></ul>[Figure: hop count routing on a Chord ring with peers 611, 709, 1008, 1622, 2011, 2207, 2906, 3485; hops labeled 100 ms]
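To make the ID-based routing of the running example concrete, here is a minimal Python sketch of how a key could map to its responsible peer on a Chord-style ring. It assumes the usual Chord successor rule (a key belongs to the first peer whose ID is greater than or equal to the key, wrapping around at the end of the ring). The peer IDs are the ones shown in the slide's ring figure; the function name `responsible_peer` is illustrative and not taken from the talk.

```python
from bisect import bisect_left

# Peer IDs as shown on the slide's ring figure.
PEER_IDS = sorted([611, 709, 1008, 1622, 2011, 2207, 2906, 3485])


def responsible_peer(key: int, peer_ids: list[int] = PEER_IDS) -> int:
    """Return the ID of the peer responsible for `key`.

    Assumption: successor rule as in Chord, i.e. the first peer ID >= key,
    wrapping to the smallest ID when the key lies beyond the largest peer ID.
    `peer_ids` must be sorted in ascending order.
    """
    index = bisect_left(peer_ids, key)
    return peer_ids[index % len(peer_ids)]
```

A lookup walks the ring toward this peer; the number of intermediate peers it visits is the hop count that the management cycle later tries to keep within [7, 10].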
    10. 10. Agenda <ul><li>Introduction </li></ul><ul><li>Overview on my System Management Approach </li></ul><ul><li>SkyEye.KOM: System Monitoring </li></ul><ul><li>SkyNet.KOM: System Management Cycle </li></ul><ul><li>Conclusions </li></ul>
    11. 11. Monitoring Solution: SkyEye.KOM <ul><li>SkyEye.KOM: My monitoring solution for p2p systems </li></ul><ul><li>Functional goal </li></ul><ul><ul><li>Monitoring and provision of system status </li></ul></ul><ul><ul><li>Provide capacity-based peer search </li></ul></ul><ul><li>Design decision space </li></ul><ul><ul><li>Engineering aim: usability, precision, low costs </li></ul></ul><ul><ul><li>Detailed research on alternatives </li></ul></ul><ul><li>Next: 1. Tree establishment 2. Communication protocol </li></ul>Design decisions, given as SkyEye.KOM choice (alternative; related work realizing the alternative): Paradigm: distributed (centralized; client/server approach, SNMP). New layer (integrated; Astrolabe, Willow, DASIS). Overlay type: structured (unstructured; T-Man, Push-Sum). Topology: tree (mesh, ring, star, bus; node counting in Chord). Tree count: one (several; SDIMS). Monitoring scope: complete view (sampling; roaming agents, Hall et al.). Protocol: push-based (pull-based; SOMO). Update pattern: proactive (reactive; CONE, Li et al.).
    12. 12. SkyEye.KOM: Tree Topology <ul><li>Concept of new layer </li></ul><ul><ul><li>Decouples from specific p2p overlay </li></ul></ul><ul><ul><li>Unified ID space [0,1] </li></ul></ul><ul><li>Tree of information domains </li></ul><ul><ul><li>Domain: ID range for monitoring </li></ul></ul><ul><ul><li>Domain size split in β parts per level </li></ul></ul><ul><ul><li>Domain IDs build tree topology </li></ul></ul><ul><li>Peers to Domain ID assignment </li></ul><ul><ul><li>Peer ID p, level l → Domain IDs </li></ul></ul><ul><ul><li>If peer is responsible: position defined </li></ul></ul>[Figure: tree of domain IDs (0.5; 0.25, 0.75; 0.125, 0.375, 0.625, 0.875; 0.3125) over the unified ID space [0,1], mapped onto the peers of the P2P overlay (peer IDs 0.09, 0.2, 0.31, 0.4, 0.5, 0.6, 0.75, 0.9)]
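The mapping from a peer ID and a tree level to a domain ID can be sketched as follows. This is a minimal Python sketch under assumptions that the slide does not state explicitly: level 0 is the root domain covering the whole ID space [0,1], every level splits each domain into β equal parts, and a domain's ID is the midpoint of its ID range (which is what the values in the figure suggest). The function names are hypothetical.

```python
def domain_id(peer_id: float, level: int, beta: int = 2) -> float:
    """Return the ID of the domain that contains `peer_id` at tree `level`.

    Assumptions (not stated on the slide): level 0 is the root domain
    covering [0, 1], each level splits every domain into `beta` equal
    parts, and a domain's ID is the midpoint of its ID range.
    """
    parts = beta ** level                          # number of domains at this level
    index = min(int(peer_id * parts), parts - 1)   # clamp so peer_id == 1.0 stays in range
    return (index + 0.5) / parts


def domain_path(peer_id: float, depth: int, beta: int = 2) -> list[float]:
    """Domain IDs from the root (level 0) down to level `depth`."""
    return [domain_id(peer_id, level, beta) for level in range(depth + 1)]
```

For peer ID 0.31 and β = 2, `domain_path(0.31, 3)` yields 0.5, 0.25, 0.375, 0.3125, which matches the chain of domain IDs drawn in the figure. Which peer acts as Coordinator for a domain is then decided by the overlay: the peer responsible for the domain ID takes that position.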
    13. 13. SkyEye.KOM: Communication Protocol <ul><li>Gathering global view </li></ul><ul><ul><li>Information type: Lookup time, tree and network structure, overhead (types), … </li></ul></ul><ul><ul><li>Every peer measures local status </li></ul></ul><ul><ul><li>Periodically sent to parent peer </li></ul></ul><ul><ul><ul><li> Update Interval (UI) </li></ul></ul></ul><ul><li>Aggregation </li></ul><ul><ul><li>Direct: count, sum, minimum, maximum, sum of squares </li></ul></ul><ul><ul><li>Derived: mean, variance, std. deviation </li></ul></ul><ul><li>Dissemination of global view </li></ul><ul><ul><li>Global view in root </li></ul></ul><ul><ul><li>Every update message is acknowledged </li></ul></ul><ul><ul><li>Contains global view from level above </li></ul></ul><ul><li>Metric update protocol: </li></ul><ul><li>Gather and disseminate time: O(log_β(N) · UI) </li></ul>[Figure: local measures (synchronized signal in simulations) are aggregated up the tree into an aggregated view; the global view is sent back down]
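The split into directly aggregated and derived values can be illustrated with a small mergeable record. This is a sketch in Python, not the SkyEye.KOM implementation: the class name `Aggregate` and its methods are hypothetical, and the variance shown is the population variance computed from the count, sum, and sum of squares.

```python
import math
from dataclasses import dataclass


@dataclass
class Aggregate:
    """Directly aggregated values for one metric (the 'direct' list)."""
    count: int = 0
    total: float = 0.0
    minimum: float = math.inf
    maximum: float = -math.inf
    sum_of_squares: float = 0.0

    def add(self, value: float) -> None:
        """Fold one local measurement into the aggregate."""
        self.count += 1
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)
        self.sum_of_squares += value * value

    def merge(self, other: "Aggregate") -> "Aggregate":
        """Combine two aggregates, e.g. a parent merging a child's update."""
        return Aggregate(
            self.count + other.count,
            self.total + other.total,
            min(self.minimum, other.minimum),
            max(self.maximum, other.maximum),
            self.sum_of_squares + other.sum_of_squares,
        )

    # Derived values, computed only from the direct ones (count must be > 0).
    def mean(self) -> float:
        return self.total / self.count

    def variance(self) -> float:
        # Population variance E[x^2] - (E[x])^2, clamped against rounding error.
        return max(self.sum_of_squares / self.count - self.mean() ** 2, 0.0)

    def std_deviation(self) -> float:
        return math.sqrt(self.variance())
```

Because every direct value merges with a constant-size operation, an update message stays the same size at every tree level, while mean, variance, and standard deviation can still be derived at the root.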
    14. 14. Evaluation of SkyEye.KOM <ul><li>Evaluation goal </li></ul><ul><ul><li>Understanding SkyEye.KOM’s behavior </li></ul></ul><ul><ul><li>Measuring performance and costs </li></ul></ul><ul><li>Evaluation methodology </li></ul><ul><ul><li>Analytical model: </li></ul></ul><ul><ul><ul><li>Parameter study </li></ul></ul></ul><ul><ul><ul><li>Tree characteristics </li></ul></ul></ul><ul><ul><li>Simulation: </li></ul></ul><ul><ul><ul><li>Parameter study </li></ul></ul></ul><ul><ul><ul><li>Detailed single run </li></ul></ul></ul><ul><ul><li>Testbed evaluation: </li></ul></ul><ul><ul><ul><li>Churn behavior </li></ul></ul></ul><ul><ul><ul><li>Accurate costs </li></ul></ul></ul><ul><li>Simulator – PeerfactSim.KOM </li></ul><ul><ul><li>Delay: global network positioning model </li></ul></ul><ul><ul><li>Churn: based on KAD, exponential churn </li></ul></ul><ul><li>Simulation Setup </li></ul><ul><ul><li>N=10,000, exponential and KAD churn </li></ul></ul>
    15. 15. Simulation Run – N=10000, UI=60s, β=4 <ul><li>Precision: synchronized signal </li></ul><ul><ul><li>Sine, periodicity: 1800s </li></ul></ul><ul><ul><li>Shows monitoring error and freshness </li></ul></ul><ul><li>Results </li></ul><ul><ul><li>Precise measurement </li></ul></ul><ul><ul><li>Age of information: ca. 200s </li></ul></ul><ul><li>Costs: traffic overhead </li></ul><ul><ul><li>Shows average load on peers </li></ul></ul><ul><ul><li>Load is well balanced </li></ul></ul><ul><li>Results </li></ul><ul><ul><li>Traffic overhead: 100–140 bytes/s </li></ul></ul><ul><ul><li>Calculation of average is churn-resistant </li></ul></ul>
    16. 16. Testbed Evaluation: Validation of Results <ul><li>Results </li></ul><ul><ul><li>Previous good results validated </li></ul></ul><ul><ul><li>Even the failure of 50% of the peers is not critical </li></ul></ul><ul><ul><li>Short outliers due to joining peers </li></ul></ul><ul><ul><li>3kb/s bandwidth consumption (UI=5s) </li></ul></ul><ul><ul><li>Average calculation not affected by churn </li></ul></ul><ul><li>Testbed evaluation </li></ul><ul><ul><li>Testing behavior under strong churn </li></ul></ul><ul><ul><li>Metrics: peer count and traffic overhead </li></ul></ul><ul><li>Setup: </li></ul><ul><ul><li>N = 500 (on 37 PCs), UI = 5 s, β = 2 </li></ul></ul><ul><ul><li>10%, 20%, 50% and random churn </li></ul></ul>[Figure: results for 10%, 20%, 50%, and random churn]
    17. 17. Agenda <ul><li>Introduction </li></ul><ul><li>Overview on my System Management Approach </li></ul><ul><li>SkyEye.KOM: System Monitoring </li></ul><ul><li>SkyNet.KOM: System Management Cycle </li></ul><ul><li>Conclusions </li></ul>
    18. 18. Planning a new Configuration <ul><li>Analysis and planning step </li></ul><ul><ul><li>Detects metric violation </li></ul></ul><ul><ul><li>Step-wise adaptation of configuration </li></ul></ul><ul><ul><li>Approaching target metric interval </li></ul></ul><ul><li>Changes need time </li></ul><ul><ul><li>Analyze slope of metric history </li></ul></ul><ul><ul><li>Act only if changes settled </li></ul></ul><ul><ul><li>Prevent configuration oscillation </li></ul></ul>[Figure: planning step relating the current metric and the metric goal to new parameters]
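One way to read "analyze slope of metric history, act only if changes settled" is sketched below in Python. The least-squares slope and the threshold `max_slope` are assumptions for illustration; the talk does not specify how the slope is computed or which threshold is used, and the function names are hypothetical.

```python
def slope(history: list[float]) -> float:
    """Least-squares slope of the metric over equally spaced samples."""
    n = len(history)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(history) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(history))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


def should_act(history: list[float], low: float, high: float,
               max_slope: float = 0.05) -> bool:
    """Act only if the metric has settled and violates [low, high].

    `max_slope` is a hypothetical settling threshold (metric units per
    update interval); the talk gives no concrete value.
    """
    if not history:
        return False
    settled = abs(slope(history)) <= max_slope
    violated = not (low <= history[-1] <= high)
    return settled and violated
```

Waiting for a flat history before planning the next step is what keeps the controller from reacting to its own previous change and oscillating between configurations.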
    19. 19. Re-Configuration of the System <ul><li>Management protocol </li></ul><ul><ul><li>Gather current system parameters </li></ul></ul><ul><ul><li>Distribute new parameter configuration </li></ul></ul><ul><li>Approach </li></ul><ul><ul><li>Extension of SkyEye.KOM </li></ul></ul><ul><ul><ul><li>Gather statistics on parameter settings </li></ul></ul></ul><ul><ul><li>Root: detects violation, plans new configuration </li></ul></ul><ul><ul><li>New configuration: </li></ul></ul><ul><ul><ul><li>Spread with SkyEye.KOM in ACKs </li></ul></ul></ul><ul><ul><ul><li>Applied by peers locally </li></ul></ul></ul><ul><li>Planning step </li></ul><ul><ul><li>Various approaches possible </li></ul></ul><ul><ul><li>My goal: closing the management cycle </li></ul></ul>[Figure: metric updates carry local metric statistics and local parameter statistics up the tree; metric ACKs carry global metric statistics and new parameter settings back down]
    20. 20. Management of the Hop Count in Chord <ul><li>Managing the behavior of Chord </li></ul><ul><ul><li>Main metric: hop count </li></ul></ul><ul><ul><li> Target metric: [7, 10] </li></ul></ul><ul><ul><li>Parameter: finger table size </li></ul></ul><ul><li>Rules for violation of the goal metric </li></ul><ul><li>Evaluation goal </li></ul><ul><ul><li>Target metrics are reached and held </li></ul></ul><ul><ul><li>Feasibility of management cycle </li></ul></ul><ul><li>Simulation Setup </li></ul><ul><ul><li>N = 10000, UI = 30s, β = 4 </li></ul></ul>Static rule approach: hop count too large → finger table size +100%; hop count too small → finger table size −10%. [Figure: hop count routing on a Chord ring with peers 611, 709, 1008, 1622, 2011, 2207, 2906, 3485]
    21. 21. Evaluation Results on SkyNet.KOM <ul><ul><li>Target metric approximation from 2 sides </li></ul></ul><ul><ul><li>Corresponding effects on lookup delay </li></ul></ul><ul><li>Results </li></ul><ul><ul><li>Metric interval [7,10] is reached and held </li></ul></ul><ul><ul><li>Management cycle is closed </li></ul></ul>Start with hop count too large: finger table size 20 initially, 80 after 2 steps; hop count 100 initially, 8 after 2 steps. Start with hop count too small: finger table size 160 initially, 117 after 3 steps; hop count 5.7 initially, 7.1 after 3 steps.
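The static rule approach for the hop count (grow the finger table by 100% when the hop count is too large, shrink it by 10% when it is too small) can be written down in a few lines. The following Python sketch uses hypothetical function names and one deliberate simplification: `apply_steps` holds the hop count fixed, whereas the real management cycle would re-measure it after each change has settled.

```python
def next_finger_table_size(size: float, hop_count: float,
                           low: float = 7.0, high: float = 10.0) -> float:
    """One planning step of the static rule approach.

    Hop count above the target interval: grow the finger table by 100 %.
    Hop count below the target interval: shrink it by 10 %.
    Inside [low, high]: keep the current size.
    """
    if hop_count > high:
        return size * 2.0
    if hop_count < low:
        return size * 0.9
    return size


def apply_steps(size: float, hop_count: float, steps: int) -> float:
    """Apply the same rule `steps` times.

    Simplification: the hop count is held fixed here, whereas the real
    cycle would re-measure it after every change has settled.
    """
    for _ in range(steps):
        size = next_finger_table_size(size, hop_count)
    return size
```

Under these assumptions, two growth steps take a finger table of 20 entries to 80, and three shrink steps take 160 entries to about 117 (160 · 0.9³ ≈ 116.6), which agrees with the finger table sizes reported on this slide.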
    22. 22. Agenda <ul><li>Introduction </li></ul><ul><li>Overview on my System Management Approach </li></ul><ul><li>SkyEye.KOM: System Monitoring </li></ul><ul><li>SkyNet.KOM: System Management Cycle </li></ul><ul><li>Conclusions </li></ul>
    23. 23. Summary <ul><li>System management: SkyNet.KOM </li></ul><ul><ul><li>Reach and hold preset metric intervals </li></ul></ul><ul><ul><li>Through monitoring and self-configuration </li></ul></ul><ul><li>System monitoring: SkyEye.KOM </li></ul><ul><ul><li>Global view on statistics of running system </li></ul></ul><ul><ul><li>Precise yet cost effective monitoring </li></ul></ul><ul><li>Resource management </li></ul><ul><ul><li>Convenient access to peer capacities </li></ul></ul><ul><ul><li>Reliable capacity reservation </li></ul></ul><ul><li>Application scenario </li></ul><ul><ul><li>P2P-based online social network </li></ul></ul><ul><ul><li>Demonstration of feasibility </li></ul></ul>
    24. 24. Implications <ul><li>Usage of p2p overlays “off the shelf” </li></ul><ul><ul><li>For various scenarios / environments </li></ul></ul><ul><li>Generalization </li></ul><ul><ul><li>Applicable to any metrics and parameters </li></ul></ul><ul><li>Research on </li></ul><ul><ul><li>Identification and modeling of interdependencies </li></ul></ul><ul><li>Monitoring and management </li></ul><ul><ul><li>Towards p2p as reliable IT architecture </li></ul></ul>
    25. 25. Thank You for Your Attention!
    26. 27. My Goal and Contribution <ul><li>Goal: Monitor and manage the quality of service of the p2p systems </li></ul>Investigation on quality control P3R3O.KOM LifeSocial.KOM – a P2P-based Secure Online Social Network Example Application P2P platform for app-based application composition with monitored QoS Analytical model, simulations Analytical model, simulations, testbed Evaluation Distributed redundancy control for guaranteed resource provisioning Autonomic self-configuration cycle to reach and hold preset quality goals SkyNet.KOM Management … of peer-specific metrics, → capacity-based peer search … of system-specific metrics, → global view on system status SkyEye.KOM Monitoring Reliable resource reservation Controlled system metrics Monitoring and management of p2p systems Goal
    27. 28. P2P Overlays: Structured and Unstructured Peer ID space Object ID space Structured P2P Overlay Unstructured P2P Overlay q q L L q q L L
    28. 29. Gathering Peer-specific Information <ul><li>Type of information </li></ul><ul><ul><li>Individual Peer ID and peer-specific information: </li></ul></ul><ul><ul><ul><li>Free storage space, CPU power, bandwidth capabilities, online time, … </li></ul></ul></ul><ul><ul><ul><li>Responsibility range, node degree, Coordinator ID, … </li></ul></ul></ul><ul><li>Desired query </li></ul><ul><ul><li>Capacity-based peer search: </li></ul></ul><ul><ul><li>Find N peers with e.g. node degree > 20, free storage space > 10MB, online time > 10h </li></ul></ul><ul><li>Design decision: proactive </li></ul><ul><ul><li>Constantly gathering peer information in the tree </li></ul></ul><ul><ul><li>Query directly accesses prepared data </li></ul></ul><ul><ul><li>Better for scenarios with frequent queries </li></ul></ul><ul><li>Challenge: </li></ul><ul><ul><li>Information cannot be aggregated → grows in size </li></ul></ul><ul><ul><li>Costs may overload the Coordinators </li></ul></ul><ul><li>Solution idea: replace weak peers in tree with strong Support Peers </li></ul>
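A capacity-based peer search of the kind described above ("find N peers with node degree > 20, free storage space > 10MB, online time > 10h") boils down to filtering the peer records a Coordinator has already gathered. The Python sketch below shows that filtering step only; the record layout, attribute names, and the function `find_peers` are illustrative assumptions, not the SkyEye.KOM query interface.

```python
from typing import Callable

# A peer record is a plain mapping from attribute name to value.
PeerRecord = dict[str, float]


def find_peers(records: list[PeerRecord], wanted: int,
               conditions: dict[str, Callable[[float], bool]]) -> list[PeerRecord]:
    """Return up to `wanted` records that satisfy every condition.

    A record missing one of the queried attributes does not match.
    """
    matches: list[PeerRecord] = []
    for record in records:
        if len(matches) >= wanted:
            break
        if all(name in record and check(record[name])
               for name, check in conditions.items()):
            matches.append(record)
    return matches
```

Because the information is gathered proactively, such a query only reads data that is already present at the Coordinator; if fewer than the wanted number of peers match locally, the query would have to be passed on within the tree.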
    29. 30. Gathering Peer-specific Information <ul><li>Supporting Peers for Load Balancing </li></ul><ul><ul><li>Each peer defines max. load </li></ul></ul><ul><ul><li>Coordinator may choose strong Supporting Peers </li></ul></ul><ul><ul><li>Workload delegated to supporting peer </li></ul></ul><ul><li>Good peers chosen by 50/50 ratio </li></ul><ul><ul><li>Pick e.g. 10 best peers in the domain </li></ul></ul><ul><ul><li>Best 5 peers advertised one level up </li></ul></ul><ul><ul><li>Second best 5 peers used </li></ul></ul><ul><li>Results </li></ul><ul><ul><li>In a tree with strong peers </li></ul></ul><ul><ul><li>Best peers at the top, carrying most of the load </li></ul></ul><ul><ul><li>No peer is overloaded </li></ul></ul>[Figure: tree of Coordinators and Support Peers (SP) over the unified ID space; the best peers are passed up the tree, the second-best peers serve as SPs]
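The 50/50 rule for choosing good peers can be sketched as a simple ranking and split. In this Python sketch the capacity score, the tuple layout, and the function name are assumptions; the slide only states that the best half of the picked peers is advertised one level up and the second-best half is used locally. Rounding the upper half up for odd counts is also an assumption.

```python
def split_best_peers(peers: list[tuple[str, float]],
                     pick: int = 10) -> tuple[list[str], list[str]]:
    """Split the `pick` strongest peers of a domain by the 50/50 ratio.

    `peers` holds (peer name, capacity score) pairs; how the score is
    computed is left open here.
    Returns (advertised one level up, kept as local Supporting Peers).
    """
    ranked = sorted(peers, key=lambda item: item[1], reverse=True)[:pick]
    half = (len(ranked) + 1) // 2       # upper half, rounded up for odd counts
    advertised = [name for name, _ in ranked[:half]]
    kept = [name for name, _ in ranked[half:]]
    return advertised, kept
```

Applied at every level, this split lets the strongest peers move toward the root, where the load is highest, while each domain still keeps capable Supporting Peers for itself.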
    30. 31. Gathering Peer-specific Information: Protocol <ul><li>Update information: </li></ul><ul><li>Peer 11, RAM = 700MB, Online = 12h </li></ul><ul><li>… </li></ul><ul><li>Query format: </li></ul><ul><li>5_of_ RAM_>_1024_Int,CPU_>2048_Int </li></ul>[Figure: example tree with a load threshold per peer (15MB to 200MB) and Support Peers (SP) reached via the address of the Support Peer; a query travels through the tree and collects Match 1 to Match 5]
    31. 32. SkyEye.KOM: Tree Growth and Depth <ul><li>Logarithmic Tree Depth </li></ul><ul><li>Example tree </li></ul><ul><ul><li>Tree degree (TD) = 2 </li></ul></ul><ul><ul><li>Balanced, if ID space balanced </li></ul></ul><ul><ul><li>Peers may be Coordinators at various levels → not always 2 children </li></ul></ul>
    32. 33. SkyEye.KOM: Peer-specific Information <ul><li>Query – originators and solvers </li></ul><ul><ul><li>Scenario with 5000 peers </li></ul></ul><ul><ul><li>Most peers around level 10 </li></ul></ul><ul><ul><li>Most queries solved between root and peers at level 5 </li></ul></ul><ul><li>Effect of query complexity </li></ul><ul><ul><li>Queries demanding better resources are solved higher in the tree </li></ul></ul><ul><ul><li>“Good” peers bubble up in the tree </li></ul></ul>
    33. 34. Example: Distributed Cloud <ul><li>Demand/provision of: storage space, comp. power, online time, … </li></ul><ul><li>Idea: Exchange resources with “SLA”s in a decentralized manner </li></ul><ul><li>Parallels to SOA intermediary for services: </li></ul><ul><ul><li>Distributed component “buys” and “resells” services </li></ul></ul>See: K. Graffi, AsKo, NL, RST, “From Cells to Organisms: Long-Term Guarantees on Service Provisioning in P2P Networks”, in: 8th ACM SIGAPP International Conference on New Technologies of Distributed Systems (NOTERE '08), June 2008. [Figure: information architecture for a distributed cloud. Peers name prices and request services (e.g. comp. power, bandwidth, storage space). Steps: gather information about the system (who offers and consumes what? system state?); build the information architecture; calculate usage trends and future resource availability based on the gathered information.]
    34. 35. Inspiration: Monitoring Trees <ul><li>Topology </li></ul><ul><ul><li>Tree based information architecture </li></ul></ul><ul><ul><li>Tree degree and node count → height </li></ul></ul><ul><ul><ul><li>Here: tree degree = 4 </li></ul></ul></ul><ul><li>Information management </li></ul><ul><ul><li>Information aggregated at each level </li></ul></ul><ul><ul><li>In P2P monitoring: churn and equal roles </li></ul></ul>[Figure: tree with levels labeled Students, Coordinators, Root]
    35. 36. Monitoring of P2P Systems <ul><li>System view: </li></ul><ul><ul><li>Global statistics on set of metrics </li></ul></ul><ul><ul><li>Avg., std. dev., sum, min., max. </li></ul></ul><ul><ul><li>Long list of aggregatable metrics </li></ul></ul><ul><li>Example: </li></ul><ul><ul><li>Global statistics on bandwidth usage </li></ul></ul><ul><li>Use: </li></ul><ul><ul><li>Observe quality (performance, costs) of running p2p system </li></ul></ul><ul><li>Peer view: </li></ul><ul><ul><li>Overview on the peers’ capacities </li></ul></ul><ul><ul><li>Peer specific capacity information </li></ul></ul><ul><ul><li>Allows capacity-based peer search </li></ul></ul><ul><li>Example: </li></ul><ul><ul><li>Bandwidth capacities of a single peer </li></ul></ul><ul><li>Use: </li></ul><ul><ul><li>Mechanisms find desired capacities </li></ul></ul>[Figure: example query for 2 peers with online time < 60m; result: peers B, D]
    36. 37. SkyEye.KOM: Protocols
    37. 38. Interdependencies: Finger Table Size, Contacts, Hop Count, Lookup Delay <ul><li>Contacts ≈ log(FT_Size) </li></ul><ul><li>Hop Count !~ log_Contacts(N) </li></ul><ul><li>Approach: </li></ul><ul><ul><li>Modify FT_Size → effect on Contacts </li></ul></ul><ul><ul><li>Modify Contacts → effect on Hop Count </li></ul></ul>
    38. 39. Selected Results from the Testbed Evaluation
    39. 40. Comparison of Benchmark Signals
    40. 41. Modules in SkyEye.KOM
    41. 42. SkyEye.KOM – Variants <ul><li>Variant: synchronized updates </li></ul><ul><li>Idea: </li></ul><ul><ul><li>Loosely synchronize update messages </li></ul></ul><ul><ul><li>Push results once global view is created </li></ul></ul><ul><ul><li>Tradeoff: freshness vs. costs </li></ul></ul><ul><li>Approach: </li></ul><ul><ul><li>Push with ACKs: synchronization delay </li></ul></ul><ul><ul><li>Peers adapt their update offset </li></ul></ul><ul><ul><li>Updates processed in a row </li></ul></ul><ul><ul><li>Push global view with “special-ACK” </li></ul></ul><ul><li>Variant: smoothing of results </li></ul><ul><li>Idea: </li></ul><ul><ul><li>Eliminate monitoring outliers with history-based smoothing: median / exponential </li></ul></ul><ul><ul><li>Tradeoff: freshness vs. fewer outliers </li></ul></ul><ul><li>Approach: </li></ul><ul><ul><li>History of measurements: H = {m_1, …, m_h} </li></ul></ul><ul><ul><li>Final measure: s_H = smooth(H) </li></ul></ul><ul><ul><li>Median: s_H = m_{(|H|+1)/2} with sorted m_i </li></ul></ul><ul><ul><li>Exponential smoothing: s_H = α · m_h + (1 − α) · s_{H−1} </li></ul></ul>
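Both smoothing variants are short enough to write out. The Python sketch below follows the formulas on the slide; the function names, the default α = 0.5, and the handling of even-sized histories (taking the lower middle element) are assumptions, since the slide's median formula only covers odd |H|. The history passed to `median_smooth` must not be empty.

```python
def median_smooth(history: list[float]) -> float:
    """Median of the history: element (|H|+1)/2 of the sorted values.

    For an even-sized history this takes the lower middle element, an
    assumption, since the slide's formula only covers odd |H|.
    """
    ordered = sorted(history)
    position = (len(ordered) + 1) // 2      # 1-based index as on the slide
    return ordered[position - 1]


def exponential_smooth(previous: float, measurement: float,
                       alpha: float = 0.5) -> float:
    """s_H = alpha * m_h + (1 - alpha) * s_(H-1)."""
    return alpha * measurement + (1 - alpha) * previous
```

With a history such as 7, 7, 40, 7, 8 the median ignores the single outlier of 40 entirely, whereas exponential smoothing only dampens it; in both cases the smoothed value lags behind real changes, which is the freshness cost named in the tradeoff.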
    42. 43. Quality Properties of SkyEye.KOM <ul><li>Applicability </li></ul><ul><ul><li>Applicable on every (KBR-compatible) structured p2p overlay </li></ul></ul><ul><ul><li>Independent of any application </li></ul></ul><ul><ul><li>Unified ID space, using core DHT functions </li></ul></ul><ul><li>Robustness </li></ul><ul><ul><li>Handling churn, coping with link losses </li></ul></ul><ul><ul><li>→ If peer fails: automatically replaced in the DHT </li></ul></ul><ul><ul><li>→ Updates are routed to new peer for aggregation </li></ul></ul><ul><li>Scalability </li></ul><ul><ul><li>Scaling in terms of participating peers and exchanged information </li></ul></ul><ul><ul><li>Low costs per peer: bounded tree node degree (1 + β) </li></ul></ul><ul><ul><li>Costs independent of node’s tree position, ~1Kb/update </li></ul></ul><ul><li>Performance </li></ul><ul><ul><li>High precision, few outliers </li></ul></ul><ul><ul><li>Information age limited by tree depth, O(log_β(N)) </li></ul></ul><ul><ul><li>→ Influenced by update period </li></ul></ul><ul><li>Retrievability </li></ul><ul><ul><li>Monitoring all peers in the network </li></ul></ul><ul><ul><li>→ All peers are in the monitoring tree </li></ul></ul><ul><li>Efficiency </li></ul><ul><ul><li>Lightweight solution, low complexity: easier to use, more robust </li></ul></ul><ul><ul><li>Just two message types: Update, ACK </li></ul></ul><ul><ul><li>No signaling complexity and overhead </li></ul></ul>
    43. 44. Additional Slides on Management
    44. 45. Comparison of C/S Solution with SkyEye.KOM
    45. 46. Overview on Quantity of Monitored Statistics <ul><li>Total metric quantity: </li></ul><ul><ul><li>Simulation: 370+10•Depth </li></ul></ul><ul><ul><li>Prototype: 255+20•Depth </li></ul></ul>
    46. 47. Comparison of Benchmark Signals <ul><li>Benchmarking signals </li></ul><ul><ul><li>Sine and ZigZag, periodicity: 1800s = 30m </li></ul></ul><ul><ul><li>Synchronized injection of signals </li></ul></ul><ul><ul><li>Allows benchmarking of the monitoring precision </li></ul></ul><ul><ul><li>Age of information: around 200s </li></ul></ul><ul><li>Freshness </li></ul><ul><ul><li>= Average age of aggregated information = ½ of time to gather and disseminate = ½ · log_β(N) · UI = ½ · log_4(10000) · 60s ≈ ½ · 398s = 199s </li></ul></ul><ul><ul><li>Model and simulations match </li></ul></ul>
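The freshness model on this slide is easy to evaluate for other parameter choices. The following Python sketch encodes the two quantities named above; the function names are hypothetical, and the model itself (gather-and-disseminate time log_β(N) · UI, average age half of that) is taken from the slide.

```python
import math


def gather_and_disseminate_time(n_peers: int, beta: int,
                                update_interval_s: float) -> float:
    """Model from the talk: log_beta(N) * UI, in seconds."""
    return math.log(n_peers, beta) * update_interval_s


def average_information_age(n_peers: int, beta: int,
                            update_interval_s: float) -> float:
    """Freshness: half of the gather-and-disseminate time, in seconds."""
    return 0.5 * gather_and_disseminate_time(n_peers, beta, update_interval_s)
```

For N = 10000, β = 4 and UI = 60s, `average_information_age` evaluates to roughly 199s, the value quoted on the slide. The same functions show the tradeoff from the parameter studies: a larger branching factor or a shorter update interval lowers the information age.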
    47. 48. Parameter Studies: Identification of Tradeoffs <ul><li>Varied parameters </li></ul><ul><ul><li>Update interval UI: 15, 30, 60, 120, 240s </li></ul></ul><ul><ul><li>Branching factor β = 2, 4, 8 </li></ul></ul><ul><ul><li>Number of peers N = 1000, 10000 </li></ul></ul><ul><ul><li>(Mean) costs depend on UI (not β or N) </li></ul></ul><ul><li>Tradeoff </li></ul><ul><ul><li>Monitoring precision (of sums) vs. costs </li></ul></ul><ul><ul><ul><li>Monitoring error depends on UI , β, N </li></ul></ul></ul><ul><ul><ul><li>Low relative error requires more traffic </li></ul></ul></ul><ul><ul><li>Increase effort until precision is sufficient </li></ul></ul>
    48. 49. Management Goals <ul><li>Service quality of the p2p system </li></ul><ul><ul><li>Application-specific metric intervals </li></ul></ul><ul><ul><li>Response time [100ms,400ms] </li></ul></ul><ul><li>Goal </li></ul><ul><ul><li>System automatically adapts to meet predefined target metrics </li></ul></ul><ul><li>Main idea </li></ul><ul><ul><li>Research on viable approaches </li></ul></ul><ul><ul><li>Autonomic computing for p2p systems </li></ul></ul>[Figure: management control loop. System provider and users set metric goals (or set the configuration directly); monitoring gathers information on the service quality of the global p2p system, which is subject to network dynamics, peer heterogeneity, and application shift; a controller derives a new configuration, which is disseminated to the mechanisms on the peers.]