Kalman Graffi - IEEE NetSys 2013 - Adding Capacity-Aware Storage Indirection to Homogeneous Distributed Hash Tables

  1. Adding Capacity-Aware Storage Indirection to Homogeneous Distributed Hash Tables (NetSys 2013)
     Philip Wette, Computer Networks Group, University of Paderborn
     Kalman Graffi, Technology of Social Networks Group, University of Düsseldorf
  2. What is a Peer-to-Peer network?
     • Loosely coupled group of equally treated computers (peers)
     • Peers share their resources
     • Services are hosted on home computers
     • Decentralized system ⇒ high availability
     • In this talk: Distributed Hash Tables (DHTs) used for file sharing (P2P online social networks, distributed file systems)
  3. Motivation: Prototype of a Distributed Hash Table (figure only)
  5. Why do DHTs need load balancing?
     • Peers and files are uniformly distributed in the identifier space ⇒ on average, each peer serves an equal number of files
     • But: file sizes are heterogeneous, and file popularities are heterogeneous
     • Resources required for hosting a file ∼ file size · file popularity
     • Real-world observations:
       • A small group of files is responsible for most of the load (in the WWW, file popularity is Zipf distributed, i.e. heavy-tailed)
       • A small group of peers owns most of the resources (university networks: 1 Gbps upload capacity)
     • Idea: let "strong" peers host "hot" files
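
The load model on this slide (resources ∼ file size · file popularity) can be illustrated with a small sketch. The helper names `zipf_popularity` and `top_load_share` are hypothetical, the Zipf exponent of 1 is an assumption, and file sizes are set equal here only to isolate the effect of popularity.

```python
def zipf_popularity(n_files, exponent=1.0):
    """Return normalized Zipf popularities for ranks 1..n_files (exponent is an assumption)."""
    raw = [1.0 / rank ** exponent for rank in range(1, n_files + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def top_load_share(popularity, sizes, top_fraction):
    """Share of the total load caused by the most loaded `top_fraction` of files.

    Load per file is modeled as size * popularity, following the slide.
    """
    load = sorted((p * s for p, s in zip(popularity, sizes)), reverse=True)
    k = max(1, int(len(load) * top_fraction))
    return sum(load[:k]) / sum(load)


popularity = zipf_popularity(10_000)
sizes = [5.0] * 10_000  # assumption: equal sizes, to isolate the popularity effect
share = top_load_share(popularity, sizes, top_fraction=0.01)
print(f"top 1% of files cause {share:.0%} of the load")
```

Under these assumptions, the most popular 1% of files account for a bit over half of the total load, which is the imbalance that motivates placing "hot" files on "strong" peers.
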
  7. Goal
     • Create a DHT that is aware of both heterogeneous peers and heterogeneous files
     • Considering:
       • Different capacities of individual peers
       • File sizes and file popularities
       • Changing file popularities over time
     • Demonstrated using Chord as a showcase, turning it into Adaptive-Chord
     • The techniques can be applied to any DHT
  8. Overview
     1 Motivation
     2 Adaptive load balancing for DHTs: Idea, Load balancing
     3 Evaluation: Scenarios, Results
  13. Idea: Decoupling of Responsibility and Supply
      • A peer can supply files it is not responsible for by creating mirrors in the network
      • Figure labels: Device A, Device B, Device C; File F; capacity: 20 kbps, load: 90 kbps; capacity: 100 kbps; Mirror for File F, load: 90 kbps; get(File F), get(File F), File F
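
A minimal sketch of the indirection described on this slide: the responsible peer keeps answering lookups, but the data is supplied by a mirror once one exists. The `Peer` class and its fields are hypothetical and only model the bookkeeping, not networking or Chord routing.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Peer:
    name: str
    capacity_kbps: float
    files: Dict[str, bytes] = field(default_factory=dict)     # files this peer is responsible for
    mirrors: Dict[str, "Peer"] = field(default_factory=dict)  # file id -> peer hosting a mirror
    mirrored: Dict[str, bytes] = field(default_factory=dict)  # copies hosted for other peers

    def create_mirror(self, file_id, host):
        """Copy the file to `host` and remember where the mirror lives."""
        host.mirrored[file_id] = self.files[file_id]
        self.mirrors[file_id] = host

    def get(self, file_id):
        """Return (data, name of the peer that actually supplies the data)."""
        host = self.mirrors.get(file_id)
        if host is not None:
            return host.mirrored[file_id], host.name  # supply is delegated to the mirror
        return self.files[file_id], self.name


responsible = Peer("responsible", capacity_kbps=20.0, files={"F": b"contents of F"})
strong = Peer("strong", capacity_kbps=100.0)

before = responsible.get("F")
responsible.create_mirror("F", strong)
after = responsible.get("F")
```

Responsibility for File F stays with the original peer, so the DHT's key-to-peer mapping is unchanged; only the supply moves.
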
  15. Idea: How to find a peer to host a mirror?
      • Main problem: given a file, how can we find a peer capable of hosting a mirror without becoming overloaded?
      • Figure labels: getPeer(100 kbps), Peer p
      • Solution: a second P2P overlay that addresses peers by their free resources: the Capacity overlay
  16. Capacity overlay
      • Idea: quickly find peers based on their free resource count
      • Uses a slightly modified Chord (works with any P2P overlay that uses a 1D keyspace)
      • The peer identifier reflects the free resource count, so peers are sorted by free resources
      • $ID_K = (2^m - 1) \cdot \min\left(1, \frac{v}{M}\right)$
      • Figure: identifier ring with positions labeled 0, 1/4 M, 1/2 M, 3/4 M
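
The identifier formula and the getPeer lookup can be sketched as follows. This assumes that v is a peer's free capacity and M an upper bound on capacity (the slide does not define them), rounds the identifier down to an integer, and uses a centralized sorted list as a stand-in for the overlay. `CapacityOverlay`, `join`, and `get_peer` are hypothetical names.

```python
import bisect


def capacity_id(free_kbps, max_kbps, m):
    """ID_K = (2^m - 1) * min(1, v / M), rounded down to an integer key."""
    return int((2 ** m - 1) * min(1.0, free_kbps / max_kbps))


class CapacityOverlay:
    """Centralized stand-in for the capacity overlay (a sorted list, not a real ring)."""

    def __init__(self, max_kbps, m=16):
        self.max_kbps = max_kbps
        self.m = m
        self._entries = []  # sorted list of (capacity id, peer name, free kbps)

    def join(self, name, free_kbps):
        key = capacity_id(free_kbps, self.max_kbps, self.m)
        bisect.insort(self._entries, (key, name, free_kbps))

    def get_peer(self, required_kbps):
        """Return the name of a peer with at least `required_kbps` free, or None."""
        key = capacity_id(required_kbps, self.max_kbps, self.m)
        start = bisect.bisect_left(self._entries, (key,))  # first entry with id >= key
        for _, name, free_kbps in self._entries[start:]:
            if free_kbps >= required_kbps:  # guard against rounding in the identifier
                return name
        return None


overlay = CapacityOverlay(max_kbps=1000.0)
overlay.join("a", 20.0)
overlay.join("b", 100.0)
overlay.join("c", 400.0)
candidate = overlay.get_peer(90.0)
```

Because identifiers grow with free capacity, looking up the key for the required capacity and walking forward mimics a successor lookup in a Chord-like ring; in a real overlay a peer would also have to re-join under a new identifier whenever its free capacity changes.
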
  17. System Architecture
      • Figure labels: Adaptive-Chord; Chord overlay; Capacity overlay
  19. Load balancing (flowchart)
      • Overloaded?
        • Yes: hosting mirrors for other peers?
          • Yes: remove mirror with highest load
          • No: create mirror for file with highest load
        • No: enough free resources to take back mirror?
          • Yes: take back mirror
          • No: do nothing
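
The flowchart can be read as one decision step per peer. The sketch below is an interpretation of the flowchart labels, not the authors' implementation: the function name, the argument layout, and the rule for choosing which mirror to take back are assumptions. The 95% overload threshold is the one stated later in the evaluation slides.

```python
def balance_step(load, capacity, hosted_mirrors, own_files, lent_out, threshold=0.95):
    """One pass of the flowchart for a single peer.

    hosted_mirrors: {file id: load} for mirrors this peer hosts for other peers
    own_files:      {file id: load} for files this peer is responsible for and serves itself
    lent_out:       {file id: load} for own files currently served by a mirror elsewhere
    Returns an (action, file id) tuple; the caller would carry out the action.
    """
    overloaded = load > threshold * capacity
    if overloaded:
        if hosted_mirrors:
            # Shed foreign load first: drop the most expensive hosted mirror.
            return ("remove_mirror", max(hosted_mirrors, key=hosted_mirrors.get))
        if own_files:
            # Otherwise move the hottest own file to a mirror.
            return ("create_mirror", max(own_files, key=own_files.get))
        return ("none", None)
    free = threshold * capacity - load
    # Assumption: take back the largest mirror that still fits into the free capacity.
    fitting = {f: l for f, l in lent_out.items() if l <= free}
    if fitting:
        return ("take_back_mirror", max(fitting, key=fitting.get))
    return ("none", None)


# The device from the earlier figure: capacity 20 kbps, load 90 kbps caused by File F.
action = balance_step(load=90, capacity=20, hosted_mirrors={}, own_files={"F": 90}, lent_out={})
```

For the overloaded device in the example, the step yields a request to create a mirror for File F, which would then be placed on a peer found through the capacity overlay.
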
  21. Simulator: PeerfactSim.KOM
      • Discrete, event-based P2P simulator
      • Layers independently configurable
      • Simulation at packet level
      • Latency model: Global Network Positioning ICS
      • Bandwidth model: Last Mile Model
      • http://www.peerfact.org
  22. Scenarios: Parameters for simulation
      • 1000 peers simulated with PeerfactSim.KOM
      • 24-hour period
      • 10,000 documents
      • Uniformly distributed file sizes, 0 – 10 MB
      • Zipf-distributed file popularities
      • With / without churn
      • High / low number of queries (10 min / 30 min inter-query time)
      • Unstable / stable file popularity patterns (10 min / 30 min popularity change)
      • Timeline figure labels: t; 40, 60, 90; Idle, Join, Publish, Churn, Query
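
The three two-valued parameters span eight scenario combinations, which can be enumerated as below. The naming pattern follows the scenario names in the results table (for example `busy_unstable_nochurn`); mapping "busy" to the high query rate and "idle" to the low one, and the name "stable", are assumptions, as are the dictionary keys.

```python
from itertools import product

# Assumed mapping of scenario labels to the parameter values on the slide.
QUERY_LOAD = {"busy": 10, "idle": 30}        # inter-query time in minutes
POPULARITY = {"unstable": 10, "stable": 30}  # minutes between popularity changes
CHURN = {"churn": True, "nochurn": False}

scenarios = {
    f"{q}_{p}_{c}": {
        "inter_query_min": QUERY_LOAD[q],
        "popularity_change_min": POPULARITY[p],
        "churn": CHURN[c],
    }
    for q, p, c in product(QUERY_LOAD, POPULARITY, CHURN)
}

for name in sorted(scenarios):
    print(name, scenarios[name])
```
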
  24. Scenario: High load, unstable popularities, no churn
      • Plot: average download duration [sec] over simulation time [min], Chord vs. Adaptive-Chord
      • The average download is up to 9 times faster when using Adaptive-Chord
  25. Scenario: High load, unstable popularities, no churn
      • Plot: success rate over simulation time [min], Chord vs. Adaptive-Chord
      • Plot: number of overloaded peers over simulation time [min], Chord vs. Adaptive-Chord
      • To model users' impatience, a download is aborted when its data rate falls below 1 kbps
      • Nearly 100% success rate in Adaptive-Chord
      • A peer is overloaded when more than 95% of its resources are consumed
  26. Churn
      • Considering churn makes the simulation more realistic
      • No file replication in Chord: when a peer is offline, its files are no longer available
      • Plot: number of peers online over simulation time [min]
  27. Scenario: High load, unstable popularities, churn
      • Plot: average download duration [sec] over simulation time [min], Chord vs. Adaptive-Chord
      • Still significantly shorter download durations
      • Average data rates are higher for Adaptive-Chord
  28. Scenario: High load, unstable popularities, churn
      • Plot: success rate over simulation time [min], Chord vs. Adaptive-Chord
      • Plot: number of overloaded peers over simulation time [min], Chord vs. Adaptive-Chord
      • Higher success rate in Adaptive-Chord
      • Overall load is much lower than in the no-churn scenarios because of offline peers
  29. Conclusion
      • DHTs are not designed to handle heterogeneous files
      • The largest share of the load is created by a small group of files
      • A small number of peers own most of the resources
      • When creating mirrors in a DHT, a function is required to address peers by free capacity
      • We propose a second P2P overlay alongside the DHT to provide this addressing
      • We propose a simple load balancing technique based on mirrors
      • As a showcase, we turned Chord into Adaptive-Chord
      • Simulation showed that, even under churn, Adaptive-Chord lowers download time significantly
  30. Philip Wette, Sonderforschungsbereich 901, Universität Paderborn, Fürstenallee 11, 33102 Paderborn
      http://sfb901.uni-paderborn.de/
  31. Handling Non-Uniform Identifiers in Chord
      • A method for handling non-uniform identifier distribution in Chord: Thorsten Schütt, Florian Schintke, Alexander Reinefeld: Chord#: Structured Overlay Network for Non-Uniform Load-Distribution, Technical Report, ZIB, Berlin, 2005.
  32. Results

      Scenario               System          v [kbps]  t [s]  # O    success
      busy_unstable_nochurn  Chord           118.6     806.7  158.6  71.8 %
      busy_unstable_nochurn  Adaptive-Chord  275.4     85.9   18.2   97.1 %
      idle_unstable_nochurn  Chord           126.5     619.9  76.4   87.7 %
      idle_unstable_nochurn  Adaptive-Chord  243.8     72.8   5.3    98.8 %
      busy_unstable_churn    Chord           249.2     514.4  29.4   38.1 %
      busy_unstable_churn    Adaptive-Chord  379.8     186.1  21.8   43.6 %
      idle_unstable_churn    Chord           207.8     370.3  12.7   43.2 %
      idle_unstable_churn    Adaptive-Chord  333.7     159.1  11     46.8 %

      v: data rate; t: duration of a download; # O: number of overloaded peers; success: success rate
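
As a quick check of the "up to 9 times faster" claim, the download-duration speedups can be computed directly from the t [s] column of the table. The variable names are hypothetical; the numbers are copied from the table.

```python
# Average download duration t [s] per scenario, taken from the results table.
DURATION_S = {
    "busy_unstable_nochurn": {"Chord": 806.7, "Adaptive-Chord": 85.9},
    "idle_unstable_nochurn": {"Chord": 619.9, "Adaptive-Chord": 72.8},
    "busy_unstable_churn": {"Chord": 514.4, "Adaptive-Chord": 186.1},
    "idle_unstable_churn": {"Chord": 370.3, "Adaptive-Chord": 159.1},
}

# Speedup = Chord duration divided by Adaptive-Chord duration.
speedup = {
    name: t["Chord"] / t["Adaptive-Chord"] for name, t in DURATION_S.items()
}

for name, factor in speedup.items():
    print(f"{name}: {factor:.1f}x shorter downloads with Adaptive-Chord")
```

The largest factor, a little over 9, occurs in the busy no-churn scenario; under churn the factor drops to between 2 and 3.
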
