3.
Introduction
one of the original four distributed hash table
proposals, introduced concurrently with
Chords, Pastry, and Tapestry.
developed at UC Berkeley
CANs overlay routing easy to understand
scalable indexing system for large-scale
decentralized storage applications
has a good performance
4.
Overview
distributed, decentralized P2P infrastructure
system that maps keys onto values
keys hashed into d dimensional Cartesian space
Interface:
insert(key, value)
retrieve(key)
associate to each node and item a unique
coordinate in an d-dimensional Cartesian space
5.
Overview
entire space is partitioned amongst all the nodes
every node “owns” a zone in the overall space
can store data at “points” in the space
can route from one “point” to another
point node that owns the enclosing zone
6.
Overview
y
x
State of the system at time t
Node
Resource
Zone
Fig:2 dimensional space with a key mapped to a point (x,y)
8.
Construction
I
Bootstrap
node
new node 1) Discover some node “I” already in CAN
9.
Construction
2) Pick random point in space
I
(x,y)
new node
10.
Construction
(x,y)
3) I routes to (x,y), discovers node J
I
J
new node
11.
Construction
newJ
4) split J’s zone in half… new owns one half
12.
Routing
data stored in the CAN is addressed by name
(i.e. key), not location (i.e. IP address)
have some routing mechanism
A node only maintains state for its immediate
neighboring nodes
13.
Routing
y
Node
M(x,y)
N(x,y)
d-dimensional
space with n zones
where d=2 and n=8
2 zones are
neighbor if d-1 dim
overlap
Algorithm:
Choose the neighbor
nearest to the
destination
M(x,y) Query/
Resource
key
14.
Maintenance
Use zone takeover in case of failure or leaving of
a node
Send your neighbor table update to neighbors to
inform that you are alive at discrete time interval t
If your neighbor does not send alive in time
t, takeover its zone
Zone reassignment is needed
15.
Evaluation
Scalability
-For a uniformly partitioned space with n
nodes and d dimensions:
* per node, number of neighbors is 2d
*average routing path is (d*n1/d)/3
hops (due to Manhattan distance
routing, expected hops in each
dimension is dimension length * 1/3)
*Can scale the network without
increasing per node state
Robustness
-no single point of failure
16.
Future Improvements
Multi-dimension
-increase in dimension reduces path length
Caching and replication techniques for better
performance
Overloading the zone
-increases availability, reduces path
length, reduces per hop latency
Uniform partitioning
-compare the volume of the zone with its
neighbors
-partition the zone having largest volume
Be the first to comment