Distributed Hash Table

Presentation Transcript

  • Distributed Hash Table
  • Definition
    • A DHT is a class of decentralized distributed system that provides a lookup service similar to a hash table, storing (key, value) pairs.
    • Responsibility for maintaining the mapping from keys to values is distributed among the nodes in such a way that a change in the set of participants causes a minimal amount of disruption.
  • DHT Structure
    • A keyspace partitioning scheme splits ownership of the keyspace among the participating nodes.
    • An overlay network connects the nodes, allowing them to find the owner of any given key in the keyspace.
    • A hash algorithm (e.g., SHA-1) maps keys and node identifiers into the keyspace.
    • Consistent hashing ensures that the removal or addition of one node changes only the set of keys owned by the nodes with adjacent IDs and leaves all other nodes unaffected.
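
The consistent hashing property can be checked with a small sketch. It assumes that a key is owned by the first node whose SHA-1 identifier is at or after the key's identifier on the ring (successor ownership); the node names, key names, and the helper names `sha1_id` and `owner` are made up for illustration.

```python
import hashlib
from bisect import bisect_left

def sha1_id(name: str) -> int:
    """Map a node name or key to a point in the 160-bit SHA-1 keyspace."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")

def owner(key: str, node_names: list[str]) -> str:
    """Return the node whose ID is the first at or after the key's ID,
    wrapping around to the smallest ID at the end of the ring."""
    ring = sorted((sha1_id(n), n) for n in node_names)
    ids = [node_id for node_id, _ in ring]
    index = bisect_left(ids, sha1_id(key)) % len(ring)
    return ring[index][1]

# Hypothetical nodes and keys, used only to exercise the functions.
nodes = ["node-a", "node-b", "node-c", "node-d"]
keys = [f"key-{i}" for i in range(1000)]

before = {k: owner(k, nodes) for k in keys}
after = {k: owner(k, nodes + ["node-e"]) for k in keys}
moved = [k for k in keys if before[k] != after[k]]

# Keys that changed owner all went to the new node; every other
# assignment is untouched by the join.
print(len(moved), "of", len(keys), "keys moved")
print(all(after[k] == "node-e" for k in moved))
```

Only the new node's ring neighbor gives up keys, which is why a join or leave disturbs so little of the mapping.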
  • How Lookup works in DHT
    • The leaf set consists of a node’s successors and predecessors.
      • All that’s needed for correctness.
    • The routing table matches successively longer prefixes.
      • All that’s needed for performance.
    [Figure: lookup diagram; labels: Source, Lookup ID, Response]
  • Chord
    • One of the original distributed hash tables, developed at MIT.
    • Nodes are arranged in a circle.
    • Nodes and keys are assigned m-bit identifiers using consistent hashing.
  • Chord Properties
    • Efficient directory operations
      • Insertion, deletion, lookup
    • Good analysis properties
      • O(log N) routing table size
      • O(log N) logical hops to reach the successor of a key k
    • High maintenance cost
      • A node join or leave induces state changes on other nodes
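
A minimal sketch of a Chord-style lookup shows where the O(log N) figures come from. It assumes a 6-bit identifier space and builds every node's finger table from global knowledge of the membership, which a real deployment would not have; the node IDs and all function names are illustrative only.

```python
M = 6            # identifier bits, so IDs live in [0, 2**M)
RING = 2 ** M

def in_interval(x, a, b, inclusive_right=False):
    """True if x lies in the ring interval (a, b), or (a, b] when inclusive_right."""
    if inclusive_right and x == b:
        return True
    if a < b:
        return a < x < b
    # The interval wraps past zero; a == b denotes the whole ring minus a.
    return x > a or x < b

def successor(ident, node_ids):
    """First node ID at or after ident, wrapping around the ring."""
    ident %= RING
    candidates = [n for n in node_ids if n >= ident]
    return min(candidates) if candidates else min(node_ids)

def build_fingers(node_ids):
    """fingers[n][i] is the successor of n + 2**i, giving M entries per node."""
    return {n: [successor(n + 2 ** i, node_ids) for i in range(M)]
            for n in node_ids}

def lookup(start, key, fingers):
    """Route from start to the owner of key; return (owner, nodes visited)."""
    path = [start]
    node = start
    while True:
        succ = fingers[node][0]
        if in_interval(key, node, succ, inclusive_right=True):
            path.append(succ)
            return succ, path
        # Jump to the farthest finger that still precedes the key.
        node = next((f for f in reversed(fingers[node])
                     if in_interval(f, node, key)), succ)
        path.append(node)

nodes = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
fingers = build_fingers(nodes)
print(lookup(8, 54, fingers))   # (56, [8, 42, 51, 56])
```

Each jump at least halves the remaining ring distance to the key, so a table of M entries per node is enough to reach any owner in a logarithmic number of hops.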
  • Pastry
    • Circular namespace
    • Routing Table:
      • Peer p, ID: IDp
      • For each prefix of IDp, keep a set of peers that share that prefix and whose next digits differ from one another.
    • Routing:
      • Choose a peer whose ID shares the longest prefix with the target ID.
      • Choose a peer whose ID is numerically closest to the target ID.
    • Similar analytical properties to Chord.
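
The two routing rules can be sketched as a single next-hop choice. This is a simplification rather than the full Pastry algorithm: it assumes 4-digit hexadecimal IDs, treats all known peers as one flat list, prefers the longest shared prefix, and uses numeric closeness on the circular namespace only to break ties. The IDs and function names are hypothetical.

```python
ID_DIGITS = 4                 # assumption: short hex IDs keep the example readable
ID_SPACE = 16 ** ID_DIGITS

def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits that two IDs have in common."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count

def ring_distance(a: str, b: str) -> int:
    """Numeric distance between two IDs on the circular namespace."""
    d = abs(int(a, 16) - int(b, 16))
    return min(d, ID_SPACE - d)

def next_hop(local_id: str, target_id: str, peers: list[str]) -> str:
    """Pick the peer that best matches the target, or local_id if no
    known peer is a better match than the local node itself."""
    def rank(node_id: str):
        return (shared_prefix_len(node_id, target_id),
                -ring_distance(node_id, target_id))
    best = max(peers, key=rank, default=local_id)
    return best if rank(best) > rank(local_id) else local_id

print(next_hop("65a1", "d46a", ["d13d", "d4a2", "9b3f", "d462"]))  # d462
print(next_hop("65a1", "d46a", ["d4a2", "d47f"]))                  # d47f
```

In the first call `d462` wins because it shares three digits with the target; in the second both peers share two digits, so the numerically closer one is chosen.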
  • The Problem of Churn
    • The continuous process of node arrival and departure in DHTs.
    • One metric of churn is session time: the time between when a node joins the network and the next time it leaves. Lifetime and availability should also be considered.
    • Even the temporary loss of a routing neighbor weakens the correctness and performance of a DHT.
    • Unavailability of neighbors reduces a node’s effective connectivity, forcing it to choose suboptimal routes and increasing the number of lookup failures.
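
The churn metrics can be made concrete with a short sketch. The slide defines only session time; the code assumes the common definitions of the other two, namely lifetime as the span from a node's first join to its final departure and availability as the fraction of that lifetime the node was online. The function name and the sample log are illustrative.

```python
def churn_metrics(sessions):
    """Compute churn metrics for one node.

    sessions is a chronologically ordered list of (join_time, leave_time)
    pairs in any consistent time unit.
    Returns (session_times, lifetime, availability).
    """
    session_times = [leave - join for join, leave in sessions]
    # Assumed definition: first join until final departure.
    lifetime = sessions[-1][1] - sessions[0][0]
    # Assumed definition: fraction of the lifetime spent online.
    availability = sum(session_times) / lifetime
    return session_times, lifetime, availability

# A node that was online from t=0 to t=10 and again from t=20 to t=50.
session_times, lifetime, availability = churn_metrics([(0, 10), (20, 50)])
print(session_times, lifetime, availability)
```

Short session times with high availability describe a node that reconnects often, which is the pattern that stresses neighbor maintenance the most.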
  • Experiment results
    • Pastry fails to complete a majority of lookup requests under heavy churn because nodes wait too long on outstanding request messages.
    • Chord performs well and returns consistent results under lower churn rates, but its shortcoming is lookup latency.
    • Under churn, a DHT lookup may:
      • fail to complete;
      • complete but return inconsistent results; or
      • complete and return consistent results but suffer a dramatic increase in latency.
  • Handling Churn
    • Factors that affect the behavior of DHTs under churn:
      • Reactive versus periodic recovery from neighbor failures.
      • Calculation of good timeout values for lookup messages.
      • Techniques to achieve proximity in neighbor selection.
  • Reactive recovery
    • A node reacts to the loss of one of its existing leaf set neighbors by sending a copy of its leaf set to every node in it.
    • The total number of messages sent in response to a churn event is O(k^2), where k is the number of nodes in the leaf set.
    • Without churn it is very efficient, as messages are sent only in response to actual changes.
    • It consumes more bandwidth as the churn rate increases.
  • Periodic Recovery
    • A node periodically shares its leaf set with each node in that leaf set.
    • Sharing with only one randomly chosen leaf set member per period saves bandwidth.
    • With this optimization, leaf sets still converge in O(log k) periods, where k is the leaf set size.
    • It aggregates all changes in each period into a single message.
    • It avoids the positive feedback cycles that reactive recovery is prone to.
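
A back-of-envelope model makes the bandwidth trade-off between the two recovery styles visible. It is only a sketch under stated assumptions: k is taken to be the leaf set size, every churn event is assumed to be noticed by k nodes that each send k messages, and periodic recovery is assumed to cost one message per node per period when sharing with a single random member. The function names are hypothetical.

```python
def reactive_messages(leaf_set_size: int, churn_events: int) -> int:
    """Messages sent by reactive recovery: each event makes k nodes
    send their leaf set to k neighbors, i.e. k * k messages per event."""
    return churn_events * leaf_set_size * leaf_set_size

def periodic_messages(leaf_set_size: int, periods: int,
                      random_partner: bool = True) -> int:
    """Messages sent by k nodes using periodic recovery. The cost does
    not depend on how many churn events occurred during the periods."""
    per_node = 1 if random_partner else leaf_set_size
    return periods * leaf_set_size * per_node

k = 8
for events in (0, 1, 10, 100):
    print(events, reactive_messages(k, events), periodic_messages(k, periods=10))
```

With no churn the reactive scheme sends nothing, while at high churn its cost grows with every event and the periodic scheme's cost stays flat.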
  • Timeout Calculation
    • A node should choose the timeout for a request judiciously, since it must wait for that timeout before routing to an alternative neighbor.
    • Techniques:
      • TCP-style timeouts (recursive routing):
        • RTO = AVG + 4 × VAR, where AVG is the average observed round-trip time and VAR is its mean variance.
      • Timeouts from virtual coordinates (iterative routing):
        • A distributed machine learning algorithm assigns each node coordinates.
        • The coordinate distance D(N1, N2) is proportional to the network latency between N1 and N2.
    • Both strategies provide similar mean latency at low churn rates.
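
The TCP-style rule RTO = AVG + 4 × VAR can be sketched as a per-neighbor estimator. The slide gives only the formula; the smoothing constants (1/8 and 1/4), the initialization from the first sample, and the default timeout used before any sample arrives are assumptions borrowed from standard TCP practice, and the class name is hypothetical.

```python
class RtoEstimator:
    """Tracks a smoothed round-trip time and its mean deviation for one neighbor."""

    def __init__(self, alpha=0.125, beta=0.25, initial_rto=1.0):
        self.alpha = alpha              # assumed weight for the average
        self.beta = beta                # assumed weight for the deviation
        self.initial_rto = initial_rto  # assumed timeout before any sample
        self.avg = None
        self.var = 0.0

    def observe(self, rtt: float) -> None:
        """Fold one measured round-trip time (in seconds) into the estimate."""
        if self.avg is None:
            self.avg = rtt
            self.var = rtt / 2
        else:
            # Update the deviation first, using the previous average.
            self.var = (1 - self.beta) * self.var + self.beta * abs(rtt - self.avg)
            self.avg = (1 - self.alpha) * self.avg + self.alpha * rtt

    def rto(self) -> float:
        """Timeout to use for the next request: AVG + 4 * VAR."""
        if self.avg is None:
            return self.initial_rto
        return self.avg + 4 * self.var

est = RtoEstimator()
for sample in (0.100, 0.100, 0.300):
    est.observe(sample)
    print(round(est.rto(), 4))
```

A stable neighbor sees its timeout shrink toward its typical round-trip time, while a single slow response widens the timeout again through the deviation term.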
  • Proximity Neighbor Selection
    • The process of choosing among the potential neighbors for any given routing table entry according to their network latency to the choosing node.
    • Techniques:
      • Neighbors’ neighbors.
      • Neighbors’ inverse neighbors.
  • Proximity Neighbor Selection
    • Algorithm:
      • Use one of the techniques above to find nodes that may be near the local node.
      • Measure the latency to those nodes.
      • If the routing table entry a measured node would fill is empty, fill it with that node; if the entry is occupied, replace the existing neighbor only when the new node is closer.
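
The algorithm can be sketched as a single update pass over candidate nodes. The sketch assumes that an occupied entry is replaced only when the candidate has lower measured latency, and it models the routing table as a dictionary from slot to (node, latency). The callbacks `slot_of` and `measure_latency`, the node names, and the latency figures are all hypothetical stand-ins for the DHT's own logic and real measurements.

```python
from typing import Callable, Dict, Hashable, Iterable, Optional, Tuple

RoutingTable = Dict[Hashable, Tuple[str, float]]

def neighbors_neighbors(neighbors: Iterable[str],
                        fetch_neighbors: Callable[[str], Iterable[str]]) -> set:
    """Candidate discovery: ask each current neighbor for its neighbors."""
    current = set(neighbors)
    found = set()
    for n in current:
        found.update(fetch_neighbors(n))
    return found - current

def update_routing_table(table: RoutingTable,
                         candidates: Iterable[str],
                         slot_of: Callable[[str], Optional[Hashable]],
                         measure_latency: Callable[[str], float]) -> RoutingTable:
    """Fill empty slots and replace occupants that are farther away."""
    for node in candidates:
        slot = slot_of(node)
        if slot is None:
            continue                      # node fits no routing table entry
        latency = measure_latency(node)
        current = table.get(slot)
        if current is None or latency < current[1]:
            table[slot] = (node, latency)
    return table

# Made-up latencies in milliseconds; every candidate competes for slot 0.
latencies = {"b": 50.0, "c": 20.0, "d": 30.0}
table = update_routing_table({}, ["b", "c", "d"],
                             slot_of=lambda node: 0,
                             measure_latency=latencies.__getitem__)
print(table)
```

Running the pass repeatedly with fresh candidates lets each entry drift toward the lowest-latency node that is eligible for it.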