Distributed Hash Table

Usage Rights

© All Rights Reserved


  • Distributed Hash Table
  • Definition
    • A DHT is a class of decentralized distributed system that provides a lookup service similar to a hash table, storing (key, value) pairs.
    • Responsibility for maintaining the mapping from keys to values is distributed among the nodes in such a way that a change in the set of participants causes a minimal amount of disruption.
  • DHT Structure
    • A keyspace partitioning scheme splits ownership of the keyspace among the participating nodes.
    • An overlay network connects the nodes, allowing them to find the owner of any given key in the keyspace.
    • A hash algorithm (for example, SHA-1) maps keys to identifiers.
    • Consistent hashing ensures that the removal or addition of one node changes only the set of keys owned by the nodes with adjacent IDs and leaves all other nodes unaffected.
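
The consistent-hashing property above can be checked with a small experiment. The following is a minimal sketch, assuming SHA-1 identifiers and a "first node clockwise owns the key" rule; the `Ring` class and the node and key names are hypothetical and exist only for illustration.

```python
import bisect
import hashlib


def ident(name: str) -> int:
    """Hash a node name or key to a 160-bit point on the identifier ring."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


class Ring:
    """Toy consistent-hashing ring (illustrative, not a full DHT)."""

    def __init__(self, nodes=()):
        self._ids = []    # sorted node identifiers
        self._names = {}  # identifier -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        i = ident(node)
        if i not in self._names:
            bisect.insort(self._ids, i)
            self._names[i] = node

    def remove(self, node: str) -> None:
        i = ident(node)
        self._ids.remove(i)
        del self._names[i]

    def owner(self, key: str) -> str:
        """Return the first node at or clockwise after the key's identifier."""
        pos = bisect.bisect_left(self._ids, ident(key))
        return self._names[self._ids[pos % len(self._ids)]]


keys = [f"key-{i}" for i in range(1000)]
ring = Ring(["node-a", "node-b", "node-c", "node-d"])
before = {k: ring.owner(k) for k in keys}

ring.add("node-e")
after = {k: ring.owner(k) for k in keys}

# Keys whose owner changed when node-e joined.
moved = [k for k in keys if before[k] != after[k]]
```

Every key in `moved` ends up on `node-e`, and all of them come from a single previous owner (the new node's clockwise neighbor); keys held by the other nodes do not move.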
  • How Lookup works in DHT
    • The leaf set holds a node’s successors and predecessors
      • All that’s needed for correctness
    • The routing table matches successively longer prefixes
      • All that’s needed for performance
    (Figure labels: Source, Lookup ID, Response)
  • Chord
    • One of the original distributed hash tables, developed at MIT.
    • Nodes are arranged in a circle.
    • Nodes and keys are assigned m-bit identifiers using consistent hashing.
  • Chord Properties
    • Efficient directory operations
      • Insertion, deletion, lookup
    • Good analytical properties
      • O(log N) routing table size
      • O(log N) logical hops to reach the successor of a key k
    • High maintenance cost
      • A node join or leave induces state changes on other nodes
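
To make the routing-table and hop-count properties concrete, here is a minimal simulation sketch of a Chord-style lookup over finger tables. It assumes a small ring with m = 6 and a fixed, fully known set of node IDs; the helper names are hypothetical, and finger tables are recomputed from global knowledge purely for brevity (a real node stores only its own m fingers).

```python
M = 6          # identifier bits (assumed small for the example)
RING = 2 ** M  # size of the identifier space

# Example node identifiers on the ring (assumed for illustration).
NODES = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]


def in_interval(x, a, b, inclusive_right=False):
    """True if x lies in the ring interval (a, b), or (a, b] if requested."""
    if inclusive_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b  # the interval wraps past zero


def successor(ids, x):
    """First node identifier at or clockwise after x (ids must be sorted)."""
    for n in ids:
        if n >= x:
            return n
    return ids[0]


def fingers(ids, n):
    """Finger i of node n points at successor(n + 2**i); m entries in total."""
    return [successor(ids, (n + 2 ** i) % RING) for i in range(M)]


def find_successor(ids, start, key):
    """Route from `start` toward `key`; return (owner, forwarding hops)."""
    n, hops = start, 0
    while True:
        succ = fingers(ids, n)[0]
        if in_interval(key, n, succ, inclusive_right=True):
            return succ, hops
        # Forward to the closest finger that precedes the key.
        nxt = n
        for f in reversed(fingers(ids, n)):
            if in_interval(f, n, key):
                nxt = f
                break
        if nxt == n:  # no closer finger known; fall back to the successor
            return succ, hops
        n, hops = nxt, hops + 1


owner, hops = find_successor(NODES, 8, 54)
print(owner, hops)  # prints "56 2" (forwarded 8 -> 42 -> 51, which answers 56)
```

Each node holds m fingers, and each forwarding step jumps to the farthest known node that still precedes the key, which is what keeps the number of hops logarithmic in well-populated rings.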
  • Pastry
    • Circular namespace
    • Routing Table:
      • Peer p, ID: IDp
      • For each prefix of IDp, keep a set of peers that share that prefix and whose next digits differ from one another.
    • Routing:
      • Choose a peer whose ID shares the longest prefix with the target ID
      • Choose a peer whose ID is numerically closest to the target ID
    • Analytical properties similar to Chord’s
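
The two routing rules above can be sketched as a next-hop choice. This is a simplified illustration, not Pastry's full algorithm: it assumes equal-length hexadecimal IDs, a flat list of known peers instead of a structured routing table and leaf set, and plain numeric distance without ring wraparound. The function names are hypothetical.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits that two equal-length IDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def next_hop(local: str, known: list, target: str) -> str:
    """Pick the next hop for `target` among `known` peers (hex-string IDs)."""
    here = shared_prefix_len(local, target)

    # Rule 1: prefer a peer that shares a longer prefix with the target
    # than the local node does.
    longer = [p for p in known if shared_prefix_len(p, target) > here]
    if longer:
        return max(longer, key=lambda p: shared_prefix_len(p, target))

    # Rule 2: otherwise pick the numerically closest ID among the local node
    # and the peers that share at least as long a prefix.
    def distance(p: str) -> int:
        return abs(int(p, 16) - int(target, 16))

    candidates = [p for p in known if shared_prefix_len(p, target) >= here]
    return min(candidates + [local], key=distance)


print(next_hop("65a1", ["d13d", "d4a2", "9f00", "d462"], "d46a"))  # prints "d462"
print(next_hop("d461", ["d46f", "0000"], "d46a"))                  # prints "d46f"
```

If `next_hop` returns the local ID, no known peer is closer and the message would be delivered locally.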
  • The Problem of Churn
    • Churn is the continuous process of node arrival and departure in a DHT.
    • One metric of churn is session time: the time between when a node joins the network and the next time it leaves. Lifetime and availability should also be considered.
    • Even the temporary loss of a routing neighbor weakens the correctness and performance of a DHT.
    • Unavailable neighbors reduce a node’s effective connectivity, forcing it to choose suboptimal routes and increasing the number of failures.
  • Experiment results
    • Pastry fails to complete a majority of lookup requests under heavy churn because nodes wait so long on request messages.
    • Chord performs well and returns consistent results under lower churn rates, but its shortcoming is lookup latency.
    • Under churn, a DHT may:
      • fail to complete a lookup request;
      • complete the lookup but return inconsistent results;
      • complete the lookup and return consistent results but suffer a dramatic increase in lookup latency.
  • Handling Churn
    • Factors that affect the behavior of DHTs under churn:
      • Reactive versus periodic recovery from neighbor failures.
      • Calculation of good timeout values for lookup messages.
      • Techniques to achieve proximity in neighbor selection.
  • Reactive recovery
    • A node reacts to the loss of one of its existing leaf set neighbors by sending a copy of its leaf set to every node in it.
    • The total number of messages is O(k^2) in a k-node network.
    • Without churn it is very efficient, as messages are sent only in response to actual changes.
    • It consumes more bandwidth as the churn rate increases.
  • Periodic Recovery
    • A node periodically shares its leaf set with each node in that leaf set.
    • Sharing with only one randomly chosen node per period is a further improvement.
    • The number of messages exchanged is O(log k).
    • It aggregates all changes in each period into a single message.
    • It avoids the positive feedback cycles that can arise in reactive recovery.
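
The difference in message volume between the two recovery strategies can be illustrated with a toy model. The sketch below assumes a node that tracks only its leaf set and returns the messages it would send; the `Node` class and its method names are hypothetical, and no actual networking or leaf set repair is modeled.

```python
import random


class Node:
    """Toy node that tracks its leaf set and the messages it would send."""

    def __init__(self, node_id, leaf_set):
        self.node_id = node_id
        self.leaf_set = set(leaf_set)

    def reactive_on_failure(self, failed):
        """Reactive: on each detected failure, push the leaf set to every member."""
        self.leaf_set.discard(failed)
        snapshot = frozenset(self.leaf_set)
        return [(peer, snapshot) for peer in sorted(self.leaf_set)]

    def periodic_note_failure(self, failed):
        """Periodic: record the change locally and send nothing yet."""
        self.leaf_set.discard(failed)

    def periodic_tick(self, rng):
        """Periodic: once per period, share the leaf set with one random member."""
        peer = rng.choice(sorted(self.leaf_set))
        return [(peer, frozenset(self.leaf_set))]


reactive = Node(0, range(1, 9))  # leaf set {1, ..., 8}
periodic = Node(0, range(1, 9))

reactive_msgs = []
for failed in (3, 5, 7):  # three neighbors fail within one period
    reactive_msgs += reactive.reactive_on_failure(failed)
    periodic.periodic_note_failure(failed)

periodic_msgs = periodic.periodic_tick(random.Random(0))

print(len(reactive_msgs), len(periodic_msgs))  # prints "18 1"
```

With three failures in one period, the reactive node sends 7 + 6 + 5 = 18 messages, while the periodic node sends a single message that already reflects all three changes.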
  • Timeout Calculation
    • A node should ensure that the timeout for a request is judiciously selected before it routes to an alternative neighbor.
    • Techniques:
      • TCP-style timeouts (recursive routing)
        • RTO = AVG + 4 * VAR, where AVG is the observed average round-trip time and VAR is its observed mean variance.
      • Timeouts from virtual coordinates (iterative routing)
        • Uses distributed machine learning to assign coordinates to each node.
        • The coordinate distance D(N1, N2) is proportional to the network latency between N1 and N2.
    • Both strategies provide similar mean latency at low churn rates.
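
The TCP-style formula above can be sketched as a per-neighbor estimator. The slides give only RTO = AVG + 4 * VAR; the exponentially weighted updates, the smoothing constants (1/8 and 1/4, borrowed from TCP's retransmission timer), and the initial timeout of 1 second are assumptions made for this illustration, and the `RtoEstimator` class is hypothetical.

```python
class RtoEstimator:
    """Tracks a smoothed round-trip time and deviation for one neighbor."""

    def __init__(self, alpha=0.125, beta=0.25, initial_rto=1.0):
        self.alpha = alpha              # weight of a new sample in AVG (assumed)
        self.beta = beta                # weight of a new sample in VAR (assumed)
        self.initial_rto = initial_rto  # timeout before any sample (assumed)
        self.avg = None
        self.var = 0.0

    def observe(self, rtt: float) -> None:
        """Fold one measured round-trip time (in seconds) into the estimate."""
        if self.avg is None:
            self.avg = rtt
            self.var = rtt / 2
        else:
            # Update the deviation first, using the previous average.
            self.var = (1 - self.beta) * self.var + self.beta * abs(self.avg - rtt)
            self.avg = (1 - self.alpha) * self.avg + self.alpha * rtt

    def rto(self) -> float:
        """Timeout to use for the next request: AVG + 4 * VAR."""
        if self.avg is None:
            return self.initial_rto
        return self.avg + 4 * self.var


est = RtoEstimator()
for sample in (0.100, 0.100, 0.300):
    est.observe(sample)
    print(round(est.rto(), 4))
```

A stable neighbor's timeout tightens toward its average round-trip time, while a single slow response widens the timeout again, which helps avoid abandoning a neighbor that is merely slow.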
  • Proximity Neighbor Selection
    • The process of choosing among the potential neighbors for any given routing table entry according to their network latency to the choosing node.
    • Techniques:
      • Neighbors’ neighbors.
      • Neighbors’ inverse neighbors.
  • Proximity Neighbor Selection
    • Algorithm:
      • Use one of the techniques above to find nodes that may be near the local node.
      • Measure the latency to those nodes.
      • If the matching routing table entry has no existing neighbor, fill it with the new node; if a neighbor is already present, replace it with the new node when the new node has lower latency.
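
One round of the algorithm above, using the neighbors' neighbors technique, can be sketched as follows. This is a minimal illustration under stated assumptions: the routing table is a dictionary from slot to (peer, latency), and `fetch_neighbors`, `measure_latency`, and `slot_of` are hypothetical callbacks standing in for network requests, latency probes, and the DHT's own slot rule.

```python
def pns_round(local_id, table, fetch_neighbors, measure_latency, slot_of):
    """Run one proximity-neighbor-selection round and return the table."""
    known = {peer for peer, _ in table.values()}

    # Step 1: ask each current neighbor for its neighbors.
    candidates = set()
    for peer in known:
        candidates.update(fetch_neighbors(peer))
    candidates -= known | {local_id}

    for cand in sorted(candidates):
        # Step 2: measure the latency to the candidate.
        latency = measure_latency(cand)
        # Step 3: fill an empty slot, or replace a slower existing neighbor.
        slot = slot_of(cand)
        current = table.get(slot)
        if current is None or latency < current[1]:
            table[slot] = (cand, latency)
    return table


# Example with made-up IDs and latencies (milliseconds).
neighbors_of = {"b1": ["b7", "c3", "a0"]}
latency_ms = {"b7": 20.0, "c3": 50.0}

table = {"b": ("b1", 80.0)}
pns_round(
    "a0",
    table,
    fetch_neighbors=lambda peer: neighbors_of.get(peer, []),
    measure_latency=lambda peer: latency_ms[peer],
    slot_of=lambda peer: peer[0],  # slot = first digit, for illustration only
)
print(table)  # prints {'b': ('b7', 20.0), 'c': ('c3', 50.0)}
```

Here `b7` replaces `b1` because it is closer, and `c3` fills a previously empty slot.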