Distributed Hash Table


    • Distributed Hash Table
    • Definition
      • A class of decentralized distributed systems that provide a lookup service similar to a hash table, storing (key, value) pairs.
      • Responsibility for maintaining the mapping from keys to values is distributed among the nodes in such a way that a change in the set of participants causes a minimal amount of disruption.
    • DHT Structure
      • A keyspace partitioning scheme that splits the keyspace among the participating nodes.
      • An overlay network that connects the nodes, allowing them to find the owner of any given key in the keyspace.
      • A hash algorithm (for example, SHA-1).
      • Consistent hashing, which ensures that the removal or addition of one node changes only the set of keys owned by the nodes with adjacent IDs and leaves all other nodes unaffected.
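
The consistent-hashing property can be checked with a small sketch. This is an illustration under stated assumptions, not any particular DHT's implementation: the names `ident` and `owner`, the node labels, and the rule "a key belongs to the first node at or after its ID" are choices made for this example.

```python
import hashlib
from bisect import bisect_left

def ident(name):
    """Map a node name or a key to an integer in the 160-bit SHA-1 keyspace."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")

def owner(key, node_ids):
    """Return the ID of the first node at or after the key's ID, wrapping around.

    node_ids must be sorted in ascending order.
    """
    i = bisect_left(node_ids, ident(key))
    return node_ids[i % len(node_ids)]

# Hypothetical ring of 8 nodes and 1000 keys.
nodes = sorted(ident(f"node-{i}") for i in range(8))
keys = [f"key-{i}" for i in range(1000)]
before = {k: owner(k, nodes) for k in keys}

# One node joins; recompute ownership.
new_id = ident("node-new")
after_nodes = sorted(nodes + [new_id])
after = {k: owner(k, after_nodes) for k in keys}

# Keys whose owner changed, and the joining node's ring successor.
moved = [k for k in keys if before[k] != after[k]]
succ = after_nodes[(after_nodes.index(new_id) + 1) % len(after_nodes)]
```

Every key in `moved` ends up on the new node and was previously held by `succ`, the node with the adjacent ID. No other node's key set changes.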
    • How Lookup works in DHT
      • Leaf set is successors and predecessors
        • All that’s needed for correctness
      • Routing table matches successively longer prefixes
        • All that’s needed for performance
      [Figure: lookup routing diagram; labels: Source, Lookup ID, Response]
    • Chord
      • One of the original distributed hash tables, developed at MIT.
      • Nodes are arranged in a circle.
      • Nodes and keys are assigned m-bit identifiers using consistent hashing.
    • Chord Properties
      • Efficient directory operations
        • Insertion, deletion, lookup
      • Good analytical properties
        • O(log N) routing table size
        • O(log N) logical steps to reach the successor of a key k
      • High maintenance cost
        • Node join/leave induces state change on other nodes
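
A minimal sketch of a Chord-style lookup on a static ring shows where the logarithmic bounds come from. It assumes a tiny identifier space (m = 6), a globally known sorted node list used only to build each node's fingers, and no joins or failures. The helper names `in_range`, `successor`, `fingers`, and `lookup` are hypothetical.

```python
M_BITS = 6              # identifier length m; tiny so the ring is easy to inspect
RING = 2 ** M_BITS

def in_range(x, a, b, closed_right=True):
    """True if x lies in the ring interval (a, b], or (a, b) if closed_right is False."""
    d = (x - a - 1) % RING
    span = (b - a - 1) % RING
    return d <= span if closed_right else d < span

def successor(ident, nodes):
    """First node ID at or after ident, wrapping around the circle (nodes is sorted)."""
    ident %= RING
    return next((n for n in nodes if n >= ident), nodes[0])

def fingers(n, nodes):
    """Finger i points at successor(n + 2^i); each node keeps m fingers."""
    return [successor(n + 2 ** i, nodes) for i in range(M_BITS)]

def lookup(start, key, nodes):
    """Return (node responsible for key, number of forwarding hops)."""
    n, hops = start, 0
    while True:
        succ = successor(n + 1, nodes)
        if in_range(key, n, succ):
            return succ, hops
        # Forward to the farthest finger that still precedes the key.
        n = next(f for f in reversed(fingers(n, nodes))
                 if in_range(f, n, key, closed_right=False))
        hops += 1

nodes = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
responsible, hops = lookup(8, 54, nodes)
```

In this example the query for key 54 starting at node 8 is forwarded to node 42, then to node 51, which knows that its successor 56 is responsible. Each hop uses a strictly shorter finger than the previous one, so a lookup takes at most m hops.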
    • Pastry
      • Circular namespace
      • Routing Table:
        • Peer p, ID: IDp
        • For each prefix of IDp, keep a set of peers that share that prefix and whose next digits all differ from one another.
      • Routing:
        • Choose a peer whose ID shares the longest prefix with the target ID.
        • If no peer shares a longer prefix, choose a peer whose ID is numerically closest to the target ID.
      • Analytical properties similar to Chord's
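
The two routing rules can be sketched as a single next-hop choice. This is a simplified illustration, not Pastry's full algorithm: it assumes 4-digit hexadecimal IDs, a flat list of known peers in place of a structured routing table and leaf set, and hypothetical helper names.

```python
ID_DIGITS = 4
SPACE = 16 ** ID_DIGITS   # size of the circular namespace (assumed for this example)

def shared_prefix_len(a, b):
    """Number of leading digits that two ID strings have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def ring_distance(a, b):
    """Numeric distance between two IDs on the circular namespace."""
    d = abs(int(a, 16) - int(b, 16))
    return min(d, SPACE - d)

def next_hop(self_id, target, peers):
    """Pick the peer to forward to; returning self_id means the lookup ends here."""
    own = shared_prefix_len(self_id, target)
    longer = [p for p in peers if shared_prefix_len(p, target) > own]
    if longer:
        # Rule 1: prefer the peer sharing the longest prefix with the target.
        return max(longer, key=lambda p: shared_prefix_len(p, target))
    # Rule 2: otherwise fall back to the numerically closest known ID.
    return min(peers + [self_id], key=lambda p: ring_distance(p, target))

hop_a = next_hop("65a1", "d46a", ["d13d", "d4a2", "d462", "9f00"])
hop_b = next_hop("d462", "d46a", ["d480", "1234"])
```

Here `hop_a` is `"d462"`, which shares three digits with the target. `hop_b` is `"d462"` itself, because no known peer improves on the prefix match and none is numerically closer.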
    • The Problem of Churn
      • The continuous process of node arrival and departure in DHTs.
      • One metric of churn is session time: the time between when a node joins the network and the next time it leaves. Lifetime and availability should also be considered.
      • Even the temporary loss of a routing neighbor weakens the correctness and performance of a DHT.
      • Unavailability of neighbors reduces a node’s effective connectivity, forcing it to choose suboptimal routes and increasing failures.
    • Experiment results
      • Pastry fails to complete a majority of lookup requests under heavy churn because nodes wait so long on request messages.
      • Chord performs well and returns consistent results under lower churn rates, but its shortcoming is lookup latency.
      • Under churn, a DHT may:
        • fail to complete a lookup request;
        • complete the lookup but return inconsistent results; or
        • complete the lookup and return consistent results, but suffer a dramatic increase in lookup latency.
    • Handling Churn
      • Factors that affect the behavior of DHTs under churn
          • Reactive versus periodic recovery from neighbor failures.
          • Calculation of good timeout values for lookup messages.
          • Techniques to achieve proximity in neighbor selection.
    • Reactive recovery
      • A node reacts to the loss of one of its existing leaf-set neighbors by sending a copy of its leaf set to every node in that set.
      • The total number of messages is O(k^2) for a leaf set of k nodes.
      • Without churn it is very efficient, as messages are sent only in response to actual changes.
      • It consumes more bandwidth as the churn rate increases.
    • Periodic Recovery
      • A node periodically shares its leaf set with each node in that leaf set.
      • Sharing with only one random leaf-set member per period is a further improvement.
      • The number of messages exchanged is O(log k).
      • It aggregates all changes in each period into a single message.
      • Reactive recovery, by contrast, is prone to positive feedback cycles.
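
The bandwidth trade-off between the two recovery styles can be illustrated with a toy message count. This is a deliberately crude model rather than a measurement: it assumes each observed change triggers one message to each of k leaf-set neighbors in the reactive case, and a fixed number of messages per period in the periodic case. Both function names are hypothetical.

```python
def reactive_messages(events_per_period, k):
    """Messages one node sends per period: one to each of k neighbors per observed change."""
    return [events * k for events in events_per_period]

def periodic_messages(events_per_period, k, random_member_only=False):
    """Messages one node sends per period: a fixed cost, with changes batched together."""
    per_period = 1 if random_member_only else k
    return [per_period for _ in events_per_period]

# Hypothetical workload: no churn, light churn, heavy churn.
events = [0, 1, 5]
reactive = reactive_messages(events, k=8)
periodic = periodic_messages(events, k=8)
periodic_random = periodic_messages(events, k=8, random_member_only=True)
```

Under this model the reactive cost is zero when nothing changes but grows with the churn rate, while the periodic cost stays flat because all changes within a period travel in the same message.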
    • Timeout Calculation
      • Before routing to an alternative neighbor, a node should ensure that the timeout for the outstanding request was judiciously selected.
      • Techniques:
        • TCP-style timeouts (recursive routing):
          • RTO = AVG + 4 × VAR, where AVG and VAR are the observed mean and variation of the round-trip time.
        • Timeouts from virtual coordinates (iterative routing):
          • Uses a distributed machine learning algorithm.
          • The coordinate distance D(N1, N2) is proportional to network latency.
      • Both strategies provide similar mean latency at low churn rates.
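
The TCP-style rule can be sketched as a small per-neighbor estimator. The slides give only the formula RTO = AVG + 4 × VAR; the exponentially weighted updates, the gains of 1/8 and 1/4, and the first-sample initialization below are borrowed from TCP's retransmission timer and are assumptions here, as are the class name and the `initial_rto` default.

```python
class RtoEstimator:
    """Tracks a smoothed round-trip time and its variation for one neighbor."""

    def __init__(self, alpha=0.125, beta=0.25, initial_rto=1000.0):
        self.alpha = alpha              # gain for the average (assumed, as in TCP)
        self.beta = beta                # gain for the variation (assumed, as in TCP)
        self.initial_rto = initial_rto  # used until the first sample arrives (assumed)
        self.avg = None
        self.var = None

    def observe(self, rtt):
        """Fold one measured round-trip time into the running estimates."""
        if self.avg is None:
            self.avg, self.var = rtt, rtt / 2
        else:
            # Update the variation against the old average, then the average itself.
            self.var = (1 - self.beta) * self.var + self.beta * abs(rtt - self.avg)
            self.avg = (1 - self.alpha) * self.avg + self.alpha * rtt

    def rto(self):
        """Timeout to wait before trying an alternative neighbor."""
        if self.avg is None:
            return self.initial_rto
        return self.avg + 4 * self.var

est = RtoEstimator()
est.observe(100.0)
first = est.rto()
est.observe(100.0)
second = est.rto()
```

With two identical 100 ms samples, the timeout starts conservatively and tightens as the variation estimate decays, which is the behavior that keeps a node from abandoning a slow but live neighbor too early.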
    • Proximity Neighbor Selection
      • The process of choosing among the potential neighbors for any given routing table entry according to their network latency to the choosing node.
      • Techniques:
        • Neighbors’ neighbors.
        • Neighbors’ inverse neighbors.
    • Proximity Neighbor Selection
      • Algorithm:
        • Use one of the techniques to find nodes that may be near the local node.
        • Measure the latency to those nodes.
        • If the routing table slot a node would fill is empty, fill it with the new node; if the slot is already occupied, replace the existing neighbor when the new node is closer.
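
The final step of the algorithm can be sketched as a routing-table update. This is a sketch under assumptions: the table is a dictionary from slot to `(node, latency)`, an occupied slot is replaced only when the candidate's measured latency is lower, and `slot_of` and `measure_latency` are hypothetical callbacks standing in for the DHT's slot rule and a real probe.

```python
def refresh_routing_table(table, candidates, slot_of, measure_latency):
    """Offer each candidate to the slot it would fill; keep whichever node is closer."""
    for node in candidates:
        slot = slot_of(node)
        latency = measure_latency(node)
        current = table.get(slot)
        if current is None or latency < current[1]:
            table[slot] = (node, latency)
    return table

# Hypothetical measurements (milliseconds) and slot assignments.
latencies_ms = {"A": 80.0, "B": 25.0, "C": 60.0, "D": 40.0}
slots = {"A": 0, "B": 0, "C": 1, "D": 2}

# Slot 0 holds a distant neighbor, slot 1 is empty, slot 2 holds a nearby one.
table = {0: ("A", 80.0), 2: ("Z", 10.0)}
refresh_routing_table(table, ["B", "C", "D"], slots.get, latencies_ms.get)
```

After the call, B replaces A in slot 0, C fills the empty slot 1, and Z keeps slot 2 because the candidate D is farther away.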