Hash table

EXAMPLE
• Design a system to store employees' information using their phone number as the key
• Operations: Insert, Search, Delete
• Some possible data structures:
  • Array: O(n) Search and Delete
  • Linked List: O(n) Search and Delete
  • Balanced Binary Search Tree: O(log n) for all operations
  • Direct Access Table: O(1) for all operations, but wastes space (one slot per possible phone number)
=> Hash Table: O(1) on average for all operations

BASICS
• Data structure that implements an associative array
• Maps each key to its corresponding value
• Uses a hash function to compute, for each key-value pair, an index into an array of buckets
• Insert, Search, and Delete take O(1) time on average and O(n) in the worst case
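
As a small illustration of the associative-array interface, the sketch below uses Python's built-in dict (which CPython implements as a hash table) for the employee example. The phone numbers and records are made-up sample data.

```python
# Employee records keyed by phone number (sample data is invented).
employees = {}

# Insert: map each key to its value.
employees["555-0101"] = {"name": "Alice", "dept": "Engineering"}
employees["555-0102"] = {"name": "Bob", "dept": "Sales"}

# Search: look a record up by key; .get returns None for a missing key.
found = employees.get("555-0101")
missing = employees.get("555-9999")

# Delete: remove a record by key.
del employees["555-0102"]
remaining = len(employees)

print(found, missing, remaining)
```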

HASHING
• Distribute the entries (key-value pairs) across an array of buckets
• Hash function: Map data of arbitrary size to data of fixed size
• Two steps:
1. hash = hash_func(key)
2. index = hash % table_size
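
The two steps above can be sketched as follows. FNV-1a (32-bit) is used here only as one example of a simple non-cryptographic hash; it is not prescribed by the slides, and the names `fnv1a_32` and `bucket_index` are illustrative.

```python
FNV_OFFSET_BASIS = 0x811C9DC5  # standard 32-bit FNV-1a offset basis
FNV_PRIME = 0x01000193         # standard 32-bit FNV prime

def fnv1a_32(key: str) -> int:
    """Map a string of arbitrary size to a fixed-size 32-bit integer."""
    h = FNV_OFFSET_BASIS
    for byte in key.encode("utf-8"):
        h ^= byte
        h = (h * FNV_PRIME) & 0xFFFFFFFF  # keep the result within 32 bits
    return h

def bucket_index(key: str, table_size: int) -> int:
    hash_value = fnv1a_32(key)        # step 1: hash = hash_func(key)
    return hash_value % table_size    # step 2: index = hash % table_size

print(bucket_index("555-0101", 16))
```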

CHOOSING A HASH FUNCTION
• Easy to compute
• Uniform Distribution
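
To see why uniform distribution matters, the sketch below counts how many keys land in each bucket for two hash functions. Both functions, the table size of 16, and the generated phone numbers are assumptions chosen for illustration: `length_hash` is deliberately poor, and `polynomial_hash` is a simple multiply-and-add hash.

```python
from collections import Counter

TABLE_SIZE = 16
# 1000 made-up keys: "555-0000" through "555-0999".
keys = [f"555-{n:04d}" for n in range(1000)]

def length_hash(key: str) -> int:
    # Easy to compute, but every key here has the same length,
    # so every key gets the same hash value.
    return len(key)

def polynomial_hash(key: str) -> int:
    # Multiply-and-add hash over the characters, kept within 32 bits.
    h = 0
    for ch in key:
        h = (h * 31 + ord(ch)) & 0xFFFFFFFF
    return h

def bucket_counts(hash_func) -> Counter:
    """Count how many keys fall into each bucket index."""
    return Counter(hash_func(k) % TABLE_SIZE for k in keys)

poor = bucket_counts(length_hash)
better = bucket_counts(polynomial_hash)
print("buckets used (length_hash):", len(poor))
print("buckets used (polynomial_hash):", len(better))
```

With the poor function all keys pile into a single bucket, which turns every lookup into a linear scan of that bucket.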

TYPES OF HASH FUNCTION
• Two types:
  • Cryptographic hash
  • Non-cryptographic hash
• Cryptographic hash aims to provide certain security guarantees
• Non-cryptographic hash provides weaker guarantees than cryptographic hash in exchange for performance improvements
• Examples:
  • Crypto: BLAKE2b, SHA-512, MD5 (no longer considered collision resistant), …
  • Non-crypto: MurmurHash, xxHash, ...
• Main properties of cryptographic hash:
  • Deterministic
  • Quick to compute
  • One-way function
  • Avalanche effect (a small change in the input changes the output extensively)
  • Collision resistant
  • Pre-image attack resistant
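
Two of these properties, determinism and the avalanche effect, can be observed with SHA-512 from Python's standard `hashlib` module. This is only a demonstration sketch; the helper names and the sample messages are invented for illustration.

```python
import hashlib

def sha512_hex(message: str) -> str:
    """Return the SHA-512 digest of a message as a hex string."""
    return hashlib.sha512(message.encode("utf-8")).hexdigest()

def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions in which two hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

digest_a = sha512_hex("hash table")
same_again = sha512_hex("hash table")   # deterministic: same input, same digest
digest_b = sha512_hex("hash tablf")     # only the last character changed

changed_bits = differing_bits(digest_a, digest_b)
print(len(digest_a) * 4, "bit digest;", changed_bits, "bits differ")
```

The digest always has the same size (512 bits) regardless of input length, and a one-character change typically flips a large share of the output bits.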

COLLISION RESOLUTION
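• Collision: two or more keys result in the same hash value (or the same bucket index)
• Practically unavoidable, since the set of possible keys is usually far larger than the number of buckets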
• Handling techniques:
• Separate chaining
• Open addressing

SEPARATE CHAINING
• Make each cell of the hash table point to a linked list of records that have the same hash function value
• Advantages:
  • Simple to implement
  • Hash table never fills up
• Disadvantages:
  • Poor cache performance (list nodes are scattered in memory)
  • Space wastage (extra pointers, and some cells may stay empty)
  • Search time can become O(n) if a chain gets long
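
A minimal sketch of separate chaining is shown below. For brevity it assumes a Python list per bucket in place of a linked list and uses the built-in `hash()`; the class and method names are illustrative.

```python
class ChainedHashTable:
    def __init__(self, num_buckets: int = 8):
        # One chain (here a Python list) per bucket.
        self.buckets = [[] for _ in range(num_buckets)]

    def _chain(self, key):
        """Return the chain that the key hashes to."""
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key, value):
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:               # key already present: overwrite
                chain[i] = (key, value)
                return
        chain.append((key, value))     # colliding keys share one chain

    def search(self, key):
        for k, v in self._chain(key):  # O(chain length)
            if k == key:
                return v
        return None

    def delete(self, key) -> bool:
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                return True
        return False

table = ChainedHashTable(num_buckets=4)
for phone in (1, 5, 9):                # 1, 5 and 9 all map to bucket 1
    table.insert(phone, f"employee-{phone}")
print(table.search(5), len(table.buckets[1]))
```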

OPEN ADDRESSING
• All elements are stored in the hash table itself
• Operations:
  • Insert: Keep probing until an empty slot is found
  • Search: Keep probing until the key is found or an empty slot is reached
  • Delete: Simply emptying a slot can make a later search stop too early and miss keys stored further along the probe sequence, so the slots of deleted keys are specially marked as DELETED. Search continues past a DELETED slot; Insert may reuse it
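
The three operations can be sketched as follows, assuming linear probing as the probe sequence and two sentinel objects for empty and deleted slots. The class name, the sentinels, and the fixed table size are illustrative choices.

```python
EMPTY = object()    # slot that has never held a key
DELETED = object()  # marker left behind by delete

class OpenAddressingTable:
    def __init__(self, size: int = 8):
        self.slots = [EMPTY] * size

    def _probe(self, key):
        """Yield slot indices in linear-probing order, visiting each slot once."""
        start = hash(key) % len(self.slots)
        for i in range(len(self.slots)):
            yield (start + i) % len(self.slots)

    def insert(self, key, value):
        first_free = None
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is EMPTY:
                if first_free is None:
                    first_free = idx
                break                       # key cannot be further along
            if slot is DELETED:
                if first_free is None:
                    first_free = idx        # remember a reusable slot
            elif slot[0] == key:
                self.slots[idx] = (key, value)  # overwrite existing key
                return
        if first_free is None:
            raise RuntimeError("hash table is full")
        self.slots[first_free] = (key, value)

    def search(self, key):
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is EMPTY:
                return None                 # empty slot ends the search
            if slot is not DELETED and slot[0] == key:
                return slot[1]
        return None

    def delete(self, key) -> bool:
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is EMPTY:
                return False
            if slot is not DELETED and slot[0] == key:
                self.slots[idx] = DELETED   # mark, do not empty
                return True
        return False

table = OpenAddressingTable(size=8)
for key in (1, 9, 17):                      # all three hash to slot 1
    table.insert(key, f"employee-{key}")
table.delete(9)                             # slot 2 becomes DELETED
print(table.search(17))                     # still reachable past the marker
```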

OPEN ADDRESSING
Types (S is the table size, i = 0, 1, 2, … is the probe number):
• Linear probing: Linearly probe for the next slot => Clustering
index = [hash(x) + i] % S
• Quadratic probing: Look for the slot i^2 positions away in the i-th iteration
index = [hash(x) + i^2] % S
• Double hashing: Use another hash function hash2(x) and look for the slot i*hash2(x) positions away in the i-th iteration
index = [hash(x) + i*hash2(x)] % S
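
The probe formulas can be compared side by side with a small sketch. The values S = 11, hash(x) = 7, and hash2(x) = 3 are arbitrary assumptions picked for the example; in practice hash2(x) must be nonzero, and it should share no common factor with S so that every slot can be reached.

```python
def linear_probe(h: int, i: int, size: int) -> int:
    return (h + i) % size

def quadratic_probe(h: int, i: int, size: int) -> int:
    return (h + i * i) % size

def double_hash_probe(h: int, h2: int, i: int, size: int) -> int:
    return (h + i * h2) % size

S = 11          # table size (assumed)
h, h2 = 7, 3    # hash(x) and hash2(x) for some key x (assumed values)

linear = [linear_probe(h, i, S) for i in range(5)]
quadratic = [quadratic_probe(h, i, S) for i in range(5)]
double = [double_hash_probe(h, h2, i, S) for i in range(5)]

print(linear)     # [7, 8, 9, 10, 0]
print(quadratic)  # [7, 8, 0, 5, 1]
print(double)     # [7, 10, 2, 5, 8]
```

Linear probing visits adjacent slots, which is what makes it cache friendly and also what produces clusters; the other two schemes spread the probes out.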

OPEN ADDRESSING
Comparison:
• Linear probing:
  • Easy to compute
  • Best cache performance
  • Suffers from clustering
• Quadratic probing:
  • Lies between the other two in both cache performance and clustering
• Double hashing:
  • Poor cache performance
  • No clustering
  • More computation time

OPEN ADDRESSING
• Advantages (compared with separate chaining):
  • Better cache performance
  • Better space usage (no pointers to store)
• Disadvantages:
  • Harder to implement
  • Hash table may become full
  • Clustering

DYNAMIC RESIZING
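• Load factor = number of entries / number of buckets
• When the load factor is too high (or too low) => Dynamic resizing
• Approaches:
  • Complete resizing: allocate a new table and rehash every entry at once
  • Incremental resizing: move entries to the new table gradually, a few per operation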
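
A minimal sketch of complete resizing on top of a chained table is shown below. The 0.75 threshold and the doubling policy are assumptions for illustration, not values given in the slides, and shrinking on a low load factor is left out for brevity.

```python
class ResizingHashTable:
    MAX_LOAD_FACTOR = 0.75  # assumed threshold for growing the table

    def __init__(self, num_buckets: int = 4):
        self.buckets = [[] for _ in range(num_buckets)]
        self.count = 0

    def load_factor(self) -> float:
        # number of entries / number of buckets
        return self.count / len(self.buckets)

    def _resize(self, new_size: int):
        """Complete resizing: build a new bucket array and rehash every entry."""
        old_entries = [entry for chain in self.buckets for entry in chain]
        self.buckets = [[] for _ in range(new_size)]
        for key, value in old_entries:
            self.buckets[hash(key) % new_size].append((key, value))

    def insert(self, key, value):
        chain = self.buckets[hash(key) % len(self.buckets)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)
                return
        chain.append((key, value))
        self.count += 1
        if self.load_factor() > self.MAX_LOAD_FACTOR:
            self._resize(2 * len(self.buckets))  # assumed policy: double

    def search(self, key):
        for k, v in self.buckets[hash(key) % len(self.buckets)]:
            if k == key:
                return v
        return None

table = ResizingHashTable(num_buckets=4)
for n in range(10):
    table.insert(n, f"employee-{n}")
print(len(table.buckets), table.load_factor())  # 16 0.625
```

Starting from 4 buckets, the table grows to 8 on the fourth insert and to 16 on the seventh, so ten entries end at a load factor of 10 / 16.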

USAGE
• Associative Array
• Database Indexing
• Cache
• Set
• …
