2. Hashing
When implementing a hash table using arrays, the nodes/elements are not stored
consecutively, instead the location of an element/node to be
inserted/deleted/searched is computed using the key and a hash function.
3. Hash Table
• Hash table is a data structure which is designed specifically
with the objective of providing efficient insertion and
searching operations (deletion is not a primary objective).
• Hash Table achieves constant time performance O(1) time
• To achieve constant time performance objective, the
implementation must be based in some way on an array
rather than a linked list.
4. Hash Table
• Hash table is a container which will be used to hold some number
of items of a given set K. We call the elements of the set K keys and
K is called the key space
• In the general case, we expect the size of the set of keys, denoted
as |K|, to be relatively large in comparison with the number of
items stored in the table of size M
• The general approach is to store the keys in an array. The position
(also called location or index) i of a key k in the array is given by a
function hash(k) or f(k), called a hash function
• Hashing – Hash function determines the position of a given key
directly from that key (i.e., i = f(k) is called the hash code of k)
Example :For a Student Record USN is the key
Example USN= 1234
Value = Hash(1234) or value =f(1234) where value is the hash value of
the Student Record with key(USN) 1234.
5. Hashing : Hash Function &
Hash Collision
• Hash Function f
f: K → {0, 1, ..., M-1}.
• Hash function f maps (or transforms) the set of key values(K) to
subscripts (indexes) in an array(Table) of size M.
• In general, since |K| >> M, the mapping defined by a hash
function will be a many-to-one mapping.
• Hash Collision : Hash Collision occurs when hash function
maps distinct keys x and y to the same index, i.e. When x not
equal to y f(x) = f(y) .
• Collisions can be reduced with a selection of a good hash
function
6. Hashing: Hash Function Example
Example : A record with key value 37
f(37) : 37%10
= 7
A record with key value 37 is mapped to index 7
Example : A record with key value 87
f(87) : 87%10
= 7
A record with key value 37 is mapped to index 7
Two records/items with different key values ie
37,87 mapped to same index ie 7 of the hash
table/array results in collision
7. Hashing: Hash Function Example
Example:
•K = {0, 1, ..., 199}, ie Number of different
key values =200
•Hash Table Size (Array Size) M =10
•Hash Function -for each key k in K,
Hash function Defintion
f(k) : k % M
i.e. f(k) : k % Table Size
i.e. f(k) : k % 10
Example : A record with key value 25
f(25) : 25%10
= 5
A record with key value 25 is mapped to
index 5
10. Hash Function
The characteristics of a good hash function are
as follows.
• It avoids collisions.
• It tends to spread keys evenly in the array.
• It is easy to compute (i.e., computational time
of a hash function should be O(1)
11. Collision Resolution Techniques
Collision Resolution Techniques
• Open Hashing : In open hashing, keys are stored in linked lists
attached to cells of a hash table
Chaining( Separate Chaining)
• Closed Hashing(Open Addressing):In closed hashing, all keys
are stored in the hash table itself without the use of linked
lists.
Linear Probing Method
Quadratic Probing Method
Double Hashing Method
12. Separate Chaining
• When an element needs to be searched on the hash table, the hash
function f(k) = k % M will specify the address i within the range [0,
M -1]corresponding the singly linked list i that may contain the
node.
• Searching an element on the hash table turns out the problem of
searching an element on the singly linked list
• The hash table is implemented by using singly linked list.
• Elements on the hash table are hashed into M (e.g., M = 10) singly
linked lists (from list 0 to list M-1).
• Elements conflicted at the address i are directly connected in the
list i, where 0 ≤i ≤M-1.
• When an element with the key k is added into the hash table, the
hash function f(k) = k % M will identify the address i between 0 and
M -1 corresponding the singly linked list i where this node will be
added
14. Linear Probing Method
• All elements are stored in the hash table itself
• The hash table is initialized, all M locations assigned to -1 (i.e., empty or vacant).
• When a node with the key k needs to be added into the hash table, the hash function
f(k) = k % M
will specify the address/index
i = f(k) (i.e., an index of an array)
within the range [0, M-1].
• If there is no conflict (i.e., the cell i is unoccupied), then this node is added into the hash
table at the address i.
• If a collision takes place, then the hash function rehashes first time to consider the
next address (i.e., i + 1).
• If conflict occurs again, then the hash function rehashes second time to examine the
next address (i.e., i + 2).
• This process repeats until the available address found then this node will be added at
this address.
15. 15
Linear probing: Inserting a key
• Idea: when there is a collision, check the next available
position in the table (i.e., probing)
h(k,i) = (h1(k) + i) mod m
i=0,1,2,...
• First slot probed: h1(k)
• Second slot probed: h1(k) + 1
• Third slot probed: h1(k)+2, and so on
• Can generate m probe sequences maximum, why?
probe sequence: < h1(k), h1(k)+1 , h1(k)+2 , ....>
wrap around
16. 16
Linear probing: Deleting a key
• Problems
– Cannot mark the slot as empty
– Impossible to retrieve keys inserted
after that slot was occupied
• Solution
– Mark the slot with a sentinel value
DELETED
• The deleted slot can later be used
for insertion
• Searching will be able to find all the
keys
0
m - 1
26. 26
Double Hashing
(1) Use one hash function to determine the first slot
(2) Use a second hash function to determine the
increment for the probe sequence
h(k,i) = (h1(k) + i h2(k) ) mod m, i=0,1,...
• Initial probe: h1(k)
• Second probe is offset by h2(k) mod m, so on ...
• Advantage: avoids clustering
• Disadvantage: harder to delete an element
• Can generate m2 probe sequences maximum
27. 27
Double Hashing: Example
h1(k) = k mod 13
h2(k) = 1+ (k mod 11)
h(k,i) = (h1(k) + i h2(k) ) mod 13
• Insert key 14:
h1(14,0) = 14 mod 13 = 1
h(14,1) = (h1(14) + h2(14)) mod 13
= (1 + 4) mod 13 = 5
h(14,2) = (h1(14) + 2 h2(14)) mod 13
= (1 + 8) mod 13 = 9
79
69
98
72
50
0
9
4
2
3
1
5
6
7
8
10
11
12
14
29. For the input 30, 20, 56, 75, 31, 19 and hash function h(K) = K mod 11
construct the open hash table.
• find the largest number of key comparisons in a successful search in
this table.
• find the average number of key comparisons in a successful search
in this table.
For the input 30, 20, 56, 75, 31, 19 and hash function h(K) = K mod 11
construct the closed hash table.
• find the largest number of key comparisons in a successful search
in this table
• find the average number of key comparisons in a successful search
in this table.