Open addressing & rehashing, extendible hashing
1.
2. Open addressing hashing is an alternative to
resolving collisions with linked lists.
Separate chaining hashing has the
disadvantage of using linked lists.
This slows the algorithm down a bit because of the time
required to allocate new cells.
It also essentially requires the implementation of a
second data structure.
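4. Linear probing uses f(i) = i, which amounts to trying cells
sequentially (with wraparound) in search of an empty cell.
Slide 6 shows the result of inserting keys {89, 18, 49, 58, 69}
into a hash table
with the collision resolution strategy f(i) = i.
The first collision occurs when 49 is inserted;
it is put in the next available spot, spot 0, which is open.
Expected number of probes at load factor λ:
Unsuccessful search: ½(1 + 1/(1−λ)²)
Successful search: ½(1 + 1/(1−λ))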
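The two estimates can be evaluated for a few load factors to see how quickly the cost grows. This is a small sketch; the function name and the sample load factors are chosen here for illustration only.

```python
def linear_probing_expected_probes(load_factor):
    """Return (unsuccessful, successful) expected probes for a load factor."""
    if not 0 <= load_factor < 1:
        raise ValueError("load factor must satisfy 0 <= load_factor < 1")
    # Unsuccessful search: 1/2 * (1 + 1/(1 - lambda)^2)
    unsuccessful = 0.5 * (1 + 1 / (1 - load_factor) ** 2)
    # Successful search: 1/2 * (1 + 1/(1 - lambda))
    successful = 0.5 * (1 + 1 / (1 - load_factor))
    return unsuccessful, successful


for lam in (0.5, 0.75, 0.9):
    miss, hit = linear_probing_expected_probes(lam)
    print(f"load factor {lam}: unsuccessful {miss:.1f}, successful {hit:.1f}")
```

At a load factor of 0.5 this gives 2.5 probes for an unsuccessful search and 1.5 for a successful one.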
5. Linear Probing: after checking spot h(k), try spot h(k)+1; if that is full,
try h(k)+2, then h(k)+3, etc.
Insert 38, 19, 8, 109, 10 into an empty table with slots 0 to 9.
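The exercise above can be traced with a short sketch. The slide does not state the hash function, so h(k) = k mod 10 is an assumption made here, and the function name is hypothetical.

```python
def linear_probe_insert(table, key):
    """Insert key with linear probing and return the slot that was used."""
    size = len(table)
    home = key % size  # assumed hash function: h(k) = k mod size
    for i in range(size):
        slot = (home + i) % size  # try h(k), h(k)+1, h(k)+2, ... with wraparound
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("table is full")


table = [None] * 10
for key in (38, 19, 8, 109, 10):
    print(key, "-> slot", linear_probe_insert(table, key))
```

Under that assumption the keys land in slots 8, 9, 0, 1 and 2: every key after the first two collides and drifts into the same growing cluster.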
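6. Linear probing, f(i) = i:
Slot  Empty table  After 89  After 18  After 49  After 58  After 69
0     -            -         -         49        49        49
1     -            -         -         -         58        58
2     -            -         -         -         -         69
3     -            -         -         -         -         -
4     -            -         -         -         -         -
5     -            -         -         -         -         -
6     -            -         -         -         -         -
7     -            -         -         -         -         -
8     -            -         18        18        18        18
9     -            89        89        89        89        89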
8. Quadratic probing: show that for all 0 ≤ i, j ≤ size/2 with i ≠ j:
(h(x) + i²) mod size ≠ (h(x) + j²) mod size
By contradiction: suppose that for some i ≠ j:
(h(x) + i²) mod size = (h(x) + j²) mod size
i² mod size = j² mod size
(i² − j²) mod size = 0
[(i + j)(i − j)] mod size = 0
Because size is prime, (i − j) or (i + j) must be divisible by size, and
neither can be: i ≠ j, and both |i − j| and i + j lie strictly between 0 and size.
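The claim can be spot-checked numerically. Since h(x) cancels out of the argument, it is enough to compare the offsets i² mod size. This is only a sanity check on a few sizes chosen here, not a substitute for the proof.

```python
def first_half_probes_distinct(size):
    """Check that i*i mod size is distinct for all 0 <= i <= size // 2."""
    offsets = [(i * i) % size for i in range(size // 2 + 1)]
    return len(set(offsets)) == len(offsets)


# Prime sizes: the first half of the probe sequence never repeats a slot.
print([p for p in (7, 11, 13, 17) if first_half_probes_distinct(p)])
# A non-prime size can repeat early: 0*0 and 4*4 agree mod 16.
print(first_half_probes_distinct(16))
```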
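9. Quadratic probing, f(i) = i²:
Slot  Empty table  After 89  After 18  After 49  After 58  After 69
0     -            -         -         49        49        49
1     -            -         -         -         -         -
2     -            -         -         -         58        58
3     -            -         -         -         -         69
4     -            -         -         -         -         -
5     -            -         -         -         -         -
6     -            -         -         -         -         -
7     -            -         -         -         -         -
8     -            -         18        18        18        18
9     -            89        89        89        89        89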
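The final column of this table can be reproduced with a short sketch, assuming h(x) = x mod 10 (which matches the slots shown) and probing at offsets i². The function name is hypothetical.

```python
def quadratic_probe_insert(table, key):
    """Insert key with quadratic probing and return the slot that was used."""
    size = len(table)
    home = key % size  # assumed hash function: h(x) = x mod size
    for i in range(size):
        slot = (home + i * i) % size  # offsets 0, 1, 4, 9, ...
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("no empty slot found along the probe sequence")


table = [None] * 10
for key in (89, 18, 49, 58, 69):
    quadratic_probe_insert(table, key)
print(table)
```

Here 58 skips from slot 8 past 9 to slot 2, and 69 ends up in slot 3, so the keys no longer pile up next to each other as they do with linear probing.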
10. The last collision resolution method we
examine is double hashing.
For double hashing, a popular choice is f(i) = i⋅hash2(x).
We apply a second hash function to x and probe at a
distance hash2(x), 2hash2(x), ..., and so on.
A function such as hash2(x) = R − (x mod
R), with R a prime smaller than Table
Size, works well.
11. f(i) = i * g(k)
where g is a second hash function
Probe sequence:
0th probe = h(k) mod Table Size
1st probe = (h(k) + g(k)) mod Table Size
2nd probe = (h(k) + 2*g(k)) mod Table Size
3rd probe = (h(k) + 3*g(k)) mod Table Size
. . .
ith probe = (h(k) + i*g(k)) mod Table Size
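The probe sequence can be written as a generator. In this sketch h(k) = k mod Table Size and g(k) = R − (k mod R) are assumptions, as are the sample values (key 49, table size 10, R = 7).

```python
from itertools import islice


def double_hash_probes(key, table_size, r):
    """Yield (h(k) + i * g(k)) mod table_size for i = 0, 1, 2, ..."""
    h = key % table_size  # assumed primary hash: h(k) = k mod table_size
    g = r - (key % r)     # assumed second hash: g(k) = R - (k mod R), never zero
    i = 0
    while True:
        yield (h + i * g) % table_size
        i += 1


# First four probes for key 49 in a table of size 10 with R = 7.
print(list(islice(double_hash_probes(49, 10, 7), 4)))
```

For key 49 the step is g = 7, so the first probes visit slots 9, 6, 3 and 0.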
12. Insert these values into the hash table in this
order, resolving any collisions with double hashing:
13, 28, 33, 147, 43
Hash functions:
H(K) = K mod M
H2(K) = 1 + ((K/M) mod (M-1))
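A minimal sketch of this exercise follows. The slide does not give M, so M = 10 below is purely an assumption for illustration, and K/M is read as integer division, which is also an assumption. The function name is hypothetical.

```python
def double_hash_insert(table, key):
    """Insert key with double hashing; return the slot, or None if probing fails."""
    m = len(table)
    h = key % m                        # H(K) = K mod M
    step = 1 + ((key // m) % (m - 1))  # H2(K), with K/M taken as integer division
    for i in range(m):                 # give up after M probes
        slot = (h + i * step) % m
        if table[slot] is None:
            table[slot] = key
            return slot
    return None


table = [None] * 10  # M = 10 is an assumption; the slide does not state M
for key in (13, 28, 33, 147, 43):
    print(key, "->", double_hash_insert(table, key))
```

Under that assumed M, the first four keys land in slots 3, 8, 7 and 9, but 43 cannot be placed: its step size 5 shares a factor with 10, so the probe sequence only alternates between the occupied slots 3 and 8. A prime table size avoids this.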
13. When the table gets too full, create a bigger
table (usually 2x as large) and hash all the
items from the original table into the new
table.
When to rehash:
1) when the table is half full (λ = 0.5)
2) when an insertion fails
3) at some other threshold
14. A) Open addressing hash table with linear probing with input 13, 15, 6, 24
B) Open addressing hash table with linear probing after 23 is inserted
Slot  A    B
0     6    6
1     15   15
2     -    23
3     24   24
4     -    -
5     -    -
6     13   13
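The sketch below builds table B and then rehashes it. It assumes h(x) = x mod size, which matches the slots shown, and it picks the next prime at or above twice the old size as the new size. That sizing rule and the helper names are assumptions made here, not something the slides prescribe.

```python
def insert(table, key):
    """Insert key with linear probing, assuming h(x) = x mod len(table)."""
    size = len(table)
    for i in range(size):
        slot = (key % size + i) % size
        if table[slot] is None:
            table[slot] = key
            return
    raise RuntimeError("table is full")


def next_prime(n):
    """Smallest prime >= n (trial division, fine for small tables)."""
    def is_prime(m):
        return m >= 2 and all(m % d for d in range(2, int(m ** 0.5) + 1))
    while not is_prime(n):
        n += 1
    return n


def rehash(table):
    """Build a table about twice as large and reinsert every stored key."""
    new_table = [None] * next_prime(2 * len(table))
    for key in table:  # scan the old table slot by slot
        if key is not None:
            insert(new_table, key)
    return new_table


table = [None] * 7
for key in (13, 15, 6, 24, 23):
    insert(table, key)
print(table)              # table B: 5 of 7 slots used
table = rehash(table)
print(len(table), table)  # same keys, now spread over 17 slots
```

Every key is hashed again with the new table size, so positions change: in the 17-slot table 23 moves from slot 2 to slot 7 and 24 from slot 3 to slot 8.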
16. Extendible hashing accesses the data stored in
buckets indirectly through an index that is
dynamically adjusted to reflect changes in the file.
A hash function applied to a certain key indicates
a position in the index and not in the file (or table
of keys). Values returned by such a hash function
are called pseudo keys.
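The idea of an index that grows with the file can be sketched as a directory of bucket references selected by the low-order bits of the pseudo key. The class names, the bucket capacity and the use of Python's built-in hash() as the pseudo-key function are all assumptions for illustration. The sketch also assumes the pseudo keys of distinct keys differ in some bit, otherwise splitting would not terminate.

```python
class Bucket:
    def __init__(self, depth):
        self.depth = depth  # local depth: pseudo-key bits this bucket depends on
        self.keys = []


class ExtendibleHash:
    def __init__(self, bucket_capacity=2):
        self.capacity = bucket_capacity
        self.global_depth = 0
        self.directory = [Bucket(0)]  # the index: 2**global_depth entries

    def _index(self, key):
        pseudo_key = hash(key)  # position in the index, not in the file
        return pseudo_key & ((1 << self.global_depth) - 1)

    def contains(self, key):
        return key in self.directory[self._index(key)].keys

    def insert(self, key):
        while True:
            bucket = self.directory[self._index(key)]
            if key in bucket.keys:
                return
            if len(bucket.keys) < self.capacity:
                bucket.keys.append(key)
                return
            self._split(bucket)  # bucket is full: split it and retry

    def _split(self, bucket):
        if bucket.depth == self.global_depth:
            # Double the index; both halves point at the same buckets.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bucket.depth += 1
        sibling = Bucket(bucket.depth)
        new_bit = 1 << (bucket.depth - 1)
        # Entries whose newly used bit is 1 now point to the sibling.
        for i, b in enumerate(self.directory):
            if b is bucket and i & new_bit:
                self.directory[i] = sibling
        # Redistribute the old keys between the bucket and its sibling.
        old_keys, bucket.keys = bucket.keys, []
        for k in old_keys:
            self.directory[self._index(k)].keys.append(k)


table = ExtendibleHash(bucket_capacity=2)
for key in (16, 4, 6, 22, 24, 10, 31, 7, 9, 20, 26):
    table.insert(key)
print(table.global_depth, len(table.directory))
```

Only the overflowing bucket is split, and the index doubles only when that bucket already uses all the bits the index distinguishes. The rest of the file is left untouched.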
20. Expandable Hashing
Similar to extendible hashing, but a binary tree is used to store the index on the buckets.
Dynamic Hashing
Multiple binary trees are used.
Outcome:
- To shorten the search.
- Based on the key, select which tree to search.
21. Larson's method
The index is simplified to be represented as a set of
binary trees.
The height of each tree is limited.
h(x) is searched in ALL trees.
Time: with m trees and at most k keys in each, overall
m * lg k.
Advantage: shorter search time in the index file.