2. Hashing
• Another important and widely useful
technique for implementing dictionaries
• Constant time per operation (on the
average)
• Worst case time proportional to the size
of the set for each operation (just like
array and chain implementation)
3. Basic Idea
• Use hash function to map keys into
positions in a hash table
Ideally
• If element e has key k and h is hash
function, then e is stored in position h(k)
of table
• To search for e, compute h(k) to locate
position. If no element, dictionary does
not contain e.
4. Example
• Dictionary Student Records
– Keys are ID numbers (951000 - 952000),
no more than 1000 students
– Hash function: h(k) = k-951000 maps ID
into distinct table positions 0-1000
– array table[1001]
hash table buckets: 0 1 2 3 ... 1000
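The ideal case in this example can be sketched in C++ as a direct-address table. This is a minimal illustration, not code from the slides: StudentTable, insert, search, and the stored name string are hypothetical names; only h(k) = k - 951000 and the 1001 table positions come from the example.

```cpp
#include <string>
#include <vector>

const int kBaseId = 951000;   // smallest ID in the example
const int kTableSize = 1001;  // positions 0..1000

// Ideal hash function from the example: every valid ID gets its own position.
int h(int id) { return id - kBaseId; }

// Hypothetical student table; the member names are illustrative only.
struct StudentTable {
    std::vector<std::string> name;
    std::vector<bool> used;

    // O(b) initialization: every bucket starts out unused.
    StudentTable() : name(kTableSize), used(kTableSize, false) {}

    static bool validId(int id) {
        return id >= kBaseId && id < kBaseId + kTableSize;
    }

    // O(1): store the record at position h(id); false if invalid or taken.
    bool insert(int id, const std::string& studentName) {
        if (!validId(id) || used[h(id)]) return false;
        name[h(id)] = studentName;
        used[h(id)] = true;
        return true;
    }

    // O(1): pointer to the stored name, or nullptr if the ID is absent.
    const std::string* search(int id) const {
        if (!validId(id) || !used[h(id)]) return nullptr;
        return &name[h(id)];
    }
};
```

Because no two valid IDs share a position, insert and search never have to look at more than one bucket.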
5. Analysis (Ideal Case)
• O(b) time to initialize hash table (b
number of positions or buckets in hash
table)
• O(1) time to perform insert, remove,
search
6. Hash Functions
• If key range too large, use hash table with
fewer buckets and a hash function which maps
multiple keys to same bucket:
if h(k1) == h(k2), then k1 and k2 have a collision at
that slot
• Popular hash functions: hashing by division
h(k) = k%D, where D number of buckets in
hash table
• Example: hash table with 11 buckets
h(k) = k%11
80 → 3 (80%11 = 3), 40 → 7, 65 → 10
58 → 3 collision!, 24 → ?, 35 → ?
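Hashing by division is short enough to check by running it. The sketch below assumes non-negative integer keys; hashByDivision and homeBuckets are hypothetical helper names, not part of the slides.

```cpp
#include <vector>

// Hashing by division: h(k) = k % D, where D is the number of buckets.
// Assumes k >= 0 so the remainder is a valid bucket index.
int hashByDivision(int k, int D) { return k % D; }

// Home bucket of each key, in the order the keys are given.
std::vector<int> homeBuckets(const std::vector<int>& keys, int D) {
    std::vector<int> out;
    for (int k : keys) out.push_back(hashByDivision(k, D));
    return out;
}
```

With D = 11, the keys 80, 40, and 65 land in buckets 3, 7, and 10, and 58 lands in bucket 3 again, which is the collision noted in the example.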
7. Hash Table
• In this example, the key
is a long integer field
called Number.
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] . . . [ 700]
[ 4 ] Number 506643548
8. Hash Table
• The number might be a
person's identification
number, and the rest of
the record has
information about the
person.
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] . . . [ 700]
[ 4 ] Number 506643548
9. • Hash table positions are indexed 0 through D-1
and are called buckets
• h(k) is the home bucket for the element with key k
• collision – the home bucket is already occupied
by an element with a different key (a bucket may
have space for more than one element)
• overflow – there is no room for the new element
in its home bucket
• Linear open addressing – on overflow, use the
next available bucket
10. Hash Table
• When a hash table is in
use, some spots contain
valid records, and other
spots are "empty".
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] . . . [ 700]
[ 1 ] Number 281942902
[ 2 ] Number 233667136
[ 4 ] Number 506643548
[ 700] Number 155778322
11. Inserting a New Record
• In order to insert a new
record, the key must
somehow be converted
to an array index.
• The index is called the
hash value of the key.
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] . . . [ 700]
[ 1 ] Number 281942902
[ 2 ] Number 233667136
[ 4 ] Number 506643548
[ 700] Number 155778322
New record: Number 580625685
13. Inserting a New Record
• Typical way to create a
hash value:
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] . . . [ 700]
[ 1 ] Number 281942902
[ 2 ] Number 233667136
[ 4 ] Number 506643548
[ 700] Number 155778322
New record: Number 580625685
(Number mod 701)
What is (580625685 mod 701)?
3
14. Inserting a New Record
• The hash value is used
for the location of the
new record.
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] . . . [ 700]
[ 1 ] Number 281942902
[ 2 ] Number 233667136
[ 4 ] Number 506643548
[ 700] Number 155778322
New record: Number 580625685, hash value [3]
16. Collisions
• Here is another new
record to insert, with a
hash value of 2.
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] . . . [ 700]
[ 1 ] Number 281942902
[ 2 ] Number 233667136
[ 3 ] Number 580625685
[ 4 ] Number 506643548
[ 700] Number 155778322
New record: Number 701466868 ("My hash value is [2].")
17. Collisions
• This is called a collision,
because there is already
another valid record at
[2].
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] . . . [ 700]
[ 1 ] Number 281942902
[ 2 ] Number 233667136
[ 3 ] Number 580625685
[ 4 ] Number 506643548
[ 700] Number 155778322
New record: Number 701466868
When a collision occurs,
move forward until you
find an empty spot.
20. Collisions
• This is called a collision,
because there is already
another valid record at
[2].
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] . . . [ 700]
[ 1 ] Number 281942902
[ 2 ] Number 233667136
[ 3 ] Number 580625685
[ 4 ] Number 506643548
[ 5 ] Number 701466868
[ 700] Number 155778322
The new record goes
in the empty spot.
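The insertion walkthrough above can be replayed with a small linear-probing sketch. It is a simplified stand-in for illustration, not the HashTable class defined in these slides: ProbeTable is a hypothetical name, keys are assumed to be non-negative and not inserted twice, and -1 is an assumed marker for an empty spot.

```cpp
#include <vector>

// Minimal linear-probing table of non-negative keys.
// The value -1 marks an empty spot (an assumption of this sketch).
struct ProbeTable {
    std::vector<long long> slot;

    explicit ProbeTable(int size) : slot(size, -1) {}

    int size() const { return static_cast<int>(slot.size()); }

    // Hash value: key mod table size.
    int home(long long key) const { return static_cast<int>(key % size()); }

    // Stores key in the first empty spot at or after its hash value,
    // wrapping around at the end of the array.
    // Returns the index used, or -1 if the table is full.
    int insert(long long key) {
        int start = home(key);
        int j = start;
        do {
            if (slot[j] == -1) {
                slot[j] = key;
                return j;
            }
            j = (j + 1) % size();  // move forward to the next spot
        } while (j != start);
        return -1;
    }
};
```

With a 701-spot table and the record numbers from the slides, 580625685 is stored at [3], and 701466868 hashes to [2], finds [2], [3], and [4] occupied, and is stored at [5].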
21. Class Defn. for Hash Table
template<class E, class K>
class HashTable {
public:
  HashTable(int divisor = 11);
  ~HashTable() {
    delete [] ht;
    delete [] empty;
  }
  bool Search(const K& k, E& e) const;
  HashTable<E,K>& Insert(const E& e);
private:
  int hSearch(const K& k) const;
  int D; // hash function divisor
  E *ht; // hash table array
  bool *empty; // 1D array, empty[i] is true if bucket i is unused
};
22. Constructor for HashTable
template<class E, class K>
HashTable<E,K>::HashTable(int divisor)
{ D = divisor; // Constructor
  ht = new E [D]; // Allocate Hash Array
  empty = new bool [D];
  for (int i = 0; i < D; i++) // Mark every bucket empty
    empty[i] = true;
}
23. Searching for a Key
• The data that's attached
to a key can be found
fairly quickly.
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] . . . [ 700]
[ 1 ] Number 281942902
[ 2 ] Number 233667136
[ 3 ] Number 580625685
[ 4 ] Number 506643548
[ 5 ] Number 701466868
[ 700] Number 155778322
Search key: Number 701466868
24. Searching for a Key
• Calculate the hash value.
• Check that location of the
array for the key.
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] . . . [ 700]
[ 1 ] Number 281942902
[ 2 ] Number 233667136
[ 3 ] Number 580625685
[ 4 ] Number 506643548
[ 5 ] Number 701466868
[ 700] Number 155778322
Search key: Number 701466868 ("My hash value is [2].")
Record checked: "Not me."
25. Searching for a Key
• Keep moving forward until
you find the key, or you
reach an empty spot.
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] . . . [ 700]
[ 1 ] Number 281942902
[ 2 ] Number 233667136
[ 3 ] Number 580625685
[ 4 ] Number 506643548
[ 5 ] Number 701466868
[ 700] Number 155778322
Search key: Number 701466868 ("My hash value is [2].")
Record checked: "Not me."
27. Searching for a Key
• Keep moving forward until
you find the key, or you
reach an empty spot.
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] . . . [ 700]
[ 1 ] Number 281942902
[ 2 ] Number 233667136
[ 3 ] Number 580625685
[ 4 ] Number 506643548
[ 5 ] Number 701466868
[ 700] Number 155778322
Search key: Number 701466868 ("My hash value is [2].")
Record checked: "Yes!"
28. Searching for a Key
• When the item is found, the
information can be copied to
the necessary location.
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] . . . [ 700]
[ 1 ] Number 281942902
[ 2 ] Number 233667136
[ 3 ] Number 580625685
[ 4 ] Number 506643548
[ 5 ] Number 701466868
[ 700] Number 155778322
Search key: Number 701466868 ("My hash value is [2].")
Record checked: "Yes!"
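The search steps above (compute the hash value, then keep moving forward until the key or an empty spot is found) can be sketched as a free function that also counts how many spots it examines. probeSearch, kEmpty, and the probes parameter are hypothetical names for this illustration; keys are assumed to be non-negative.

```cpp
#include <vector>

const long long kEmpty = -1;  // assumed marker for an empty spot

// Linear-probing search over a table that was filled elsewhere.
// Returns the index holding key, or -1 if the key is absent.
// If probes is not null, it receives the number of spots examined.
int probeSearch(const std::vector<long long>& table, long long key,
                int* probes = nullptr) {
    const int n = static_cast<int>(table.size());
    const int start = static_cast<int>(key % n);  // hash value of the key
    int j = start;
    int count = 0;
    do {
        ++count;
        // Stop when the key is found or an empty spot ends the search.
        if (table[j] == key || table[j] == kEmpty) break;
        j = (j + 1) % n;  // keep moving forward
    } while (j != start);
    if (probes) *probes = count;
    return table[j] == key ? j : -1;
}
```

On the table pictured in these slides, searching for 701466868 examines [2], [3], [4], and [5], so it takes four probes, while 580625685 is found at [3] on the first probe.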
29. Search Function
template<class E, class K>
int HashTable<E,K>::hSearch(const K& k) const
{ // Search an open addressed table
  // Return location of k if present
  // Otherwise return insert point if there is space
  int i = k % D; // home bucket
  int j = i; // start at home bucket
  do {
    if (empty[j] || ht[j] == k) return j;
    j = (j+1) % D; // next bucket
  } while (j != i); // returned to home?
  return j; // table full
}
template<class E, class K>
bool HashTable<E,K>::Search(const K& k, E& e) const
{ int b = hSearch(k); // put element that matches k in e
  if (empty[b] || ht[b] != k)
    return false; // return false if no match
  e = ht[b];
  return true;
}
30. Insertion into a Hash Table
template<class E, class K>
HashTable<E,K>& HashTable<E,K>::Insert(const E& e)
{ // Hash Table Insert
  K k = e; // Extract key
  int b = hSearch(k);
  // check if insert to be done
  if (empty[b]) {
    empty[b] = false;
    ht[b] = e;
    return *this; }
  // no insert, check if duplicate or full
  if (ht[b] == k)
    throw BadInput(); // Duplicate
  throw NoMem(); // Table is full
}
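For readers who want to compile the class, the pieces from the preceding slides can be gathered into one self-contained sketch. Several details here are assumptions rather than part of the slides: std::invalid_argument and std::length_error stand in for BadInput and NoMem, which the slides do not define; copying is disabled because the class owns raw arrays; and Record is a hypothetical element type whose conversion operator supplies the key.

```cpp
#include <stdexcept>

template <class E, class K>
class HashTable {
public:
    explicit HashTable(int divisor = 11)
        : D(divisor), ht(new E[divisor]()), empty(new bool[divisor]) {
        for (int i = 0; i < D; i++) empty[i] = true;  // all buckets start empty
    }
    ~HashTable() { delete[] ht; delete[] empty; }
    HashTable(const HashTable&) = delete;             // owns raw arrays
    HashTable& operator=(const HashTable&) = delete;

    // Copies the element with key k into e; false if there is no match.
    bool Search(const K& k, E& e) const {
        int b = hSearch(k);
        if (empty[b] || static_cast<K>(ht[b]) != k) return false;
        e = ht[b];
        return true;
    }

    HashTable& Insert(const E& e) {
        K k = e;  // extract key through E's conversion to K
        int b = hSearch(k);
        if (empty[b]) { empty[b] = false; ht[b] = e; return *this; }
        if (static_cast<K>(ht[b]) == k)
            throw std::invalid_argument("duplicate key");   // stands in for BadInput
        throw std::length_error("hash table is full");      // stands in for NoMem
    }

private:
    // Location of k if present, else the insert point; if the table is
    // full and k is absent, the home bucket is returned.
    int hSearch(const K& k) const {
        int i = static_cast<int>(k % D);  // home bucket
        int j = i;
        do {
            if (empty[j] || static_cast<K>(ht[j]) == k) return j;
            j = (j + 1) % D;              // next bucket
        } while (j != i);
        return j;
    }

    int D;        // hash function divisor
    E* ht;        // hash table array
    bool* empty;  // empty[i] is true if bucket i is unused
};

// Hypothetical element type: the key is the Number field.
struct Record {
    long long number;
    int payload;
    operator long long() const { return number; }
};
```

A table for the running example would be declared as HashTable<Record, long long> table(701); Insert returns the table itself, so calls can be chained.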
31. Deleting a Record
• Records may also be deleted from a hash
table.
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] . . . [ 700]
[ 1 ] Number 281942902
[ 2 ] Number 233667136
[ 3 ] Number 580625685
[ 4 ] Number 506643548 ("Please delete me.")
[ 5 ] Number 701466868
[ 700] Number 155778322
33. Deleting a Record
[ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] . . . [ 700]
[ 1 ] Number 281942902
[ 2 ] Number 233667136
[ 3 ] Number 580625685
[ 4 ] (record deleted)
[ 5 ] Number 701466868
[ 700] Number 155778322
• Records may also be deleted from a hash table.
• But the location must not be left as an ordinary
"empty spot" since that could interfere with
searches.
• The location must be marked in some special
way so that a search can tell that the spot used
to have something in it.
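One common way to provide the special mark is to give each spot a third state besides empty and occupied. The sketch below assumes that approach; LazyTable, State, and the method names are hypothetical, keys are assumed non-negative, and insert assumes the key is not already present.

```cpp
#include <vector>

// Each spot is in one of three states. Deleted is the "special mark".
enum class State { Empty, Occupied, Deleted };

struct LazyTable {
    std::vector<long long> key;
    std::vector<State> state;

    explicit LazyTable(int size) : key(size, 0), state(size, State::Empty) {}

    int size() const { return static_cast<int>(key.size()); }
    int home(long long k) const { return static_cast<int>(k % size()); }

    // Index of k, or -1. Stops at an Empty spot but walks past Deleted ones.
    int find(long long k) const {
        int start = home(k), j = start;
        do {
            if (state[j] == State::Empty) return -1;
            if (state[j] == State::Occupied && key[j] == k) return j;
            j = (j + 1) % size();
        } while (j != start);
        return -1;
    }

    // Stores k in the first Empty or Deleted spot; -1 if the table is full.
    int insert(long long k) {
        int start = home(k), j = start;
        do {
            if (state[j] != State::Occupied) {
                key[j] = k;
                state[j] = State::Occupied;
                return j;
            }
            j = (j + 1) % size();
        } while (j != start);
        return -1;
    }

    // Marks the spot as Deleted instead of Empty so later searches continue.
    bool remove(long long k) {
        int j = find(k);
        if (j < 0) return false;
        state[j] = State::Deleted;
        return true;
    }
};
```

In the running example, removing 506643548 from [4] leaves a Deleted mark there, so a search for 701466868 (hash value 2) still walks past [4] and finds the record at [5]. Had [4] been reset to an ordinary empty spot, that search would have stopped early and reported the key as missing.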
35. Summary
• Hash tables store a collection of records with
keys.
• The location of a record depends on the hash
value of the record's key.
• When a collision occurs, the next available
location is used.
• Searching for a particular key is generally quick.
• When an item is deleted, the location must be
marked in a special way, so that searches
know that the spot used to be occupied.