Hashing
Department of Computer Science
Islamia College University Peshawar

Fall 2012 Semester
BCS course: CS 00 Analysis of Algorithms

  1. Hashing
     Department of Computer Science, Islamia College University Peshawar
     Fall 2012 Semester
     BCS course: CS 00 Analysis of Algorithms
     Course Instructor: Mr. Zahid
     12/30/13, Lecture #9, adapted from slides by Dr Onaiza Maqbol
  2. Dictionary
     • Holds n records
     • What data structure should be used to implement the dictionary T?
     12/30/13, Lecture #9, adapted from slides by Dr Onaiza Maqbol, Wednesday, March 18, 2009
  3. Hashing
  4. Direct Addressing
     • Assumptions
       • The set of keys is drawn from {0, 1, …, u-1}
       • Keys are distinct
     • Create a table T[0..u-1]
     • Benefit
       • Each operation takes constant time
     • Drawbacks
       • The range of keys can be large
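A minimal sketch of a direct-address table, assuming integer keys in the range 0 to u-1. The class name `DirectAddressTable` and its method names are illustrative and do not come from the slides.

```python
class DirectAddressTable:
    """Direct-address table T[0..u-1]: slot k holds the record with key k."""

    def __init__(self, u):
        self.slots = [None] * u      # one slot per possible key

    def insert(self, key, value):
        self.slots[key] = value      # constant time

    def search(self, key):
        return self.slots[key]       # constant time; None means absent

    def delete(self, key):
        self.slots[key] = None       # constant time


table = DirectAddressTable(u=10)
table.insert(3, "record for key 3")
print(table.search(3))
print(table.search(4))
```

The table occupies u slots no matter how few records are stored, which is the drawback the slide points to when the range of keys is large.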
  5. Hashing
     • Solution
       • Use a hash function h to map the universe U of all keys into {0, 1, …, m-1}
  6. Hash Table
     • The mapped keys are stored in a table called a hash table
     • The table consists of m cells
     • A hash table typically requires much less storage than a direct-address table, because m can be far smaller than |U|
     • With direct addressing, an element with key k is stored in slot k; with hashing, this element is stored in slot h(k)
     • So the hash function is h : U → {0, 1, …, m-1}
     • h(k) is also called the hash value of key k
  7. Hashing Functions - Modulo Function
     • Several functions can be used to map keys into a set of integers. The choice is made on the basis of the computation time required and the simplicity of the computational steps. A common choice is the modulo function, defined as
         h(k) = k mod m
       where k is the key, m is some positive integer, and mod denotes the modulus operator, which computes the remainder when key k is divided by m.
     • It follows that the hash function h maps the set of keys {k1, k2, k3, …, kn} into the set of integers {0, 1, 2, …, m-1}
     • In essence, the modulo function is used to create a hash table of size m
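The modulo function can be tried out with a few lines of Python. This is only an illustration: the table size m = 7 is an arbitrary choice, and the keys are borrowed from the perfect hashing slide later in the deck.

```python
def h(k, m):
    """Modulo hash: the remainder of key k divided by table size m."""
    return k % m


m = 7  # table size chosen for illustration only
keys = [10, 22, 37, 40, 60, 70, 75]
for k in keys:
    print(k, "->", h(k, m))
```

With these values, keys 40 and 75 both land in slot 5, a small instance of the collisions discussed later.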
  8. Modulo Function (contd…)
  9. Hashing Functions - Multiplication Method
  10. Hashing of Strings
  11. ASCII Sum Method
  12. Radix Method
  13. Universal Hashing
  14. Universal Hashing (contd…)
     • h_{a,b}(k) = ((ak + b) mod p) mod m, where p is a prime large enough that every possible key k is in the range 0 to p-1 inclusive, 0 < a < p, and 0 ≤ b < p, belongs to the family of universal hash functions
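A possible sketch of drawing one function at random from this family. The helper name `make_universal_hash` is hypothetical, and p = 101, m = 9 are taken from the perfect hashing example later in the slides purely for illustration.

```python
import random


def make_universal_hash(p, m, rng=random):
    """Pick h_{a,b}(k) = ((a*k + b) mod p) mod m at random from the family."""
    a = rng.randrange(1, p)   # 0 < a < p
    b = rng.randrange(0, p)   # 0 <= b < p

    def h(k):
        return ((a * k + b) % p) % m

    h.a, h.b = a, b           # expose the chosen parameters for inspection
    return h


p, m = 101, 9                 # p is prime and larger than every key used below
h = make_universal_hash(p, m)
print(h.a, h.b, [h(k) for k in (10, 22, 37, 40, 60, 70, 75)])
```

Because a and b are chosen at run time, the printed slots differ between runs; the point of the random choice is that no fixed set of keys is bad for every function in the family.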
  15. Perfect Hashing
  16. Perfect Hashing
     [Figure: two-level hash table with outer slots 0 to 8; slot 2 holds the secondary parameters m2 = 4, a2 = 10, b2 = 18 and a secondary table S2 containing 60 and 75]
     • Using perfect hashing to store {10, 22, 37, 40, 60, 70, 75}. The outer hash function is h_{a,b}(k) = ((ak + b) mod p) mod m, where a = 3, b = 42, p = 101, and m = 9. For example, h(75) = 2. Since h2(75) = 1, key 75 is stored in slot 1 of the secondary hash table S2.
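The numbers on this slide can be checked with a short script. The outer parameters are those stated on the slide; the secondary function for slot 2 is assumed to have the same form with a2 = 10, b2 = 18, m2 = 4 and the same p = 101, which is an inference from the figure labels rather than something the slide states outright.

```python
def make_hash(a, b, p, m):
    """Return k -> ((a*k + b) mod p) mod m."""
    return lambda k: ((a * k + b) % p) % m


keys = [10, 22, 37, 40, 60, 70, 75]
p = 101
h = make_hash(3, 42, p, 9)        # outer hash function from the slide

# Group the keys by their outer slot.
buckets = {}
for k in keys:
    buckets.setdefault(h(k), []).append(k)

h2 = make_hash(10, 18, p, 4)      # assumed secondary hash function for slot 2
s2 = [None] * 4
for k in buckets[2]:
    assert s2[h2(k)] is None      # the secondary table must be collision-free
    s2[h2(k)] = k

print(h(75), h2(75), s2)
```

Under these assumptions the script reproduces h(75) = 2 and h2(75) = 1 from the slide, with 60 and 75 placed in different slots of S2.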
  17. Collisions
     • Two or more keys may hash to the same slot
     • When a record to be inserted maps to an already occupied slot in T, a collision occurs
     • Can we avoid collisions altogether?
       • Not if |U| > m
     • We need a method to resolve the collisions that occur
  18. Collisions
  19. Collision Resolution
     • The two basic approaches to collision resolution are called chained hashing and open address hashing
     • Chained Hashing: in chained hashing the elements of a hash table are stored in a set of linked lists.
       • All colliding elements are kept in one linked list.
       • The list head pointers are usually stored in an array.
       • Chained hashing is also known as open hashing
     • Open Address Hashing: in open address hashing, the hashed keys are stored in the hash table itself.
       • The colliding keys are allocated distinct cells in the table.
       • Open address hashing is also referred to as closed hashing
  20. Collision Resolution by Chaining
     • Records in the same slot are linked into a list
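A rough sketch of chaining, assuming integer keys and the modulo hash function. The class name `ChainedHashTable` is illustrative, and Python lists stand in for the linked lists, so head insertion here is not truly constant time as it would be with a real linked list.

```python
class ChainedHashTable:
    """Hash table with chaining; each slot holds the list of keys hashed to it."""

    def __init__(self, m):
        self.m = m
        self.slots = [[] for _ in range(m)]   # one chain per slot

    def _h(self, key):
        return key % self.m                   # assumed hash function

    def insert(self, key):
        self.slots[self._h(key)].insert(0, key)   # place at the head of the chain

    def search(self, key):
        return key in self.slots[self._h(key)]    # scan only one chain

    def delete(self, key):
        self.slots[self._h(key)].remove(key)


t = ChainedHashTable(7)
for k in (10, 22, 37, 40, 60, 70, 75):
    t.insert(k)
print(t.slots[5])
```

Keys 40 and 75 collide in slot 5, and both remain retrievable because they share one chain.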
  21. Collision Resolution by Chaining (contd…)
  22. Analysis of Hashing with Chaining
     • How long does it take to search for an element with a given key?
     • Let n be the number of keys in the table, and let m be the number of slots
     • Define the load factor of T to be α = n/m, the average number of keys per slot
     • The analysis is in terms of α, which can be less than, equal to, or greater than 1
  23. Worst Hashing - Searching
     • All hash keys are mapped to a single list.
     • This situation may be referred to as the worst distribution of hash keys
     • In practice this extreme situation may not arise, but the possibility does exist
     • The worst-case time for searching is thus Θ(n), plus the time to compute the hash function
     • The best search time is Θ(1), when the key is found in the front node
     • On average, half the list is examined, so the average search time is Θ(n)
  24. Worst Hashing - Insertion
     • If the key is assumed not to be already present in the table, insertion at the head of the list takes Θ(1) time in the worst case
     • If presence must be checked, a search for the key is required first; as just mentioned, the worst-case time of searching is Θ(n)
     • Thus the worst-case running time of insertion with a presence check is Θ(n)
     • The average running time of insertion is then also Θ(n)
  25. Simple Uniform Hashing - Searching
     • The keys are uniformly distributed among all the linked lists, i.e. any given element is assumed equally likely to hash into any of the m slots
     • Denote the length of list T[j], for j = 0, 1, …, m-1, by n_j, so that n = n_0 + n_1 + … + n_{m-1} and the expected value of n_j is E[n_j] = α = n/m
     • We assume that the hash value h(k) can be computed in O(1) time
     • So the time required to search for an element with key k depends linearly on the length n_{h(k)} of the list T[h(k)]
  26. Simple Uniform Hashing - Searching
     • Two cases
       • Unsuccessful search
       • Successful search
     • Unsuccessful search
       • The expected time to search unsuccessfully for a key k is the expected time to search to the end of list T[h(k)], which has expected length E[n_{h(k)}] = α
       • Thus the total time required, including the time to compute h(k), is Θ(1 + α)
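A small simulation can make the α term concrete. It is a sketch under the simple uniform hashing assumption, modelled here by assigning each key to a uniformly random slot; the function name and the values n = 5000, m = 1000 are arbitrary illustrative choices.

```python
import random


def average_unsuccessful_probes(n, m, trials=2000, seed=0):
    """Estimate how many keys an unsuccessful search examines on average."""
    rng = random.Random(seed)
    chain_len = [0] * m
    for _ in range(n):
        chain_len[rng.randrange(m)] += 1      # each key picks a uniform slot
    # An unsuccessful search scans the whole chain of the slot it hashes to.
    total = sum(chain_len[rng.randrange(m)] for _ in range(trials))
    return total / trials


n, m = 5000, 1000
print("alpha =", n / m, "estimated chain scan =", average_unsuccessful_probes(n, m))
```

The estimate should land close to α = n/m; the extra 1 in Θ(1 + α) accounts for computing h(k) and accessing the slot.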
  27. Simple Uniform Hashing - Insertion
     • To find the average time for inserting a key, consider the moment when the kth key is inserted. At that stage the table already holds k-1 keys distributed uniformly over m linked lists. Thus, prior to insertion of the kth key, the average length of each list is (k-1)/m, as shown in the diagram
  28. Simple Uniform Hashing - Insertion
     • Inserting a new key requires examining, on average, (k-1)/m keys, plus the cost of adding the new key.
     • Thus the overall cost of inserting the kth key is 1 + (k-1)/m, assuming that each operation takes unit time.
     • The expected cost of inserting a key is obtained by averaging over all n values of k. Thus the expected cost I is given by
         I = (1/n) Σ_{k=1..n} [1 + (k-1)/m] = 1 + (n-1)/(2m)
     • The average cost of inserting a key is 1 + α/2 - 1/(2m) = Θ(1 + α)
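The closed form can be checked numerically by averaging the per-key cost 1 + (k-1)/m exactly. This is only a sanity check; the function name and the values n = 12, m = 5 are arbitrary.

```python
from fractions import Fraction


def average_insert_cost(n, m):
    """Average of 1 + (k-1)/m over k = 1..n, computed with exact fractions."""
    return sum(1 + Fraction(k - 1, m) for k in range(1, n + 1)) / n


n, m = 12, 5
alpha = Fraction(n, m)
closed_form = 1 + alpha / 2 - Fraction(1, 2 * m)
print(average_insert_cost(n, m), closed_form)
```

Both expressions evaluate to the same fraction, which is consistent with the formula 1 + α/2 - 1/(2m).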
  29. Simple Uniform Hashing - Searching
     • Successful search
       • We assume that the element x being searched for is equally likely to be any of the n elements stored in the table
       • The number of elements examined is one more than the number of elements that appear before x in x's list
       • Because new elements are placed at the head of a list, the elements before x in the list were all inserted after x
       • The total time required for a successful search is 1 + α/2 - α/(2n) = Θ(1 + α)
       • If n = O(m), then α = n/m = O(m)/m = O(1)
       • Thus searching takes constant time on average
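One way to arrive at the successful-search bound, following the standard textbook argument and assuming new elements are inserted at the head of their list. Here x_i denotes the i-th element inserted, and each later element lands in x_i's list with probability 1/m; these symbols are introduced for the derivation only.

```latex
\begin{align*}
E[\text{elements examined}]
  &= \frac{1}{n}\sum_{i=1}^{n}\left(1 + \sum_{j=i+1}^{n}\frac{1}{m}\right) \\
  &= 1 + \frac{1}{nm}\sum_{i=1}^{n}(n-i) \\
  &= 1 + \frac{1}{nm}\cdot\frac{n(n-1)}{2} \\
  &= 1 + \frac{n-1}{2m} \\
  &= 1 + \frac{\alpha}{2} - \frac{\alpha}{2n}.
\end{align*}
```

The last step uses α = n/m, so that 1/(2m) = α/(2n).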
  30. Open Addressing
     • All elements are stored in the hash table itself
     • In open addressing the hash table can fill up, so that no further insertions can be made
     • The load factor α can never exceed 1
     • The advantage is that open addressing avoids pointers altogether
     • The memory freed by not storing pointers gives the hash table a larger number of slots for the same amount of memory
  31. Insertion
     • We successively examine, or probe, the hash table until we find an empty slot in which to put the key
     • The sequence of positions probed depends upon the key being inserted
     • To determine which slots to probe, we extend the hash function to include the probe number as a second input. Thus the hash function becomes
         h : U × {0, 1, …, m-1} → {0, 1, …, m-1}
  32. Pseudo code
     HASH-INSERT(T, k)
       i ← 0
       repeat
           j ← h(k, i)
           if T[j] = NIL
               then T[j] ← k
                    return j
               else i ← i + 1
       until i = m
       error “Table full”
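A possible Python rendering of HASH-INSERT, with NIL represented by `None`. The probe function `h(k, i)` is passed in as a parameter; the one defined below is an assumption made only so the example runs, not part of the pseudocode.

```python
def hash_insert(T, k, h):
    """Insert key k into open-address table T using probe function h(k, i)."""
    m = len(T)
    for i in range(m):            # probe numbers 0 .. m-1
        j = h(k, i)
        if T[j] is None:          # empty (NIL) slot found
            T[j] = k
            return j
    raise OverflowError("Table full")


m = 7


def h(k, i):
    return (k % m + i) % m        # assumed probe function for illustration


T = [None] * m
print([hash_insert(T, k, h) for k in (10, 40, 75)])
print(T)
```

Key 75 hashes to the slot already taken by 40, so it is placed in the next slot its probe sequence visits.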
  33. Linear Probing
     • In linear probing the probe position advances by one slot, wrapping around the table, each time a slot is found occupied. In general the hash function is defined as h(k, i) = (h'(k) + i) mod m for i = 0, 1, …, m-1, where h'(k) is an auxiliary hash function and m is the table size.
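The probe sequence produced by this formula can be listed directly. In this sketch the auxiliary function h'(k) is assumed to be k mod m, and the function name is illustrative.

```python
def linear_probe_sequence(k, m, h_aux=lambda key, size: key % size):
    """Slots visited for key k: (h'(k) + i) mod m for i = 0 .. m-1."""
    start = h_aux(k, m)
    return [(start + i) % m for i in range(m)]


print(linear_probe_sequence(75, 7))
```

The sequence starts at h'(k), moves forward one slot at a time, wraps around, and visits every slot exactly once.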
  34. Linear Probing (contd…)
  35. Searching
     HASH-SEARCH(T, k)
       i ← 0
       repeat
           j ← h(k, i)
           if T[j] = k
               then return j
           i ← i + 1
       until T[j] = NIL or i = m
       return NIL
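HASH-SEARCH might look as follows in Python, again with NIL represented by `None`. The probe function and the table contents below are assumptions chosen to keep the example self-contained.

```python
def hash_search(T, k, h):
    """Return the slot holding key k, or None if k is not in table T."""
    m = len(T)
    for i in range(m):
        j = h(k, i)
        if T[j] == k:
            return j              # key found
        if T[j] is None:
            return None           # empty slot reached: k cannot be further along
    return None                   # every slot probed without finding k


m = 7


def h(k, i):
    return (k % m + i) % m        # assumed linear probing function


# Table as it would look after inserting 10, 40, 75 with this probe function.
T = [None, None, None, 10, None, 40, 75]
print(hash_search(T, 75, h), hash_search(T, 12, h))
```

The search follows the same probe sequence that insertion used, so it can stop at the first empty slot.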
  36. Quadratic Probing
  37. Quadratic Probing
  38. Quadratic Probing