DATA STRUCTURE
Chapter 10: Hashing & hash table
Prepared & Presented by
Mr. Mahmoud R. Alfarra
2011-2012
College of Science & Technology
Dep. Of Computer Science & IT
BCs of Information Technology
http://mfarra.cst.ps
Outline
 Introduction
 What is hashing?
 What is a hash function?
 Example: Hash function
 The Hashtable class
Introduction
 Hashing is a very common technique for storing
data in such a way the data can be inserted,
retrieved and deleted very quickly.
 Hashing uses a data structure called a hash
table.
 Operations that depend on the order of the keys,
such as finding the minimum or maximum value, are not
performed very quickly, because a hash table does not
keep its items sorted. For these types of
operations, a binary search tree is preferred.
 The .NET Framework library provides a very
useful class for working with hash tables, the
Hashtable class.
What is hashing?
 A hash table data structure is designed around
an array. The array consists of elements 0
through size - 1 for some predetermined size,
though we can increase the size later if necessary.
 Each data item is stored in the array based on
some piece of the data, called the key.
 To store an element in the hash table, the key
is mapped into a number in the range of 0 to
the hash table size minus 1 using a function
called a hash function.
Example: given a student ID (the key), find the matching record (entry).
What is a hash function?
Choosing a hash function depends on the data
type of the key you are using.
If your key is an integer, the simplest function is to
return the key modulo the size of the array.
f(k) = k%size
f(4) = 4%10 = 4
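The modulo rule above can be sketched in C# as follows. This is only a minimal illustration: the class name ModuloHashDemo and the method name Hash are hypothetical, and the handling of negative keys is an added assumption that the slide does not cover.

```csharp
using System;

public static class ModuloHashDemo
{
    // Maps an integer key to an index in the range 0 .. size - 1.
    // In C#, % can return a negative result for a negative key,
    // so size is added before taking % a second time.
    public static int Hash(int key, int size)
    {
        return ((key % size) + size) % size;
    }

    public static void Main()
    {
        Console.WriteLine(Hash(4, 10));   // 4, as in f(4) = 4 % 10
        Console.WriteLine(Hash(27, 10));  // 7
        Console.WriteLine(Hash(-3, 10));  // 7
    }
}
```

Keys that differ by a multiple of the table size (for example 4 and 14 with size 10) map to the same index, which is why collisions have to be handled.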
A hash table supports, on average (with a good hash
function and few collisions):
 fast retrieval O(1)
 fast deletion O(1)
 fast insertion O(1)
In the worst case, when many keys collide, each
operation can degrade to O(n).
Hash method works something like this
Convert a String key into an integer that will be in the
range of 0 through the maximum capacity - 1
Assume the array capacity is 9997
Domain: "!" .. "zzzzzzzz"    Range: 0 ... 9996
"AAAAAAAA" -> hash(key) -> 8482
"zzzzzzzz" -> hash(key) -> 1273
Hash method
 What if the ASCII values of the individual chars of the
string key are added up? With 4 chars max, the sum ranges
from 65 ("A") to possibly 488 ("zzzz")
 If the array has size = 309, mod the sum by TABLE_SIZE
 These array indices store these keys
Key      ASCII sum   sum % TABLE_SIZE
"abba"   390         81
"abcd"   394         85
"able"   404         95
Example: hash function using string
String keys can be used because each ASCII character
equals some unique integer; adding these integers up
gives one integer for the whole key
"able" = 97 + 98 + 108 + 101 = 404
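A minimal C# sketch of this character-sum hash is given below. The names StringHashDemo, TableSize and Hash are hypothetical; the table size of 309 is taken from the slides.

```csharp
using System;

public static class StringHashDemo
{
    // Table size used in the slides.
    public const int TableSize = 309;

    // Adds up the character codes of the key, then reduces
    // the sum to an index in the range 0 .. TableSize - 1.
    public static int Hash(string key)
    {
        int sum = 0;
        foreach (char c in key)
        {
            sum += c;  // a char converts implicitly to its integer code
        }
        return sum % TableSize;
    }

    public static void Main()
    {
        foreach (string key in new[] { "abba", "abcd", "able" })
        {
            Console.WriteLine($"{key} -> {Hash(key)}");
        }
        // abba -> 81
        // abcd -> 85
        // able -> 95
    }
}
```

Because addition ignores the order of the characters, any rearrangement of the same letters produces the same sum. That is one reason this method is too simple for real use.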
A hash table after three insertions
Using the too simple hash code method
Insert objects with these three keys: "abba", "abcd", "abce"
Index   Key
0       (empty)
...
80      (empty)
81      "abba"
82      (empty)
83      (empty)
84      (empty)
85      "abcd"
86      "abce"
...
308     (empty)
Collision occurs while inserting "baab"
"baab" can't be inserted where it hashes to, because that is
the same slot as "abba"
Linear probe forward by 1, inserting it at the next
available slot: try [81], put in [82]
Index   Key
0       (empty)
...
80      (empty)
81      "abba"
82      "baab"
83      (empty)
84      (empty)
85      "abcd"
86      "abce"
...
308     (empty)
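Wrap around when collision occurs at end
Insert "KLMP" and "IKLT", both of which have a hash value of 308
"KLMP" takes [308]; "IKLT" collides there, so the probe
wraps around to [0]
Index   Key
0       "IKLT"
...
80      (empty)
81      "abba"
82      "baab"
83      (empty)
84      (empty)
85      "abcd"
86      "abce"
...
308     "KLMP"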
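The insertions on the last three slides can be sketched in C# as follows. This is an illustrative sketch only: the class LinearProbeTable and its members are hypothetical names, the hash is the too-simple character sum from the slides, and duplicate keys are not checked.

```csharp
using System;

public class LinearProbeTable
{
    private readonly string[] slots;

    public LinearProbeTable(int size)
    {
        slots = new string[size];  // every slot starts out null (empty)
    }

    // Too-simple hash from the slides: character sum modulo table size.
    public int Hash(string key)
    {
        int sum = 0;
        foreach (char c in key) sum += c;
        return sum % slots.Length;
    }

    // Stores the key in the first free slot at or after its hash index.
    // Returns the index used, or -1 if the table is full.
    public int Insert(string key)
    {
        int start = Hash(key);
        for (int i = 0; i < slots.Length; i++)
        {
            // The % makes the probe wrap from the last slot back to slot 0.
            int index = (start + i) % slots.Length;
            if (slots[index] == null)
            {
                slots[index] = key;
                return index;
            }
        }
        return -1;
    }

    public static void Main()
    {
        var table = new LinearProbeTable(309);
        foreach (string key in new[] { "abba", "abcd", "abce", "baab", "KLMP", "IKLT" })
        {
            Console.WriteLine($"{key} stored at [{table.Insert(key)}]");
        }
        // abba stored at [81]
        // abcd stored at [85]
        // abce stored at [86]
        // baab stored at [82]
        // KLMP stored at [308]
        // IKLT stored at [0]
    }
}
```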
Find object with key "baab"
"baab" still hashes to 81, but since [81] does not hold
it, linear probe to [82]
At this point, you could return a reference to it
or remove it
Index   Key
0       "IKLT"
...
80      (empty)
81      "abba"
82      "baab"
83      (empty)
84      (empty)
85      "abcd"
86      "abce"
...
308     "KLMP"
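A lookup that follows the same probe sequence might look like the C# sketch below. The class ProbingLookupTable and its members are hypothetical names; the insert logic is repeated only so that the example is self-contained.

```csharp
using System;

public class ProbingLookupTable
{
    private readonly string[] slots;

    public ProbingLookupTable(int size)
    {
        slots = new string[size];
    }

    // Too-simple hash from the slides: character sum modulo table size.
    public int Hash(string key)
    {
        int sum = 0;
        foreach (char c in key) sum += c;
        return sum % slots.Length;
    }

    // Linear-probe insert; returns the index used, or -1 if the table is full.
    public int Insert(string key)
    {
        int start = Hash(key);
        for (int i = 0; i < slots.Length; i++)
        {
            int index = (start + i) % slots.Length;
            if (slots[index] == null)
            {
                slots[index] = key;
                return index;
            }
        }
        return -1;
    }

    // Returns the index holding the key, or -1 if the key is absent.
    public int Find(string key)
    {
        int start = Hash(key);
        for (int i = 0; i < slots.Length; i++)
        {
            int index = (start + i) % slots.Length;
            if (slots[index] == null) return -1;    // empty slot ends the probe
            if (slots[index] == key) return index;  // == compares string contents
        }
        return -1;  // every slot was checked without a match
    }

    public static void Main()
    {
        var table = new ProbingLookupTable(309);
        foreach (string key in new[] { "abba", "abcd", "abce", "baab", "KLMP", "IKLT" })
        {
            table.Insert(key);
        }
        Console.WriteLine(table.Find("baab"));  // 82
        Console.WriteLine(table.Find("IKLT"));  // 0
        Console.WriteLine(table.Find("zzzz"));  // -1
    }
}
```

Removal needs care with linear probing. If the slot of "abba" were simply set back to empty, a later search for "baab" would stop at [81] and report it missing. A common remedy is to mark the slot as deleted rather than empty, so that probing continues past it.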
The Hashtable class
 The Hashtable class is a dictionary-style
collection that stores key–value pairs,
where the values are stored based on the
hash code derived from the key.
 The strategy the class uses to handle collisions
is the concept of a bucket.
 A bucket is a virtual grouping of objects
together that have the same hash code.
 If two keys have the same hash code, they are
placed in the same bucket.
 Otherwise, each key with a unique hash code
is placed in its own bucket.
 The ratio of the number of elements to the
number of buckets in a Hashtable object is
called the load factor.
Instantiating a Hashtable Object
 The Hashtable class is part of the
System.Collections namespace, so you must add
"using System.Collections;" at the beginning of
your program.
 A Hashtable object can be instantiated in one
of three ways: with the default capacity, with an
initial capacity, or with an initial capacity and
a load factor.
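 The following code demonstrates how to use
these three constructors:
// Default capacity and default load factor
Hashtable symbols1 = new Hashtable();
// Initial capacity of 50 elements
Hashtable symbols2 = new Hashtable(50);
// Initial capacity of 25 elements and a load factor of 0.5
// (the load factor is a float and must be between 0.1 and 1.0)
Hashtable symbols3 = new Hashtable(25, 0.5f);
The ratio of the number of elements in the hash table to the table size is
called the load factor.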
Adding Data to a Hashtable Object
 Key–value pairs are entered into a hash table
using the Add method.
 This method takes two arguments:
1. The key
2. The value associated with the key.
 The key is added to the hash table after
computing its hash value.
 Here is some example code:
Hashtable symbols = new Hashtable(25);
symbols.Add("salary", 100000);
symbols.Add("name", "David Durr");
symbols.Add("age", 43);
symbols.Add("dept", "Information Technology");
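For completeness, the same calls are shown below inside a small runnable program, together with a few ways to read the data back. The class name HashtableDemo and the helper BuildSymbols are hypothetical; the Hashtable members used (Add, the indexer, ContainsKey, Count) belong to System.Collections.Hashtable.

```csharp
using System;
using System.Collections;

public static class HashtableDemo
{
    // Builds the table from the example above.
    public static Hashtable BuildSymbols()
    {
        Hashtable symbols = new Hashtable(25);
        symbols.Add("salary", 100000);
        symbols.Add("name", "David Durr");
        symbols.Add("age", 43);
        symbols.Add("dept", "Information Technology");
        return symbols;
    }

    public static void Main()
    {
        Hashtable symbols = BuildSymbols();

        Console.WriteLine(symbols["name"]);             // David Durr
        Console.WriteLine(symbols.ContainsKey("age"));  // True
        Console.WriteLine(symbols.Count);               // 4

        // Values are stored as object, so a cast is needed to get an int back.
        int salary = (int)symbols["salary"];
        Console.WriteLine(salary + 1);                  // 100001

        // Add throws ArgumentException if the key is already present.
        try
        {
            symbols.Add("age", 44);
        }
        catch (ArgumentException)
        {
            Console.WriteLine("duplicate key");
        }
    }
}
```

Reading a key that is not in the table through the indexer returns null rather than throwing, so ContainsKey is the safer way to test for presence.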
Thank You …
Remember: the question is the key to knowledge
Ahl Eljanna (People of Paradise)
"And hasten to forgiveness from your Lord and a garden as wide as the
heavens and the earth, prepared for the righteous."
(Surah Al Imran 3:133)
