DATA STRUCTURES AND ALGORITHMS
MODULE-5
HASHING TECHNIQUES
Section –A ; Group A5
Presented by:-
Avnish Jha
1901027
Piyush Harsh
1901062
Dhiraj Pandey
1901032
Deepak Kumar
1901031
Abhas Kumar
1901002
TOPICS
Hashing techniques
Hash functions
Common hashing functions and collision resolution
Linear probing
Quadratic probing
Double hashing
Bucket addressing
Rehashing
WHAT IS HASHING?
Hashing is an algorithm (via a hash function) that maps large data sets of
variable length, called keys, to smaller data sets of a fixed length.
A hash table (or hash map) is a data structure that uses a hash function to
efficiently map keys to values, for efficient search and retrieval.
Widely used in many kinds of computer software, particularly for associative
arrays, database indexing, caches, and sets
Different data structures to realize key-based lookup
Binary Tree
Array , Linked List
AVL Tree
B-Tree
Hash Table
Hash Table
A hash table is a data structure that stores elements and allows insertions, lookups, and deletions
to be performed in O(1) time on average.
A hash table is an alternative method for representing a dictionary
In a hash table, a hash function is used to map keys into positions in a table. This act is
called hashing
Hash Table Operations
Search: compute f(k) and see if a pair exists
Insert: compute f(k) and place it in that position
Delete: compute f(k) and delete the pair in that position
In the ideal situation, hash table search, insert, or delete takes Θ(1) time.
How Does it Work?
The hash table itself is just an ordinary array; it is the hash that we are interested in.
The hash is a function that transforms a key into the address or index of the array (table) where the record will
be stored. If the size of the table is N, then the integer will be in the range 0 to N − 1. The integer is used
as an index into the array. Thus, in essence, the key itself indexes the array.
If h is a hash function and k is key then h(k) is called the hash of the key and is the index at which a
record with the key k should be placed.
The hash function generates this address by performing some simple arithmetic or logical operations
on the key.
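As a minimal sketch of the idea above, the division method computes h(k) = k mod N. The function name hash_index and the choice of an unsigned integer key are illustrative assumptions, not something the slides specify.

```c
#include <assert.h>
#include <stddef.h>

/* Division-method hash: maps an integer key to an index in 0 .. n-1.
   The name hash_index is illustrative only. */
size_t hash_index(unsigned long key, size_t n)
{
    assert(n > 0);              /* a table needs at least one slot */
    return (size_t)(key % n);   /* simple arithmetic on the key */
}
```

For example, with N = 7 the key 50 hashes to index 1 and the key 700 hashes to index 0.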
Why Hashing?
The sequential search algorithm takes time proportional to the data size, i.e., O(n).
Binary search improves on linear search, reducing the search time to O(log n).
With a BST, an O (log n) search efficiency can be obtained; but the worst-case complexity is O (n).
To guarantee the O(log n) search time, BST height balancing is required ( i.e., AVL trees).
Why Hashing?
Suppose that we want to store 10,000 student records (each with a 5-digit ID) in a given container.
A linked list implementation would take O (n) time.
A height balanced tree would give O (log n) access time.
Using an array of size 100,000 would give O (1) access time but will lead to a lot of space wastage.
Is there some way that we could get O (1) access without wasting a lot of space?
Yes, the answer is hashing.
What is Hash Function?
Suppose we have a hash table of size N.
Keys are used to identify the data .
A hash function is used to compute a hash value.
A hash value (hash code) is:
Computed from the key with the use of a hash function to get a number in the
range 0 to N − 1
Used as the index (address) of the table entry for the data
Regarded as the “home address” of a key .
Desire: The addresses are different and spread evenly over the range
When two keys have same hash value — collision
Good Hash Functions
Fast to compute, O( 1 )
Scatter keys evenly throughout the hash table
Fewer collisions
Needs fewer slots (space)
The hash function uses all the input data.
The hash function generates very different hash values
for similar strings.
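One function that fits several of these properties for string keys is the 32-bit FNV-1a hash, shown here as a hedged sketch rather than as the function the slides had in mind. It reads every byte of the input and multiplies by a fixed prime at each step, so strings that differ in one character produce different hash values. The helper name string_index and the reduction by modulo are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a: XOR each byte into the hash, then multiply by the FNV prime. */
uint32_t fnv1a_32(const char *s)
{
    assert(s != NULL);
    uint32_t hash = 2166136261u;        /* FNV offset basis */
    for (; *s != '\0'; s++) {
        hash ^= (unsigned char)*s;      /* uses every byte of the input */
        hash *= 16777619u;              /* FNV prime */
    }
    return hash;
}

/* Hypothetical helper: reduce the hash to a table index in 0 .. n-1. */
size_t string_index(const char *s, size_t n)
{
    assert(n > 0);
    return (size_t)(fnv1a_32(s) % n);
}
```

The cost is proportional to the length of the key, which is constant time with respect to the number of entries in the table.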
Perfect Hash Functions
A perfect hash function is a one-to-one mapping between keys and hash values, so no collision
occurs.
Possible if all keys are known in advance.
Applications: compiler and interpreter search for reserved words; shell interpreter searches
for built-in commands.
Minimal perfect hash function: The table size is the same as the number of keywords supplied.
What is Linear Probing?
This section describes the linear probing technique in the open addressing scheme.
There is an ordinary hash function h´(x) : U → {0, 1, . . ., m – 1}.
In the open addressing scheme, the actual hash function h(x, i) takes the ordinary hash function
h´(x) and adds the probe number i to it, giving one linear equation:
h(x, i) = (h´(x) + i) mod m, for i = 0, 1, . . ., m – 1.
Suppose we have a list of size 20 (m = 20). We want to put some elements in linear probing
fashion. The elements are {96, 48, 63, 29, 87, 77, 48, 65, 69, 94, 61}
Hash Table
In linear probing, we probe linearly for the next free slot. The
gap between two successive probes is typically 1, as in the
example below.
Let us consider a simple hash function, "key mod 7",
and the sequence of keys 50, 700, 76, 85, 92, 73, 101.
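The example above can be traced with a short sketch. It assumes non-negative integer keys, uses -1 as an illustrative "empty" marker, and the names lp_insert and lp_search are hypothetical.

```c
#include <assert.h>

#define EMPTY (-1)   /* marks a free slot; assumes keys are non-negative */

/* Insert with linear probing: try (key mod m + i) mod m for i = 0, 1, ...
   Returns the slot used, or -1 if the table is full. */
int lp_insert(int table[], int m, int key)
{
    assert(m > 0 && key >= 0);
    for (int i = 0; i < m; i++) {
        int slot = (key % m + i) % m;
        if (table[slot] == EMPTY) {
            table[slot] = key;
            return slot;
        }
    }
    return -1;
}

/* Search follows the same probe sequence and stops at the first empty slot.
   Returns the slot holding key, or -1 if key is absent. */
int lp_search(const int table[], int m, int key)
{
    assert(m > 0 && key >= 0);
    for (int i = 0; i < m; i++) {
        int slot = (key % m + i) % m;
        if (table[slot] == EMPTY)
            return -1;
        if (table[slot] == key)
            return slot;
    }
    return -1;
}
```

Inserting 50, 700, 76, 85, 92, 73, 101 into a table of size 7 leaves slots 0 to 6 holding 700, 50, 85, 92, 73, 101, 76. Keys 85 and 92 both collide at slot 1 and move on to slots 2 and 3, which in turn pushes 73 and 101 further along.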
Challenges in Linear Probing
1. Primary Clustering: One of the problems with linear probing is
primary clustering: many consecutive occupied slots form groups, and
it starts taking longer to find a free slot or to search for an element.
2. Secondary Clustering: Secondary clustering is less severe; two
records share the same collision chain (probe sequence)
only if their initial position is the same.
What is Double Hashing?
Double hashing is a collision resolution technique in the open addressing scheme.
There is an ordinary hash function h1(x) : U → {0, 1, . . ., m – 1}.
In the open addressing scheme, the actual hash function h(x, i) starts from the ordinary hash function
h1(x); when that slot is not empty, a second hash function h2(x) is used to step to another slot for the
insertion.
h1(x) = x mod m
h2(x) = x mod m′
h(x, i) = (h1(x) + i·h2(x)) mod m
The value of i = 0, 1, . . ., m – 1. We start from i = 0 and increase i until we find a
free slot. Initially, when i = 0, h(x, i) is the same as h1(x).
What is Double Hashing?
Suppose we have a list of size 20 (m = 20).
We want to insert some elements using double hashing.
The elements are {96, 48, 63, 29, 87, 77, 48, 65, 69, 94, 61}
h1(x) = x mod 20
h2(x) = x mod 13
h(x, i) = (h1(x) + i·h2(x)) mod 20
Hash Table
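The insertions for this example can be traced with a small sketch, assuming non-negative integer keys and -1 as an illustrative "empty" marker; the name dh_insert is hypothetical. Note that h2(x) = x mod m′ can evaluate to 0 (for instance x = 65 with m′ = 13), in which case the probe sequence never leaves the home slot. The sketch gives up after m probes rather than looping forever.

```c
#include <assert.h>

#define EMPTY (-1)   /* marks a free slot; assumes keys are non-negative */

/* Insert with double hashing:
   h(x, i) = (h1(x) + i * h2(x)) mod m, h1(x) = x mod m, h2(x) = x mod m2.
   Returns the slot used, or -1 if no free slot is reached in m probes. */
int dh_insert(int table[], int m, int m2, int key)
{
    assert(m > 0 && m2 > 0 && key >= 0);
    int h1 = key % m;
    int h2 = key % m2;
    for (int i = 0; i < m; i++) {
        int slot = (h1 + i * h2) % m;
        if (table[slot] == EMPTY) {
            table[slot] = key;
            return slot;
        }
    }
    return -1;
}
```

With m = 20 and m′ = 13, the eleven elements land in slots 16, 8, 3, 9, 7, 17, 6, 5, 13, 14, and 1 in insertion order. The second 48 collides at slot 8, then at slot 17 (i = 1), and settles in slot 6 (i = 2); 69 collides at slot 9 and moves to slot 13.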
COMMON HASHING FUNCTIONS
Some common hashing algorithms (used for message digests and password storage rather than for indexing hash tables) include:
MD5 (Message Digest algorithm 5)
SHA-1 (Secure Hash Algorithm-1)
SHA-2 (Secure Hash Algorithm-2)
NTLM (NT LAN Manager)
LANMAN (LAN Manager)
COLLISION
● Since a hash function gets us a small number for a key which is a big integer or string,
there is a possibility that two keys result in the same value.
● The situation where a newly inserted key maps to an already occupied slot in the hash
table is called collision.
● Collisions must be handled for an efficient implementation and good performance of hash tables,
and for us to perform the basic operations of searching, adding, and deleting.
Example
A typical example of collision is shown in the image below where keys map to the same hash
value after calculation by the hash function.
Collision resolution
There are mainly two methods to handle collision:
1) Separate Chaining: The idea is to make each cell of hash table point to a linked list of
records that have same hash function value.
2) Open Addressing: In Open Addressing, all elements are stored in the hash table itself. So
at any point, the size of the table must be greater than or equal to the total number of keys.
Collision Resolution by Chaining.
● In chaining, each location in a hash table stores a pointer to a linked list that
contains all the key values that were hashed to that location.
● That is, location l in the hash table points to the head of the linked list of all the
key values that hashed to l. However, if no key value hashes to l, then location l
in the hash table contains NULL.
● Figure below shows how the key values are mapped to a location in the hash
table and stored in a linked list that corresponds to that location.
Chaining diagram with example.
Operations on a Chained Hash Table
• Searching for a value in a chained hash table is as simple as scanning a linked list for an
entry with the given key.
• Insertion adds the key to the linked list pointed to by the hashed
location (at the head of the list, which keeps the operation O(1)).
• Deleting a key requires searching the list and removing the element.
• Chained hash tables with linked lists are widely used due to the simplicity of the algorithms
to insert, delete, and search a key.
Efficiency:
• The time complexity of inserting a key in a chained hash table is O(1).
• The cost of deleting and searching a value is given as O(m) where m is the number of
elements in the list of that location.
• Searching and deleting takes more time because these operations scan the entries of the
selected location for the desired key.
• In the worst case, searching a value may take a running time of O(n), where n is the
number of key values stored in the chained hash table.
• This case arises when all the key values are inserted into the linked list of the same
location (of the hash table).
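Code to initialise chained hash table:
#include <stdio.h>   /* printf */
#include <stdlib.h>  /* malloc, free */
typedef struct node_HashTable {
    int value;
    struct node_HashTable *next;   /* self-reference must use the struct tag */
} node;
/* Set all m chain heads to NULL (valid indices are 0 .. m-1). */
void initialiseHashTable(node *hash_table[], int m)
{
    int i;
    for (i = 0; i < m; i++)
        hash_table[i] = NULL;
}
Time complexity: O(m)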
Code to insert a value
/* The element is inserted at the beginning of the linked list whose pointer to its head is
stored in the location given by h(k). The running time of the insert operation is O(1), as the
new key value is always added as the first element of the list .*/
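node *insert_value(node *hash_table[], int val)
{
    node *new_node = (node *)malloc(sizeof(node));
    if (new_node == NULL)
        return NULL;                      /* allocation failed */
    new_node->value = val;
    new_node->next = hash_table[h(val)];  /* old head becomes the second node */
    hash_table[h(val)] = new_node;        /* new node becomes the head */
    return new_node;
}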
Searching a value:
The element is searched in the linked list whose pointer to its head is stored in the location
given by h(k).
If search is successful, the function returns a pointer to the node in the linked list; otherwise
it returns NULL.
The worst case running time of the search operation is given as order of size of the linked
list.
Code to search a value
node *search_value(node *hash_table[], int val)
{
    node *ptr = hash_table[h(val)];
    /* Walk the chain until the value is found or the list ends. */
    while (ptr != NULL && ptr->value != val) {
        ptr = ptr->next;
    }
    return ptr;   /* NULL if val is not in the chain */
}
Deleting a value:
● To delete a node from the linked list whose head is stored at the location given by h(k) in
the hash table, we need to know the address of the node’s predecessor.
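● To do this we keep a pointer, save, that trails one node behind the node being examined. If the node to delete is the head of the list, it has no predecessor and the table entry itself is updated.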
● The running time complexity of the delete operation is same as that of the search
operation because we need to search the predecessor of the node so that the node can be
removed without affecting other nodes in the list.
Code to delete a value
void delete_value(node *hash_table[], int val)
{
    node *save = NULL;                 /* predecessor of ptr */
    node *ptr = hash_table[h(val)];
    while (ptr != NULL && ptr->value != val) {
        save = ptr;
        ptr = ptr->next;
    }
    if (ptr == NULL) {
        printf("\n VALUE NOT FOUND");
        return;
    }
    if (save == NULL)
        hash_table[h(val)] = ptr->next;   /* deleting the head of the chain */
    else
        save->next = ptr->next;
    free(ptr);
}
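To try the chaining operations, they need a hash function h and a table size, which the slides leave unspecified. The following is a minimal self-contained sketch that assumes h(k) = k mod 7 and non-negative keys. TABLE_SIZE, the choice of h, and the status code returned by delete_value are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define TABLE_SIZE 7   /* assumed table size for this sketch */

typedef struct node {
    int value;
    struct node *next;
} node;

/* Assumed hash function: division method on non-negative keys. */
static int h(int key)
{
    assert(key >= 0);
    return key % TABLE_SIZE;
}

/* Insert at the head of the chain: O(1). Returns NULL if malloc fails. */
node *insert_value(node *hash_table[], int val)
{
    node *new_node = malloc(sizeof *new_node);
    if (new_node == NULL)
        return NULL;
    new_node->value = val;
    new_node->next = hash_table[h(val)];
    hash_table[h(val)] = new_node;
    return new_node;
}

/* Returns the node holding val, or NULL if it is not in the table. */
node *search_value(node *hash_table[], int val)
{
    node *ptr = hash_table[h(val)];
    while (ptr != NULL && ptr->value != val)
        ptr = ptr->next;
    return ptr;
}

/* Returns 1 if val was removed, 0 if it was not found. */
int delete_value(node *hash_table[], int val)
{
    node *save = NULL;
    node *ptr = hash_table[h(val)];
    while (ptr != NULL && ptr->value != val) {
        save = ptr;
        ptr = ptr->next;
    }
    if (ptr == NULL)
        return 0;
    if (save == NULL)
        hash_table[h(val)] = ptr->next;   /* removing the head */
    else
        save->next = ptr->next;           /* unlink from the middle or tail */
    free(ptr);
    return 1;
}
```

With this h, the keys 50, 85, and 92 all hash to location 1, so after inserting them in that order the chain at location 1 reads 92, 85, 50.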
Advantages of chaining
• Simple to implement.
• Hash table never fills up, we can always add more elements to the chain.
• Less sensitive to the hash function or load factors.
• It is mostly used when it is unknown how many and how frequently keys may be inserted or
deleted.
Disadvantages of chaining
● Cache performance of chaining is not good as keys are stored using a linked list. Open
addressing provides better cache performance as everything is stored in the same table.
● Wastage of Space (Some Parts of hash table are never used).
● If the chain becomes long, then search time can become O(n) in the worst case.
● Uses extra space for links.
Open addressing technique:
• In Open Addressing, all elements are stored in the hash table itself. So at any point, the
size of the table must be greater than or equal to the total number of keys.
• Insert(k): Keep probing until an empty slot is found. Once an empty slot is found, insert k.
• Search(k): Keep probing until the slot's key equals k or an empty slot is
reached.
• Delete(k): If we simply delete a key, then the search may fail. So slots of deleted keys are
marked specially as “deleted”.
• The insert can insert an item in a deleted slot, but the search doesn’t stop at a deleted
slot.
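The three operations above can be sketched with linear probing and a special "deleted" marker. The sketch assumes non-negative integer keys, uses -1 and -2 as illustrative markers, and assumes that a key is not inserted while it is already present. The function names are hypothetical.

```c
#include <assert.h>

#define EMPTY   (-1)   /* slot has never held a key */
#define DELETED (-2)   /* slot held a key that was removed */

/* Search skips DELETED slots and stops only at an EMPTY slot.
   Returns the slot holding key, or -1 if key is absent. */
int oa_search(const int table[], int m, int key)
{
    assert(m > 0 && key >= 0);
    for (int i = 0; i < m; i++) {
        int slot = (key % m + i) % m;
        if (table[slot] == EMPTY)
            return -1;
        if (table[slot] == key)
            return slot;
    }
    return -1;
}

/* Insert may reuse a DELETED slot. Returns the slot used, or -1 if full. */
int oa_insert(int table[], int m, int key)
{
    assert(m > 0 && key >= 0);
    for (int i = 0; i < m; i++) {
        int slot = (key % m + i) % m;
        if (table[slot] == EMPTY || table[slot] == DELETED) {
            table[slot] = key;
            return slot;
        }
    }
    return -1;
}

/* Delete marks the slot instead of emptying it, so that later searches
   still reach keys stored further along the probe sequence.
   Returns 1 if key was removed, 0 if it was not found. */
int oa_delete(int table[], int m, int key)
{
    int slot = oa_search(table, m, key);
    if (slot < 0)
        return 0;
    table[slot] = DELETED;
    return 1;
}
```

If the deleted slot were simply reset to empty, a search for a key stored beyond it would stop early and wrongly report the key as missing; the marker avoids that.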
Hash Buckets:
In computing, a hash table [hash map] is a data structure that provides virtually direct
access to objects based on a key [a unique String or Integer]. A hash table uses a hash
function to compute an index into an array of buckets or slots, from which the desired
value can be found. Here are the main features of the key used:
● The key used can be your SSN, your telephone number, account number, etc
● Must have unique keys
● Each key is associated with–mapped to–a value
● Hash buckets are used to apportion data items for sorting or lookup purposes. The aim is to
keep the linked lists short so that a specific item can be found within a
shorter time frame
Hash Buckets:
• In case a bucket is completely full, the record will get stored in an
overflow bucket of infinite capacity at the end of the table.
• All buckets share the same overflow bucket
However, a good implementation will use a hash function that distributes
the records evenly among the buckets so that as few records as possible go
into the overflow bucket.
Bucket Hashing:
Closed hashing stores all records directly in the hash table. Each record R with key
value kR has a home position that is h(kR), the slot computed by the hash function.
If R is to be inserted and another record already occupies R's home position, then R
will be stored at some other slot in the table.
It is the business of the collision resolution policy to determine which slot that will be.
Naturally, the same policy must be followed during search as during insertion, so that
any record not found in its home position can be recovered by repeating the collision
resolution process.
Hash Bucket:
One implementation for closed hashing groups hash table slots into buckets. The M slots of the hash
table are divided into B buckets, with each bucket consisting of M/B slots. The hash function assigns
each record to the first slot within one of the buckets.
If this slot is already occupied, then the bucket slots are searched sequentially until an open slot is
found. If a bucket is entirely full, then the record is stored in an overflow bucket of infinite capacity at
the end of the table. All buckets share the same overflow bucket.
A good implementation will use a hash function that distributes the records evenly among the buckets
so that as few records as possible go into the overflow bucket.
When searching for a record, the first step is to hash the key to determine which bucket should contain
the record. The records in this bucket are then searched. If the desired key value is not found and the
bucket still has free slots, then the search is complete.
Hash Buckets:
If the bucket is full, then it is possible that the desired record is
stored in the overflow bucket.
In this case, the overflow bucket must be searched until the record
is found or all records in the overflow bucket have been checked. If
many records are in the overflow bucket, this will be an expensive
process.
Methods:-
Bucket methods are good for implementing hash tables stored on disk, because
the bucket size can be set to the size of a disk block. Whenever search or
insertion occurs, the entire bucket is read into memory. Because the entire
bucket is then in memory, processing an insert or search operation requires only
one disk access, unless the bucket is full. If the bucket is full, then the overflow
bucket must be retrieved from disk as well. Naturally, overflow should be kept
small to minimize unnecessary disk accesses.
Collision Resolution
Bucket hashing is treating the hash table as a two dimensional array instead of a
linear array.
Consider a hash table with S slots that are divided into B buckets, with each
bucket consisting of S/B slots. The hash function assigns each record to the first
slot within one of the buckets. If the slot was already occupied then the bucket
slots are searched sequentially until an empty slot is found. If the bucket is
completely full, the record will be stored in an overflow bucket of infinite
capacity at the end of the table, which is shared by all buckets. This makes
bucket hashing a form of closed hashing. An ideal implementation
will use a hash function that distributes the records evenly among all buckets so
there will be as few records as possible to store in the overflow bucket.
Collision Resolution
Consider a bucket hash table built on an array of size 10 that is divided
into 5 buckets of two slots each; let's demonstrate
how this method works in practice. There is also an overflow
bucket of infinite size on the right to store records when the
home bucket in the main hash table is full. The hash function is
the mod operation (key mod 5).
Collision Resolution
Let us start by inserting the number 18 as our first record. Since we
have 5 buckets, we take mod 5. 18 % 5 is 3. We put this into the top of
B3, which is slot 6 of the hash table.
Now inserting a record for 30. 30 % 5 is 0. 30 goes into B0[0].
Next we insert a record for 38; 38 % 5 is 3 so it will be placed in B3[1].
Next up we have 48. 48 % 5 is 3, but the B3 is already full, hence we
store 48 in the first available slot of our overflow bucket.
We can now try with 20. 20 % 5 is 0; B0[0] is occupied hence it will be
stored in B0[1].
Now if we insert 25, 25 % 5 is 0 and we know both slots of B0 are
occupied now, hence it will end up in our overflow bucket.
When looking for a record, we first take its hash value and search the resulting bucket.
If we search for key value 20, we search in B0, first checking B0[0] which holds a
different value, so we check B0[1] and we find our key.
When searching for the key value 25, we look in B0 sequentially. We see it doesn't hold
our key value and it is full, hence we look through the overflow bucket. First checking
OB[0], then OB[1] and we have found it.
Note that if there are many records in the overflow bucket, this will be an expensive
process.
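The walkthrough above can be reproduced with a short sketch: 5 buckets of 2 slots each and a shared overflow area. An "infinite" overflow bucket is not practical in code, so a fixed capacity (OVERFLOW_CAP) stands in for it; that constant, the -1 empty marker, and the names bt_init, bt_insert, and bt_search are assumptions for illustration. Deletion is not covered.

```c
#include <assert.h>

#define NUM_BUCKETS  5
#define BUCKET_SIZE  2
#define OVERFLOW_CAP 16    /* finite stand-in for the "infinite" overflow bucket */
#define EMPTY        (-1)  /* marks a free slot; assumes keys are non-negative */

typedef struct {
    int slots[NUM_BUCKETS][BUCKET_SIZE];
    int overflow[OVERFLOW_CAP];
    int overflow_count;
} bucket_table;

void bt_init(bucket_table *t)
{
    for (int b = 0; b < NUM_BUCKETS; b++)
        for (int s = 0; s < BUCKET_SIZE; s++)
            t->slots[b][s] = EMPTY;
    t->overflow_count = 0;
}

/* Returns 1 if stored in the home bucket, 2 if stored in overflow, 0 if no room. */
int bt_insert(bucket_table *t, int key)
{
    assert(key >= 0);
    int b = key % NUM_BUCKETS;
    for (int s = 0; s < BUCKET_SIZE; s++) {
        if (t->slots[b][s] == EMPTY) {
            t->slots[b][s] = key;
            return 1;
        }
    }
    if (t->overflow_count < OVERFLOW_CAP) {
        t->overflow[t->overflow_count++] = key;
        return 2;
    }
    return 0;
}

/* Returns 1 if key is present, 0 otherwise. */
int bt_search(const bucket_table *t, int key)
{
    assert(key >= 0);
    int b = key % NUM_BUCKETS;
    for (int s = 0; s < BUCKET_SIZE; s++) {
        if (t->slots[b][s] == key)
            return 1;
        if (t->slots[b][s] == EMPTY)
            return 0;   /* bucket has a free slot, so key cannot be in overflow */
    }
    for (int i = 0; i < t->overflow_count; i++)   /* bucket full: scan overflow */
        if (t->overflow[i] == key)
            return 1;
    return 0;
}
```

Inserting 18, 30, 38, 48, 20, 25 in that order places 18 and 38 in B3, 30 and 20 in B0, and sends 48 and 25 to the overflow area, matching the steps described above.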
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 

Presentation.pptx

  • 1. DATA STRUCTURES AND ALGORITHMS MODULE-5 HASHING TECHNIQUES Section –A ; Group A5 Presented by:- Avnish Jha 1901027 Piyush Harsh 1901062 Dhiraj Pandey 1901032 Deepak Kumar 1901031 Abhas Kumar 1901002
  • 2. TOPICS Hashing techniques Hash functions Common hashing functions and collision resolution Linear probing Quadratic probing Double hashing Bucket addressing Rehashing
  • 3. WHAT IS HASHING? Hashing is an algorithm (via a hash function) that maps large data sets of variable length, called keys, to smaller data sets of a fixed length. A hash table (or hash map) is a data structure that uses a hash function to efficiently map keys to values, for efficient search and retrieval. Widely used in many kinds of computer software, particularly for associative arrays, database indexing, caches, and sets
  • 4. Different data structures to realize a key Binary Tree Array , Linked List AVL Tree B-Tree Hash Table
  • 5. Hash Table A hash table is a data structure that stores elements and allows insertions, lookups, and deletions to be performed in O(1) time on average. A hash table is an alternative method for representing a dictionary. In a hash table, a hash function is used to map keys into positions in a table. This act is called hashing. Hash Table Operations Search: compute f(k) and see if a pair exists. Insert: compute f(k) and place the pair in that position. Delete: compute f(k) and delete the pair in that position. In the ideal situation, a hash table search, insert, or delete takes O(1) time.
  • 6. How Does it Work? The hash table part is just an ordinary array; it is the hash that we are interested in. The hash is a function that transforms a key into the address or index of the array (table) where the record will be stored. If the size of the table is N, then the integer will be in the range 0 to N - 1. The integer is used as an index into the array. Thus, in essence, the key itself indexes the array. If h is a hash function and k is a key, then h(k) is called the hash of the key and is the index at which a record with the key k should be placed. The hash function generates this address by performing some simple arithmetic or logical operations on the key.
  • 7. Why Hashing? The sequential search algorithm takes time proportional to the data size, i.e., O(n). Binary search improves on linear search, reducing the search time to O(log n). With a BST, an O(log n) search efficiency can be obtained, but the worst-case complexity is O(n). To guarantee the O(log n) search time, BST height balancing is required (e.g., AVL trees).
  • 8. Why Hashing? Suppose that we want to store 10,000 student records (each with a 5-digit ID) in a given container. A linked list implementation would take O(n) time. A height-balanced tree would give O(log n) access time. Using an array of size 100,000 would give O(1) access time but would waste a lot of space. Is there some way that we could get O(1) access without wasting a lot of space? Yes, the answer is hashing.
  • 9. What is a Hash Function? Suppose we have a hash table of size N. Keys are used to identify the data. A hash function is used to compute a hash value. A hash value (hash code) is: computed from the key with the use of a hash function to get a number in the range 0 to N − 1; used as the index (address) of the table entry for the data; regarded as the "home address" of a key. Desire: the addresses are different and spread evenly over the range. When two keys have the same hash value, a collision occurs.
  • 10. Good Hash Functions Fast to compute, O(1). Scatter keys evenly throughout the hash table. Fewer collisions. Need fewer slots (space). The hash function uses all the input data. The hash function generates very different hash values for similar strings.
  • 11. Perfect Hash Functions A perfect hash function is a one-to-one mapping between keys and hash values, so no collision occurs. Possible if all keys are known. Applications: compiler and interpreter search for reserved words; shell interpreter searches for built-in commands. Minimal perfect hash function: the table size is the same as the number of keywords supplied.
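To make the "home address" idea concrete, here is a minimal sketch in C. It assumes non-negative integer keys and the division method (key mod N); the names hash_mod and collides are illustrative and do not come from the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Division-method hash (an assumed choice): maps a key to 0..n-1. */
size_t hash_mod(unsigned long key, size_t n)
{
    assert(n > 0);                 /* the table size must be positive */
    return (size_t)(key % n);
}

/* Returns 1 when two distinct keys share the same home address,
   i.e. when they collide in a table of size n; 0 otherwise. */
int collides(unsigned long a, unsigned long b, size_t n)
{
    return a != b && hash_mod(a, n) == hash_mod(b, n);
}
```

With a table of size 7, keys 50 and 85 both hash to index 1 and therefore collide, while 50 and 76 (index 6) do not.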
  • 12. What is Linear Probing? In this section we look at the linear probing technique in the open addressing scheme. There is an ordinary hash function h´(x) : U → {0, 1, . . ., m – 1}. In the open addressing scheme, the actual hash function h(x, i) takes the ordinary hash function h´(x) and adds the probe number i to it, giving one linear equation: h(x, i) = (h´(x) + i) mod m, for i = 0, 1, . . ., m – 1. Suppose we have a list of size 20 (m = 20). We want to put some elements in linear probing fashion. The elements are {96, 48, 63, 29, 87, 77, 48, 65, 69, 94, 61}
  • 14. Hash Table In linear probing, we probe linearly for the next free slot. The typical gap between two probes is 1, as in the example that follows.
  • 15. Let us consider a simple hash function, "key mod 7", and the sequence of keys 50, 700, 76, 85, 92, 73, 101.
  • 16. Challenges in Linear Probing 1. Primary Clustering: One of the problems with linear probing is primary clustering: many consecutive elements form groups, and it starts taking longer to find a free slot or to search for an element. 2. Secondary Clustering: Secondary clustering is less severe: two records have the same collision chain (probe sequence) only if their initial position is the same.
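The "key mod 7" example can be traced with a short sketch in C. It assumes non-negative keys, a probe gap of 1, and -1 as the marker for an empty slot; the names lp_init and lp_insert are illustrative and do not come from the slides.

```c
#include <assert.h>

#define TABLE_SIZE 7
#define EMPTY (-1)              /* assumed marker for a free slot */

/* Mark every slot of the table as free. */
void lp_init(int table[TABLE_SIZE])
{
    for (int i = 0; i < TABLE_SIZE; i++)
        table[i] = EMPTY;
}

/* Insert key with linear probing: try (key mod 7 + i) mod 7 for
   i = 0, 1, ... Returns the slot used, or -1 if the table is full. */
int lp_insert(int table[TABLE_SIZE], int key)
{
    assert(key >= 0);           /* sketch handles non-negative keys only */
    int home = key % TABLE_SIZE;
    for (int i = 0; i < TABLE_SIZE; i++) {
        int slot = (home + i) % TABLE_SIZE;
        if (table[slot] == EMPTY) {
            table[slot] = key;
            return slot;
        }
    }
    return -1;
}
```

Inserting 50, 700, 76, 85, 92, 73, 101 in that order places them in slots 1, 0, 6, 2, 3, 4, 5. Keys 85 and 92 both have home slot 1, and 73 and 101 both have home slot 3, so each is pushed forward to the next free slot, which is the start of a primary cluster.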
  • 17. What is Double Hashing? Double hashing is a technique in the open addressing scheme. There is an ordinary hash function h1(x) : U → {0, 1, . . ., m – 1}. In the open addressing scheme, the actual hash function h(x, i) takes the ordinary hash function h1(x) and, when that slot is not empty, uses a second hash function h2(x) to find another slot for the insertion. h1(x) = x mod m, h2(x) = x mod m′, h(x, i) = (h1(x) + i·h2(x)) mod m. The value of i = 0, 1, . . ., m – 1. We start from i = 0 and increase it until we find a free slot. So initially, when i = 0, h(x, i) is the same as h1(x).
  • 18. What is Double Hashing? Suppose we have a list of size 20 (m = 20). We want to put some elements using double hashing. The elements are {96, 48, 63, 29, 87, 77, 48, 65, 69, 94, 61}. h1(x) = x mod 20, h2(x) = x mod 13, h(x, i) = (h1(x) + i·h2(x)) mod 20
  • 21. COMMON HASHING FUNCTIONS Some common hashing algorithms include: MD5 (Message Digest algorithm 5), SHA-1 (Secure Hash Algorithm 1), SHA-2 (Secure Hash Algorithm 2), NTLM (NT LAN Manager), LANMAN (LAN Manager).
  • 22. COLLISION ● Since a hash function gets us a small number for a key which is a big integer or string, there is a possibility that two keys result in the same value. ● The situation where a newly inserted key maps to an already occupied slot in the hash table is called collision. ● Collision must be handled for efficient implementation and performance of hash functions and for us to perform the basic operations of searching, adding and deletion.
  • 23. Example A typical example of collision is shown in the image below where keys map to the same hash value after calculation by the hash function.
  • 24. Collision resolution There are mainly two methods to handle collisions: 1) Separate Chaining: The idea is to make each cell of the hash table point to a linked list of records that have the same hash function value. 2) Open Addressing: In open addressing, all elements are stored in the hash table itself. So at any point, the size of the table must be greater than or equal to the total number of keys.
  • 25. Collision Resolution by Chaining. ● In chaining, each location in a hash table stores a pointer to a linked list that contains all the key values that were hashed to that location. ● That is, location l in the hash table points to the head of the linked list of all the key values that hashed to l. However, if no key value hashes to l, then location l in the hash table contains NULL. ● Figure below shows how the key values are mapped to a location in the hash table and stored in a linked list that corresponds to that location.
  • 27. Operations on a Chained Hash Table • Searching for a value in a chained hash table is as simple as scanning a linked list for an entry with the given key. • The insertion operation adds the key to the linked list pointed to by the hashed location; adding it at the head of the list keeps insertion O(1). • Deleting a key requires searching the list and removing the element. • Chained hash tables with linked lists are widely used due to the simplicity of the algorithms to insert, delete, and search for a key.
  • 28. Efficiency: • The time complexity of inserting a key in a chained hash table is O(1). • The cost of deleting and searching a value is given as O(m) where m is the number of elements in the list of that location. • Searching and deleting takes more time because these operations scan the entries of the selected location for the desired key. • In the worst case, searching a value may take a running time of O(n), where n is the number of key values stored in the chained hash table. • This case arises when all the key values are inserted into the linked list of the same location (of the hash table).
  • 29. Code to initialise chained hash table:
      #include <stdlib.h>

      typedef struct node {
          int value;
          struct node *next;      /* next entry in the same chain */
      } node;

      /* Set all m slots of the table to NULL (empty chains). */
      void initialiseHashTable(node *hash_table[], int m)
      {
          int i;
          for (i = 0; i < m; i++)
              hash_table[i] = NULL;
      }
    Time complexity: O(m)
  • 30. Code to insert a value
      /* The element is inserted at the beginning of the linked list whose
         head pointer is stored in the location given by h(val), where h is
         the hash function. The running time of the insert operation is O(1),
         as the new key value is always added as the first element of the list. */
      node *insert_value(node *hash_table[], int val)
      {
          node *new_node = (node *)malloc(sizeof(node));
          if (new_node == NULL)
              return NULL;                    /* allocation failed */
          new_node->value = val;
          new_node->next = hash_table[h(val)];
          hash_table[h(val)] = new_node;
          return new_node;
      }
  • 31. Searching a value: The element is searched in the linked list whose pointer to its head is stored in the location given by h(k). If search is successful, the function returns a pointer to the node in the linked list; otherwise it returns NULL. The worst case running time of the search operation is given as order of size of the linked list.
  • 32. Code to search a value
      /* Scan the chain stored at h(val); return the matching node or NULL. */
      node *search_value(node *hash_table[], int val)
      {
          node *ptr = hash_table[h(val)];
          while ((ptr != NULL) && (ptr->value != val))
              ptr = ptr->next;
          return ptr;     /* NULL when the end of the chain was reached */
      }
  • 33. Deleting a value: ● To delete a node from the linked list whose head is stored at the location given by h(k) in the hash table, we need to know the address of the node’s predecessor. ● To do this we need a pointer saver. ● The running time complexity of the delete operation is same as that of the search operation because we need to search the predecessor of the node so that the node can be removed without affecting other nodes in the list.
  • 34. Code to delete a value
      #include <stdio.h>
      #include <stdlib.h>

      void delete_value(node *hash_table[], int val)
      {
          node *save = NULL;                  /* predecessor of ptr */
          node *ptr = hash_table[h(val)];
          while ((ptr != NULL) && (ptr->value != val)) {
              save = ptr;
              ptr = ptr->next;
          }
          if (ptr != NULL) {
              if (save == NULL)
                  hash_table[h(val)] = ptr->next;   /* deleting the head node */
              else
                  save->next = ptr->next;
              free(ptr);
          } else
              printf("\n VALUE NOT FOUND");
      }
  • 35. Advantages of chaining • Simple to implement. • Hash table never fills up, we can always add more elements to the chain. • Less sensitive to the hash function or load factors. • It is mostly used when it is unknown how many and how frequently keys may be inserted or deleted.
  • 36. Disadvantages of chaining ● Cache performance of chaining is not good as keys are stored using a linked list. Open addressing provides better cache performance as everything is stored in the same table. ● Wastage of Space (Some Parts of hash table are never used). ● If the chain becomes long, then search time can become O(n) in the worst case. ● Uses extra space for links.
  • 37. Open addressing technique: • In open addressing, all elements are stored in the hash table itself. So at any point, the size of the table must be greater than or equal to the total number of keys. • Insert(k): Keep probing until an empty slot is found. Once an empty slot is found, insert k. • Search(k): Keep probing until the slot's key equals k or an empty slot is reached. • Delete(k): If we simply delete a key, then a later search may fail. So slots of deleted keys are marked specially as "deleted". • An insert can place an item in a deleted slot, but a search doesn't stop at a deleted slot.
  • 38. Hash Buckets: In computing, a hash table (hash map) is a data structure that provides virtually direct access to objects based on a key (a unique string or integer). A hash table uses a hash function to compute an index into an array of buckets or slots, from which the desired value can be found. Here are the main features of the key used: ● The key used can be your SSN, your telephone number, account number, etc. ● Keys must be unique. ● Each key is associated with (mapped to) a value. ● Hash buckets are used to apportion data items for sorting or lookup purposes. The aim is to shorten the linked lists so that a specific item can be found in less time.
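The insert, search and delete rules above can be sketched in C as follows. This is a minimal illustration, assuming linear probing, non-negative integer keys, a table of size 7, and the sentinel values -1 (empty) and -2 (deleted); all names are hypothetical. For brevity, the insert does not check whether the key is already present.

```c
#include <assert.h>

#define OA_SIZE 7
#define OA_EMPTY   (-1)         /* slot has never held a key */
#define OA_DELETED (-2)         /* slot held a key that was deleted */

/* Mark every slot as never used. */
void oa_init(int t[OA_SIZE])
{
    for (int i = 0; i < OA_SIZE; i++)
        t[i] = OA_EMPTY;
}

/* Search: probe until the key is found or an empty slot is reached.
   Deleted slots are skipped, not treated as the end of the sequence. */
int oa_search(const int t[OA_SIZE], int key)
{
    assert(key >= 0);
    int home = key % OA_SIZE;
    for (int i = 0; i < OA_SIZE; i++) {
        int slot = (home + i) % OA_SIZE;
        if (t[slot] == OA_EMPTY)
            return -1;
        if (t[slot] == key)
            return slot;
    }
    return -1;
}

/* Insert: use the first empty or deleted slot on the probe sequence.
   Returns the slot used, or -1 if the table is full. */
int oa_insert(int t[OA_SIZE], int key)
{
    assert(key >= 0);
    int home = key % OA_SIZE;
    for (int i = 0; i < OA_SIZE; i++) {
        int slot = (home + i) % OA_SIZE;
        if (t[slot] == OA_EMPTY || t[slot] == OA_DELETED) {
            t[slot] = key;
            return slot;
        }
    }
    return -1;
}

/* Delete: mark the slot as deleted so later searches keep probing past it.
   Returns 1 if the key was found and removed, 0 otherwise. */
int oa_delete(int t[OA_SIZE], int key)
{
    int slot = oa_search(t, key);
    if (slot < 0)
        return 0;
    t[slot] = OA_DELETED;
    return 1;
}
```

For example, after inserting 50, 85 and 92 (slots 1, 2, 3) and deleting 85, a search for 92 still succeeds because it passes over the deleted slot 2. A later insert of 15, whose home slot is 1, reuses slot 2.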
  • 40. Hash Buckets: • In case a bucket is completely full, the record will get stored in an overflow bucket of infinite capacity at the end of the table. • All buckets share the same overflow bucket. However, a good implementation will use a hash function that distributes the records evenly among the buckets so that as few records as possible go into the overflow bucket.
  • 41. Bucket Hashing: Closed hashing stores all records directly in the hash table. Each record R with key value kR has a home position that is h(kR), the slot computed by the hash function. If R is to be inserted and another record already occupies R's home position, then R will be stored at some other slot in the table. It is the business of the collision resolution policy to determine which slot that will be. Naturally, the same policy must be followed during search as during insertion, so that any record not found in its home position can be recovered by repeating the collision resolution process.
  • 42. Hash Bucket: One implementation for closed hashing groups hash table slots into buckets. The M slots of the hash table are divided into B buckets, with each bucket consisting of M/B slots. The hash function assigns each record to the first slot within one of the buckets. If this slot is already occupied, then the bucket slots are searched sequentially until an open slot is found. If a bucket is entirely full, then the record is stored in an overflow bucket of infinite capacity at the end of the table. All buckets share the same overflow bucket. A good implementation will use a hash function that distributes the records evenly among the buckets so that as few records as possible go into the overflow bucket. When searching for a record, the first step is to hash the key to determine which bucket should contain the record. The records in this bucket are then searched. If the desired key value is not found and the bucket still has free slots, then the search is complete.
  • 43. Hash Buckets: If the bucket is full, then it is possible that the desired record is stored in the overflow bucket. In this case, the overflow bucket must be searched until the record is found or all records in the overflow bucket have been checked. If many records are in the overflow bucket, this will be an expensive process.
  • 44. Methods:- Bucket methods are good for implementing hash tables stored on disk, because the bucket size can be set to the size of a disk block. Whenever search or insertion occurs, the entire bucket is read into memory. Because the entire bucket is then in memory, processing an insert or search operation requires only one disk access, unless the bucket is full. If the bucket is full, then the overflow bucket must be retrieved from disk as well. Naturally, overflow should be kept small to minimize unnecessary disk accesses.
  • 45. Collision Resolution Bucket hashing treats the hash table as a two-dimensional array instead of a linear array. Consider a hash table with S slots that are divided into B buckets, with each bucket consisting of S/B slots. The hash function assigns each record to the first slot within one of the buckets. If the slot is already occupied, then the bucket slots are searched sequentially until an empty slot is found. If the bucket is completely full, the record will be stored in an overflow bucket of infinite capacity at the end of the table, which is shared by all buckets. This makes bucket hashing a form of closed hashing. An ideal implementation will use a hash function that distributes the records evenly among all buckets so that as few records as possible are stored in the overflow bucket.
  • 46. Collision Resolution Given this bucket hash table for an array of size 10 holding 5 buckets, each bucket having two slots, let's demonstrate how this method works in practice. We also have an overflow bucket of infinite size on the right to store records when the buckets in the main hash table are occupied. I will be using the mod operation as the hash function.
  • 47. Collision Resolution Let us start by inserting the number 18 as our first record. Since we have 5 buckets, we take mod 5. 18 % 5 is 3. We put this into the top of B3, which is slot 6 of the hash table. Now inserting a record for 30. 30 % 5 is 0. 30 goes into B0[0]. Next we insert a record for 38; 38 % 5 is 3, so it will be placed in B3[1]. Next up we have 48. 48 % 5 is 3, but B3 is already full, hence we store 48 in the first available slot of our overflow bucket. We can now try with 20. 20 % 5 is 0; B0[0] is occupied, hence it will be stored in B0[1]. Now if we insert 25, 25 % 5 is 0 and we know both slots of B0 are occupied now, hence it will end up in our overflow bucket.
  • 49. When looking for a record, we first take its hash value and search the resulting bucket. If we search for key value 20, we search in B0, first checking B0[0] which holds a different value, so we check B0[1] and we find our key. When searching for the key value 25, we look in B0 sequentially. We see it doesn't hold our key value and it is full, hence we look through the overflow bucket. First checking OB[0], then OB[1] and we have found it. Note that if there are many records in the overflow bucket, this will be an expensive process.
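The walkthrough above (5 buckets of 2 slots, hash = key mod 5) can be reproduced with the following sketch in C. It makes several assumptions that are not in the slides: keys are non-negative, -1 marks a free slot, the "infinite" overflow bucket is approximated by a fixed array of 16 entries, and records are never deleted. All names are illustrative. Both functions return a flat position: 0 to 9 for the main table (bucket × 2 + slot), and 10 + k for entry k of the overflow bucket.

```c
#include <assert.h>

#define NUM_BUCKETS 5
#define BUCKET_SLOTS 2
#define OVERFLOW_CAP 16         /* assumption: finite stand-in for "infinite" */
#define BT_FREE (-1)            /* assumed marker for a free slot */

typedef struct {
    int slots[NUM_BUCKETS][BUCKET_SLOTS];   /* table viewed as a 2-D array */
    int overflow[OVERFLOW_CAP];             /* overflow shared by all buckets */
    int overflow_count;
} BucketTable;

/* Mark every main-table slot as free and empty the overflow bucket. */
void bt_init(BucketTable *t)
{
    for (int b = 0; b < NUM_BUCKETS; b++)
        for (int j = 0; j < BUCKET_SLOTS; j++)
            t->slots[b][j] = BT_FREE;
    t->overflow_count = 0;
}

/* Insert into the first free slot of bucket key mod 5, else into overflow.
   Returns the flat position used, or -1 if the overflow array is full. */
int bt_insert(BucketTable *t, int key)
{
    assert(key >= 0);
    int b = key % NUM_BUCKETS;
    for (int j = 0; j < BUCKET_SLOTS; j++) {
        if (t->slots[b][j] == BT_FREE) {
            t->slots[b][j] = key;
            return b * BUCKET_SLOTS + j;
        }
    }
    if (t->overflow_count == OVERFLOW_CAP)
        return -1;
    t->overflow[t->overflow_count] = key;
    return NUM_BUCKETS * BUCKET_SLOTS + t->overflow_count++;
}

/* Search the home bucket; only if it is full, scan the overflow bucket.
   Returns the flat position of the key, or -1 if it is absent. */
int bt_search(const BucketTable *t, int key)
{
    assert(key >= 0);
    int b = key % NUM_BUCKETS;
    for (int j = 0; j < BUCKET_SLOTS; j++) {
        if (t->slots[b][j] == key)
            return b * BUCKET_SLOTS + j;
        if (t->slots[b][j] == BT_FREE)
            return -1;          /* bucket has room, so the key is absent */
    }
    for (int k = 0; k < t->overflow_count; k++)
        if (t->overflow[k] == key)
            return NUM_BUCKETS * BUCKET_SLOTS + k;
    return -1;
}
```

Inserting 18, 30, 38, 48, 20, 25 in that order yields positions 6, 0, 7, 10, 1, 11: 18 and 38 fill B3, 48 goes to OB[0], 30 and 20 fill B0, and 25 goes to OB[1]. Searching for 25 then scans B0, finds it full, and locates the key at OB[1].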