Unit – VIII
Searching and Hashing
Prepared By:
Dabbal Singh Mahara
Contents
• Introduction
• Sequential search
• Binary search
• Comparison and efficiency of searching
• Hashing
  – Probing (linear and quadratic)
Introduction
• Searching is the process of finding a given element within a list of elements, which may be stored in sorted or arbitrary order.
• Two basic searching techniques are linear (sequential) search and binary search.
• Linear search is suitable for:
  – Small arrays
  – Unsorted arrays
• Binary search is suitable for:
  – Large arrays
  – Sorted arrays
Linear Search
• In linear search, each element of the array is accessed one by one, in sequence, and compared with the desired element. The search is unsuccessful if all the elements have been accessed and the desired element has not been found.
• In brief: scan for the given element from left to right and return its index if found; otherwise return -1 to indicate "not found".
Algorithm:
int LinearSearch(int A[], int n, int key)
{
    int i;
    for (i = 0; i < n; i++)
    {
        if (A[i] == key)
            return i;   // found: return the index
    }
    return -1;  // -1 indicates unsuccessful search
}
Analysis: Time complexity = O(n) in the worst and average cases; O(1) in the best case (key at the first position).
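The linear search pseudocode can be tried out directly. The following is a minimal Python sketch of the same left-to-right scan; the function name `linear_search` and the sample list are illustrative assumptions, not part of the original slides.

```python
def linear_search(a, key):
    """Return the index of the first occurrence of key in a, or -1 if absent."""
    for i, value in enumerate(a):
        if value == key:
            return i        # found: stop scanning
    return -1               # every element was examined without a match


data = [9, 2, 45, 7, 18]          # unsorted sample data (assumed)
print(linear_search(data, 45))    # 2
print(linear_search(data, 50))    # -1
```

Note that the list does not need to be sorted, which is why linear search remains useful for small or unsorted arrays.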
Binary Search
• Binary search is an efficient algorithm for searching a sorted array.
• It finds the given item using far fewer comparisons than linear search, because each comparison halves the search area.
• Binary search requires the array elements to be sorted first (ascending order is assumed here). The logic behind the technique is:
  i. Find the middle element of the array.
  ii. Compare the middle element with the desired item.
  iii. There are three cases:
     a. If the middle element is the desired item, the search is successful.
     b. If the middle element is less than the desired item, search only the second (right) half of the array.
     c. If the middle element is greater than the desired item, search only the first (left) half of the array.
• Repeat the same process until the element is found or the search area is exhausted.
• Every iteration of the algorithm reduces the search area by half.
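Iterative Algorithm
BinarySearch(A, l, r, key)
{
    while (l <= r)
    {
        m = (l + r) / 2;    // integer division
        if (key == A[m])
        {
            print "Search successful"
            return m;       // stop as soon as the key is found
        }
        else if (key < A[m])
            r = m - 1;      // continue in the left half
        else
            l = m + 1;      // continue in the right half
    }
    print "Unsuccessful search"    // reached only when l > r
    return -1;
}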
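As a runnable counterpart to the iterative pseudocode, here is a minimal Python sketch. It assumes the list is sorted in ascending order and returns the index instead of printing a message; the name `binary_search` is an illustrative choice, and the sample array is the one used in the tracing slides of this unit.

```python
def binary_search(a, key):
    """Iterative binary search on a list sorted in ascending order.

    Returns the index of key, or -1 if key is not present.
    """
    l, r = 0, len(a) - 1
    while l <= r:
        m = (l + r) // 2          # integer division
        if key == a[m]:
            return m
        elif key < a[m]:
            r = m - 1             # continue in the left half
        else:
            l = m + 1             # continue in the right half
    return -1                     # l > r: search area exhausted


a = [2, 5, 7, 9, 18, 45, 53, 59, 67, 72, 88, 95, 101, 104]
print(binary_search(a, 67))   # 8
print(binary_search(a, 50))   # -1
```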
Algorithm: Recursive
BinarySearch(A, l, r, key)
{
    if (l > r)    // empty search area
    {
        print "Unsuccessful search"
        return -1;
    }
    m = (l + r) / 2;    // integer division
    if (key == A[m])
    {
        print "Successful search"
        return m;
    }
    else if (key < A[m])
        return BinarySearch(A, l, m - 1, key);
    else
        return BinarySearch(A, m + 1, r, key);
}
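The recursive formulation can be sketched in Python in the same way. This version assumes an ascending sorted list and treats an empty range (l > r) as the unsuccessful base case; the function name and the sample data are illustrative assumptions.

```python
def binary_search_recursive(a, l, r, key):
    """Recursive binary search in a[l..r]; returns an index or -1."""
    if l > r:                     # empty search area: key is not present
        return -1
    m = (l + r) // 2              # integer division
    if key == a[m]:
        return m
    if key < a[m]:
        return binary_search_recursive(a, l, m - 1, key)   # left half
    return binary_search_recursive(a, m + 1, r, key)       # right half


a = [2, 5, 7, 9, 18, 45, 53, 59, 67, 72, 88, 95, 101, 104]
print(binary_search_recursive(a, 0, len(a) - 1, 2))    # 0
print(binary_search_recursive(a, 0, len(a) - 1, 50))   # -1
```

Each call makes at most one further recursive call on a range half the size, which is the source of the recurrence T(n) = T(n/2) + O(1).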
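Binary Search Example
• Ex. Binary search for 33.
value:   6  13  14  25  33  43  51  53  64  72  84  93  95  96  97
index:   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14
Initially low is at the first element (index 0) and high is at the last element (index 14).
Example Tracing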
Running example:
Input array a[]:
value:    2    5    7    9   18   45   53   59   67   72   88   95  101  104
index:    0    1    2    3    4    5    6    7    8    9   10   11   12   13
Search for key = 2
l    r    mid   Remarks
0    13   6     Key < a[6], i.e. 2 < 53
0    5    2     Key < a[2], i.e. 2 < 7
0    1    0     Key == a[0], i.e. 2 == 2
Therefore, key found at index 0.
Search successful!
Exercise: Trace the binary search algorithm for keys:
i. 67
ii. 50
iii. 250
Input array a[]:
value:    2    5    7    9   18   45   53   59   67   72   88   95  101  104
index:    0    1    2    3    4    5    6    7    8    9   10   11   12   13
Search for key = 67
l    r    mid   Remarks
0    13   6     Key > a[6], i.e. 67 > 53
7    13   10    Key < a[10], i.e. 67 < 88
7    9    8     Key == a[8], i.e. 67 == 67
Therefore, key found at index 8.
Search successful!
Input array a[]:
value:    2    5    7    9   18   45   53   59   67   72   88   95  101  104
index:    0    1    2    3    4    5    6    7    8    9   10   11   12   13
Search for key = 50
l    r    mid   Remarks
0    13   6     Key < a[6], i.e. 50 < 53
0    5    2     Key > a[2], i.e. 50 > 7
3    5    4     Key > a[4], i.e. 50 > 18
5    5    5     Key > a[5], i.e. 50 > 45
6    5    -     l > r, terminate
Therefore, key not found in the array.
Search unsuccessful!
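The trace tables can be generated mechanically. The sketch below records one (l, r, mid) row per loop iteration of the iterative algorithm; the function name `trace_binary_search` is an illustrative assumption, and the array is the 14-element input array from these slides.

```python
def trace_binary_search(a, key):
    """Run binary search and record one (l, r, mid) row per iteration.

    Returns (index, rows); index is -1 when the key is not found.
    """
    rows = []
    l, r = 0, len(a) - 1
    while l <= r:
        mid = (l + r) // 2
        rows.append((l, r, mid))
        if key == a[mid]:
            return mid, rows
        elif key < a[mid]:
            r = mid - 1
        else:
            l = mid + 1
    return -1, rows


a = [2, 5, 7, 9, 18, 45, 53, 59, 67, 72, 88, 95, 101, 104]
for key in (2, 67, 50):
    index, rows = trace_binary_search(a, key)
    print("key =", key, "-> index", index)
    for l, r, mid in rows:
        print("   l =", l, " r =", r, " mid =", mid)
```

The same function can be used to check a hand trace for any other key.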
Efficiency:
Each step of the algorithm halves the search area and does a constant amount of work, so the running time satisfies:
T(n) = T(n/2) + O(1)
     = O(log n)
• In the best case the key is at the middle position and is found in the first comparison, i.e. O(1) time.
• In the worst case the key is absent, or is found only when the search area has shrunk to a single element, so the running time is O(log n).
• In the average case the running time is also O(log n).
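The closed form follows by unrolling the recurrence. A short derivation, assuming for simplicity that n is a power of two and writing the constant cost of one step as c (both are simplifying assumptions, not stated on the slides):

```latex
\begin{aligned}
T(n) &= T(n/2) + c \\
     &= T(n/4) + 2c \\
     &= \cdots \\
     &= T(n/2^{k}) + kc \\
     &= T(1) + c \log_2 n \qquad (\text{taking } k = \log_2 n) \\
     &= O(\log n)
\end{aligned}
```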
Introduction to Hashing
• Suppose that we want to store 10,000 student records (each with a 5-digit ID) in a given container.
  – A linked list implementation would take O(n) access time.
  – A height-balanced tree would give O(log n) access time.
  – Using an array of size 100,000, indexed directly by ID, would give O(1) access time but would waste a lot of space.
• Is there some way that we could get O(1) access without wasting a lot of space?
• The answer is hashing.
• Hashing is a technique for performing insertions, deletions and finds in constant average time, O(1).
• The technique is to compute the location of the desired record directly from its key, so that it can be retrieved in a single access or comparison.
• This data structure, however, is not efficient for operations that require ordering information among the elements, such as findMin, findMax and printing the entire table in sorted order.
Applications
• Database systems
• Symbol tables for compilers
• Data dictionaries
• Browser caches
Hash Table
• The ideal hash table structure is an array of some fixed size, containing the items.
• A stored item needs to have a data member, called the key, that is used to compute the index value for the item.
  – The key could be an integer, a string, etc., e.g. a name or ID that is part of a large employee structure.
• The size of the array is TableSize.
• The items stored in the hash table are indexed by values from 0 to TableSize – 1.
• Each key is mapped into some number in the range 0 to TableSize – 1.
• The mapping is called a hash function.
Example
[Figure: the items (john, 25000), (phil, 31250), (dave, 27500) and (mary, 28200) are each passed, by key, through a hash function that maps them to slots of a hash table indexed 0 to 9.]
Hash Functions
• A hash function, h, is a function which transforms a key from a set, K, into an index in a table of size n:
h: K -> {0, 1, ..., n-2, n-1}
• A key can be a number, a string, a record, etc.
• The size of the set of keys, |K|, is usually much larger than the table size n.
• It is therefore possible for different keys to hash to the same array location. This situation is called a collision, and the colliding keys are called synonyms.
• A common hash function is
h(x) = x mod SIZE
• If key = 27 and SIZE = 10, then
hash address = 27 % 10 = 7
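The division-method hash function takes one line of code. The sketch below, with the illustrative names `h` and `SIZE`, reproduces the computation for key 27 and shows a second key (37, chosen here only as an example) that collides with it.

```python
def h(x, size):
    """Division-method hash: map an integer key to a slot in 0 .. size-1."""
    return x % size


SIZE = 10
print(h(27, SIZE))   # 7
print(h(37, SIZE))   # 7  (27 and 37 collide, so they are synonyms)
```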
• A good hash function should:
  – Minimize collisions.
  – Be easy and quick to compute.
  – Distribute key values evenly in the hash table.
  – Use all the information provided in the key.
Load Factor of a Hash Table
• Load factor of a hash table T:
α = n/m
  – n = number of elements stored in the table
  – m = number of slots in the table
• With separate chaining, α is the average number of elements stored in a chain.
• α can be less than, equal to, or greater than 1.
[Figure: a table T with slots 0 to m - 1; several slots point to chains of elements.]
Collision Resolution
• If, when an element is inserted, it hashes to the same value as an already inserted element, then we have a collision and need to resolve it,
i.e. for two distinct keys k1 and k2,
H(k1) = H(k2) = β
• There are several methods for dealing with this:
  – Separate chaining
  – Open addressing
    • Linear probing
    • Quadratic probing
    • Double hashing
Separate Chaining
• The idea is to keep a list of all elements that hash to
the same value.
– The array elements are pointers to the first nodes of the
lists.
– A new item is inserted to the front of the list.
• Advantages:
– Better space utilization for large items.
– Simple collision handling: searching linked list.
– Overflow: we can store more items than the hash table size.
– Deletion is quick and easy: deletion from the linked list.
Example
Keys: 0, 1, 4, 9, 16, 25, 36, 49, 64, 81
hash(key) = key % 10.
Slot   Chain
0      0
1      81 -> 1
2
3
4      64 -> 4
5      25
6      36 -> 16
7
8
9      49 -> 9
Exercise: Represent the keys {89, 18, 49, 58, 69, 78} in a hash table using separate chaining.
Operations
• Initialization: all entries are set to NULL
• Find:
– locate the cell using hash function.
– sequential search on the linked list in that cell.
• Insertion:
– Locate the cell using hash function.
– (If the item does not exist) insert it as the first item in
the list.
• Deletion:
– Locate the cell using hash function.
– Delete the item from the linked list.
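The four operations can be sketched as a small Python class. This is an illustration under stated assumptions: integer keys, the hash function key % size, and a Python list standing in for each linked list (so "insert at the front" becomes `insert(0, key)`); the class and method names are hypothetical.

```python
class ChainedHashTable:
    """Hash table with separate chaining; each slot holds a list used as a chain."""

    def __init__(self, size=10):
        self.size = size
        self.slots = [[] for _ in range(size)]   # initialization: all chains empty

    def _hash(self, key):
        return key % self.size

    def find(self, key):
        # Locate the cell, then search its chain sequentially.
        return key in self.slots[self._hash(key)]

    def insert(self, key):
        chain = self.slots[self._hash(key)]
        if key not in chain:
            chain.insert(0, key)    # new item goes to the front of the chain

    def delete(self, key):
        chain = self.slots[self._hash(key)]
        if key in chain:
            chain.remove(key)


table = ChainedHashTable(10)
for k in [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]:
    table.insert(k)
print(table.slots[1])   # [81, 1]
print(table.slots[6])   # [36, 16]
```

Because new items are placed at the front, the later key of each colliding pair (for example 81) appears before the earlier one (1) in its chain.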
Collision Resolution with Open Addressing
• Separate chaining has the disadvantage of using linked lists.
  – It requires the implementation of a second data structure.
• In an open addressing hashing system, all the data go inside the table.
  – Thus, a bigger table is needed.
    • Generally the load factor should be below 0.5.
  – If a collision occurs, alternative cells are tried until an empty cell is found.
Open Addressing
• More formally:
  – Cells h_0(x), h_1(x), h_2(x), … are tried in succession,
    where h_i(x) = (hash(x) + f(i)) mod TableSize, with f(0) = 0.
  – The function f is the collision resolution strategy.
• There are three common collision resolution strategies:
  – Linear probing
  – Quadratic probing
  – Double hashing
Linear Probing
• In linear probing, collisions are resolved by sequentially scanning the array (with wraparound) until an empty cell is found.
• h_i(x) = (hash(x) + f(i)) mod TableSize
  – i.e. f is a linear function of i, typically f(i) = i.
• Example: Insert keys {89, 18, 49, 58, 69, 78} with the hash function h(x) = x mod 10 using linear probing. Use table size 10.
• When x = 89:
  h(89) = 89 % 10 = 9
  Insert key 89 at location 9.
• When x = 18:
  h(18) = 18 % 10 = 8
  Insert key 18 at location 8.
• When x = 49:
  h(49) = 49 % 10 = 9 (collision)
  The next vacant location after 9, with wraparound, is 0, so insert key 49 at location 0.
• When x = 58:
  h(58) = 58 % 10 = 8 (collision)
  The next vacant location after 8 is 1 (since 9 and 0 already contain values), so insert key 58 at location 1.
• When x = 69:
  h(69) = 69 % 10 = 9 (collision)
  The next vacant location after 9 is 2 (since 0 and 1 already contain values), so insert key 69 at location 2.
• When x = 78:
  h(78) = 78 % 10 = 8 (collision)
  The next vacant location after 8 is 3 (since 9, 0, 1 and 2 already contain values), so insert key 78 at location 3.
Index   Key
0       49
1       58
2       69
3       78
4
5
6
7
8       18
9       89
Fig. Hash table with keys using linear probing
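The insertions in this example can be reproduced with a few lines of Python. This is a minimal sketch assuming integer keys, h(x) = x mod TableSize and f(i) = i; the function name `insert_linear` and the use of None to mark an empty cell are illustrative choices.

```python
def insert_linear(table, key):
    """Insert key using linear probing, f(i) = i. Returns the slot used."""
    size = len(table)
    home = key % size
    for i in range(size):
        pos = (home + i) % size        # scan forward with wraparound
        if table[pos] is None:
            table[pos] = key
            return pos
    raise RuntimeError("hash table is full")


table = [None] * 10
for key in [89, 18, 49, 58, 69, 78]:
    insert_linear(table, key)
print(table)
# [49, 58, 69, 78, None, None, None, None, 18, 89]
```

The run of occupied cells from index 8 around to index 3 is a small instance of the clustering that linear probing tends to produce.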
Disadvantage of linear probing: the primary clustering problem
• As long as the table is big enough, a free cell can always be found, but the time to do so can get quite large.
• Worse, even if the table is relatively empty, blocks of occupied cells start forming.
• This effect is known as primary clustering.
• Any key that hashes into the cluster will require several attempts to resolve the collision, and then it will add to the cluster.
Quadratic Probing
• Quadratic probing is a collision resolution method that eliminates the primary clustering problem that occurs in linear probing.
• Compute: hash value = h(x) = x % table size
• When a collision occurs, quadratic probing works as follows:
  (hash value + 1^2) % table size
• If a collision occurs again, probe again with:
  (hash value + 2^2) % table size
• If a collision occurs again, probe again with:
  (hash value + 3^2) % table size
• In general, after the i-th collision:
  h_i(x) = (hash value + i^2) % table size
Example: Insert keys {89, 18, 49, 58, 69, 78} into a hash table of size 10 using quadratic probing.
Solution:
When x = 89:
h(89) = 89 % 10 = 9
Insert key 89 at location 9.
When x = 18:
h(18) = 18 % 10 = 8
Insert key 18 at location 8.
When x = 49:
h(49) = 49 % 10 = 9 (collision)
so probe with i = 1:
h_1(49) = (9 + 1^2) % 10 = 0
Hence insert key 49 at location 0.
When x = 58:
h(58) = 58 % 10 = 8 (collision)
so probe with i = 1:
h_1(58) = (8 + 1^2) % 10 = 9 (collision again)
then probe with i = 2:
h_2(58) = (8 + 2^2) % 10 = 2
Insert key 58 at location 2.
When x = 69:
h(69) = 69 % 10 = 9 (collision)
so probe with i = 1:
h_1(69) = (9 + 1^2) % 10 = 0 (collision again)
then probe with i = 2:
h_2(69) = (9 + 2^2) % 10 = 3
Insert key 69 at location 3.
When x = 78:
h(78) = 78 % 10 = 8 (collision)
so probe with i = 1:
h_1(78) = (8 + 1^2) % 10 = 9 (collision again)
then probe with i = 2:
h_2(78) = (8 + 2^2) % 10 = 2 (collision again)
then probe with i = 3:
h_3(78) = (8 + 3^2) % 10 = 7
Insert key 78 at location 7.
Index   Key
0       49
1
2       58
3       69
4
5
6
7       78
8       18
9       89
Fig. Hash table with keys using quadratic probing
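The quadratic probing example can be checked with a short Python sketch. It assumes integer keys, h(x) = x mod TableSize and f(i) = i^2; the function name `insert_quadratic` and the use of None for an empty cell are illustrative. The probe count is capped at the table size because a quadratic probe sequence is not guaranteed to visit every cell.

```python
def insert_quadratic(table, key):
    """Insert key using quadratic probing, f(i) = i*i. Returns the slot used."""
    size = len(table)
    home = key % size
    for i in range(size):
        pos = (home + i * i) % size    # i = 0 is the home position
        if table[pos] is None:
            table[pos] = key
            return pos
    raise RuntimeError("no free cell found along the probe sequence")


table = [None] * 10
for key in [89, 18, 49, 58, 69, 78]:
    insert_quadratic(table, key)
print(table)
# [49, None, 58, 69, None, None, None, 78, 18, 89]
```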
Quadratic Probing Problem
• Although quadratic probing eliminates primary clustering, elements that hash to the same location will probe the same alternative cells. This is known as secondary clustering.
• In the above example, keys 58 and 78 both follow the probe path 8, 9, 2, 7, …
• Techniques that eliminate secondary clustering are available; the most popular is double hashing.
Double Hashing
• One way to eliminate both types of clustering is double hashing.
• It involves two hash functions, h1(x) and h2(x): the primary hash function h1(x) is used first to determine the position of the key, and if that position is occupied, h2(x) determines the step to the next position to try.
• Example: h1(x) = x % TABLESIZE
  h2(x) = R – (x % R), where R is a prime less than the table size
  The i-th probe position is p_i(x) = (h1(x) + i·h2(x)) % TABLESIZE
• Example: Insert keys {89, 18, 49, 58, 69, 78} into a hash table of size 10 using double hashing, with R = 7.
Solution:
• When x = 89:
  h1(89) = 89 % 10 = 9
  Insert key 89 at location 9.
• When x = 18:
  h1(18) = 18 % 10 = 8
  Insert key 18 at location 8.
• When x = 49:
  h1(49) = 49 % 10 = 9 (collision)
  so compute the first probe position:
  p_1(49) = (9 + 1·(7 – 49 % 7)) % 10
          = (9 + (7 – 0)) % 10
          = 6
  Hence insert key 49 at location 6.
• When x = 58:
  h1(58) = 58 % 10 = 8 (collision)
  so compute the first probe position:
  p_1(58) = (8 + 1·(7 – 58 % 7)) % 10
          = (8 + (7 – 2)) % 10
          = 3
  Insert key 58 at location 3.
Exercise: Compute the locations for keys:
• 69
• 78
Index   Key
0
1
2
3       58
4
5
6       49
7
8       18
9       89
Fig. Hash table with keys using double hashing (before inserting 69 and 78)
Limitation:
Computing the second hash function takes extra time.
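The worked steps can be reproduced in Python. This is a minimal sketch assuming integer keys, h1(x) = x mod TableSize, h2(x) = R - (x mod R) with R = 7 as in the example, and a probe count capped at the table size; the function name `insert_double` and the parameter name `r` are illustrative.

```python
def insert_double(table, key, r=7):
    """Insert key using double hashing. Returns the slot used."""
    size = len(table)
    h1 = key % size
    h2 = r - (key % r)                 # step size, always in 1 .. r
    for i in range(size):
        pos = (h1 + i * h2) % size     # i = 0 is the home position h1
        if table[pos] is None:
            table[pos] = key
            return pos
    raise RuntimeError("no free cell found along the probe sequence")


table = [None] * 10
for key in [89, 18, 49, 58]:
    insert_double(table, key)
print(table)
# [None, None, None, 58, None, None, 49, None, 18, 89]
```

Calling `insert_double(table, 69)` and `insert_double(table, 78)` afterwards is one way to check hand-computed answers for the two remaining keys.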
34

More Related Content

What's hot

Hashing Algorithm
Hashing AlgorithmHashing Algorithm
Hashing AlgorithmHayi Nukman
 
Hashing notes data structures (HASHING AND HASH FUNCTIONS)
Hashing notes data structures (HASHING AND HASH FUNCTIONS)Hashing notes data structures (HASHING AND HASH FUNCTIONS)
Hashing notes data structures (HASHING AND HASH FUNCTIONS)Kuntal Bhowmick
 
Hash table in data structure and algorithm
Hash table in data structure and algorithmHash table in data structure and algorithm
Hash table in data structure and algorithmAamir Sohail
 
Open addressing &amp rehashing,extendable hashing
Open addressing &amp rehashing,extendable hashingOpen addressing &amp rehashing,extendable hashing
Open addressing &amp rehashing,extendable hashingHaripritha
 
Data Structure and Algorithms Hashing
Data Structure and Algorithms HashingData Structure and Algorithms Hashing
Data Structure and Algorithms HashingManishPrajapati78
 
358 33 powerpoint-slides_15-hashing-collision_chapter-15
358 33 powerpoint-slides_15-hashing-collision_chapter-15358 33 powerpoint-slides_15-hashing-collision_chapter-15
358 33 powerpoint-slides_15-hashing-collision_chapter-15sumitbardhan
 
Open Addressing on Hash Tables
Open Addressing on Hash Tables Open Addressing on Hash Tables
Open Addressing on Hash Tables Nifras Ismail
 

What's hot (20)

Hashing
HashingHashing
Hashing
 
Hashing
HashingHashing
Hashing
 
Hashing
HashingHashing
Hashing
 
Hashing
HashingHashing
Hashing
 
Hashing data
Hashing dataHashing data
Hashing data
 
Hashing Algorithm
Hashing AlgorithmHashing Algorithm
Hashing Algorithm
 
linear probing
linear probinglinear probing
linear probing
 
Hashing notes data structures (HASHING AND HASH FUNCTIONS)
Hashing notes data structures (HASHING AND HASH FUNCTIONS)Hashing notes data structures (HASHING AND HASH FUNCTIONS)
Hashing notes data structures (HASHING AND HASH FUNCTIONS)
 
Hash table in data structure and algorithm
Hash table in data structure and algorithmHash table in data structure and algorithm
Hash table in data structure and algorithm
 
08 Hash Tables
08 Hash Tables08 Hash Tables
08 Hash Tables
 
Open addressing &amp rehashing,extendable hashing
Open addressing &amp rehashing,extendable hashingOpen addressing &amp rehashing,extendable hashing
Open addressing &amp rehashing,extendable hashing
 
Hashing PPT
Hashing PPTHashing PPT
Hashing PPT
 
Data Structure and Algorithms Hashing
Data Structure and Algorithms HashingData Structure and Algorithms Hashing
Data Structure and Algorithms Hashing
 
Hashing
HashingHashing
Hashing
 
358 33 powerpoint-slides_15-hashing-collision_chapter-15
358 33 powerpoint-slides_15-hashing-collision_chapter-15358 33 powerpoint-slides_15-hashing-collision_chapter-15
358 33 powerpoint-slides_15-hashing-collision_chapter-15
 
Ch17 Hashing
Ch17 HashingCh17 Hashing
Ch17 Hashing
 
Hashing
HashingHashing
Hashing
 
Open Addressing on Hash Tables
Open Addressing on Hash Tables Open Addressing on Hash Tables
Open Addressing on Hash Tables
 
Hashing
HashingHashing
Hashing
 
Quadratic probing
Quadratic probingQuadratic probing
Quadratic probing
 

Similar to Unit viii searching and hashing

DS Unit 1.pptx
DS Unit 1.pptxDS Unit 1.pptx
DS Unit 1.pptxchin463670
 
Hashing Technique In Data Structures
Hashing Technique In Data StructuresHashing Technique In Data Structures
Hashing Technique In Data StructuresSHAKOOR AB
 
11_hashtable-1.ppt. Data structure algorithm
11_hashtable-1.ppt. Data structure algorithm11_hashtable-1.ppt. Data structure algorithm
11_hashtable-1.ppt. Data structure algorithmfarhankhan89766
 
searching techniques.pptx
searching techniques.pptxsearching techniques.pptx
searching techniques.pptxDr.Shweta
 
Hashing techniques, Hashing function,Collision detection techniques
Hashing techniques, Hashing function,Collision detection techniquesHashing techniques, Hashing function,Collision detection techniques
Hashing techniques, Hashing function,Collision detection techniquesssuserec8a711
 
hashing in data strutures advanced in languae java
hashing in data strutures advanced in languae javahashing in data strutures advanced in languae java
hashing in data strutures advanced in languae javaishasharma835109
 
Presentation.pptx
Presentation.pptxPresentation.pptx
Presentation.pptxAgonySingh
 
Analysis Of Algorithms - Hashing
Analysis Of Algorithms - HashingAnalysis Of Algorithms - Hashing
Analysis Of Algorithms - HashingSam Light
 
Sorting and hashing concepts
Sorting and hashing conceptsSorting and hashing concepts
Sorting and hashing conceptsLJ Projects
 
Sorting and hashing concepts
Sorting and hashing conceptsSorting and hashing concepts
Sorting and hashing conceptsLJ Projects
 
Searching and Sorting Algorithms in Data Structures
Searching and Sorting Algorithms  in Data StructuresSearching and Sorting Algorithms  in Data Structures
Searching and Sorting Algorithms in Data Structurespoongothai11
 
data structures and algorithms Unit 3
data structures and algorithms Unit 3data structures and algorithms Unit 3
data structures and algorithms Unit 3infanciaj
 

Similar to Unit viii searching and hashing (20)

Unit 8 searching and hashing
Unit   8 searching and hashingUnit   8 searching and hashing
Unit 8 searching and hashing
 
DS Unit 1.pptx
DS Unit 1.pptxDS Unit 1.pptx
DS Unit 1.pptx
 
Hashing Technique In Data Structures
Hashing Technique In Data StructuresHashing Technique In Data Structures
Hashing Technique In Data Structures
 
11_hashtable-1.ppt. Data structure algorithm
11_hashtable-1.ppt. Data structure algorithm11_hashtable-1.ppt. Data structure algorithm
11_hashtable-1.ppt. Data structure algorithm
 
Hashing .pptx
Hashing .pptxHashing .pptx
Hashing .pptx
 
Hash function
Hash functionHash function
Hash function
 
Searching
SearchingSearching
Searching
 
searching techniques.pptx
searching techniques.pptxsearching techniques.pptx
searching techniques.pptx
 
Hashing techniques, Hashing function,Collision detection techniques
Hashing techniques, Hashing function,Collision detection techniquesHashing techniques, Hashing function,Collision detection techniques
Hashing techniques, Hashing function,Collision detection techniques
 
hashing in data strutures advanced in languae java
hashing in data strutures advanced in languae javahashing in data strutures advanced in languae java
hashing in data strutures advanced in languae java
 
LECT 10, 11-DSALGO(Hashing).pdf
LECT 10, 11-DSALGO(Hashing).pdfLECT 10, 11-DSALGO(Hashing).pdf
LECT 10, 11-DSALGO(Hashing).pdf
 
Presentation.pptx
Presentation.pptxPresentation.pptx
Presentation.pptx
 
Analysis Of Algorithms - Hashing
Analysis Of Algorithms - HashingAnalysis Of Algorithms - Hashing
Analysis Of Algorithms - Hashing
 
Sorting and hashing concepts
Sorting and hashing conceptsSorting and hashing concepts
Sorting and hashing concepts
 
Sorting and hashing concepts
Sorting and hashing conceptsSorting and hashing concepts
Sorting and hashing concepts
 
Lec4
Lec4Lec4
Lec4
 
Searching and Sorting Algorithms in Data Structures
Searching and Sorting Algorithms  in Data StructuresSearching and Sorting Algorithms  in Data Structures
Searching and Sorting Algorithms in Data Structures
 
9780324782011_PPT_ch09.ppt
9780324782011_PPT_ch09.ppt9780324782011_PPT_ch09.ppt
9780324782011_PPT_ch09.ppt
 
data structures and algorithms Unit 3
data structures and algorithms Unit 3data structures and algorithms Unit 3
data structures and algorithms Unit 3
 
Hash tables
Hash tablesHash tables
Hash tables
 

Recently uploaded

SAMPLING.pptx for analystical chemistry sample techniques
SAMPLING.pptx for analystical chemistry sample techniquesSAMPLING.pptx for analystical chemistry sample techniques
SAMPLING.pptx for analystical chemistry sample techniquesrodneykiptoo8
 
Structures and textures of metamorphic rocks
Structures and textures of metamorphic rocksStructures and textures of metamorphic rocks
Structures and textures of metamorphic rockskumarmathi863
 
A Giant Impact Origin for the First Subduction on Earth
A Giant Impact Origin for the First Subduction on EarthA Giant Impact Origin for the First Subduction on Earth
A Giant Impact Origin for the First Subduction on EarthSérgio Sacani
 
Richard's entangled aventures in wonderland
Richard's entangled aventures in wonderlandRichard's entangled aventures in wonderland
Richard's entangled aventures in wonderlandRichard Gill
 
GBSN - Biochemistry (Unit 5) Chemistry of Lipids
GBSN - Biochemistry (Unit 5) Chemistry of LipidsGBSN - Biochemistry (Unit 5) Chemistry of Lipids
GBSN - Biochemistry (Unit 5) Chemistry of LipidsAreesha Ahmad
 
Climate extremes likely to drive land mammal extinction during next supercont...
Climate extremes likely to drive land mammal extinction during next supercont...Climate extremes likely to drive land mammal extinction during next supercont...
Climate extremes likely to drive land mammal extinction during next supercont...Sérgio Sacani
 
Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...
Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...
Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...Sérgio Sacani
 
Hemoglobin metabolism: C Kalyan & E. Muralinath
Hemoglobin metabolism: C Kalyan & E. MuralinathHemoglobin metabolism: C Kalyan & E. Muralinath
Hemoglobin metabolism: C Kalyan & E. Muralinathmuralinath2
 
Multi-source connectivity as the driver of solar wind variability in the heli...
Multi-source connectivity as the driver of solar wind variability in the heli...Multi-source connectivity as the driver of solar wind variability in the heli...
Multi-source connectivity as the driver of solar wind variability in the heli...Sérgio Sacani
 
Detectability of Solar Panels as a Technosignature
Detectability of Solar Panels as a TechnosignatureDetectability of Solar Panels as a Technosignature
Detectability of Solar Panels as a TechnosignatureSérgio Sacani
 
Exomoons & Exorings with the Habitable Worlds Observatory I: On the Detection...
Exomoons & Exorings with the Habitable Worlds Observatory I: On the Detection...Exomoons & Exorings with the Habitable Worlds Observatory I: On the Detection...
Exomoons & Exorings with the Habitable Worlds Observatory I: On the Detection...Sérgio Sacani
 
Mammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also FunctionsMammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also FunctionsYOGESH DOGRA
 
NuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final versionNuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final versionpablovgd
 
Seminar on Halal AGriculture and Fisheries.pptx
Seminar on Halal AGriculture and Fisheries.pptxSeminar on Halal AGriculture and Fisheries.pptx
Seminar on Halal AGriculture and Fisheries.pptxRUDYLUMAPINET2
 
Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243
Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243
Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243Sérgio Sacani
 
GBSN - Microbiology (Lab 1) Microbiology Lab Safety Procedures
GBSN -  Microbiology (Lab  1) Microbiology Lab Safety ProceduresGBSN -  Microbiology (Lab  1) Microbiology Lab Safety Procedures
GBSN - Microbiology (Lab 1) Microbiology Lab Safety ProceduresAreesha Ahmad
 
word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings o...
word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings o...word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings o...
word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings o...Subhajit Sahu
 
Jet reorientation in central galaxies of clusters and groups: insights from V...
Jet reorientation in central galaxies of clusters and groups: insights from V...Jet reorientation in central galaxies of clusters and groups: insights from V...
Jet reorientation in central galaxies of clusters and groups: insights from V...Sérgio Sacani
 
RNA INTERFERENCE: UNRAVELING GENETIC SILENCING
RNA INTERFERENCE: UNRAVELING GENETIC SILENCINGRNA INTERFERENCE: UNRAVELING GENETIC SILENCING
RNA INTERFERENCE: UNRAVELING GENETIC SILENCINGAADYARAJPANDEY1
 
GBSN - Microbiology (Lab 2) Compound Microscope
GBSN - Microbiology (Lab 2) Compound MicroscopeGBSN - Microbiology (Lab 2) Compound Microscope
GBSN - Microbiology (Lab 2) Compound MicroscopeAreesha Ahmad
 

Recently uploaded (20)

SAMPLING.pptx for analystical chemistry sample techniques
SAMPLING.pptx for analystical chemistry sample techniquesSAMPLING.pptx for analystical chemistry sample techniques
SAMPLING.pptx for analystical chemistry sample techniques
 
Structures and textures of metamorphic rocks
Structures and textures of metamorphic rocksStructures and textures of metamorphic rocks
Structures and textures of metamorphic rocks
 
A Giant Impact Origin for the First Subduction on Earth
A Giant Impact Origin for the First Subduction on EarthA Giant Impact Origin for the First Subduction on Earth
A Giant Impact Origin for the First Subduction on Earth
 
Richard's entangled aventures in wonderland
Richard's entangled aventures in wonderlandRichard's entangled aventures in wonderland
Richard's entangled aventures in wonderland
 
GBSN - Biochemistry (Unit 5) Chemistry of Lipids
GBSN - Biochemistry (Unit 5) Chemistry of LipidsGBSN - Biochemistry (Unit 5) Chemistry of Lipids
GBSN - Biochemistry (Unit 5) Chemistry of Lipids
 
Climate extremes likely to drive land mammal extinction during next supercont...
Climate extremes likely to drive land mammal extinction during next supercont...Climate extremes likely to drive land mammal extinction during next supercont...
Climate extremes likely to drive land mammal extinction during next supercont...
 
Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...
Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...
Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...
 
Hemoglobin metabolism: C Kalyan & E. Muralinath
Hemoglobin metabolism: C Kalyan & E. MuralinathHemoglobin metabolism: C Kalyan & E. Muralinath
Hemoglobin metabolism: C Kalyan & E. Muralinath
 
Multi-source connectivity as the driver of solar wind variability in the heli...
Multi-source connectivity as the driver of solar wind variability in the heli...Multi-source connectivity as the driver of solar wind variability in the heli...
Multi-source connectivity as the driver of solar wind variability in the heli...
 
Detectability of Solar Panels as a Technosignature
Detectability of Solar Panels as a TechnosignatureDetectability of Solar Panels as a Technosignature
Detectability of Solar Panels as a Technosignature
 
Exomoons & Exorings with the Habitable Worlds Observatory I: On the Detection...
Exomoons & Exorings with the Habitable Worlds Observatory I: On the Detection...Exomoons & Exorings with the Habitable Worlds Observatory I: On the Detection...
Exomoons & Exorings with the Habitable Worlds Observatory I: On the Detection...
 
Mammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also FunctionsMammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also Functions
 
NuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final versionNuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final version
 
Seminar on Halal AGriculture and Fisheries.pptx
Seminar on Halal AGriculture and Fisheries.pptxSeminar on Halal AGriculture and Fisheries.pptx
Seminar on Halal AGriculture and Fisheries.pptx
 
Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243
Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243
Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243
 
GBSN - Microbiology (Lab 1) Microbiology Lab Safety Procedures
GBSN -  Microbiology (Lab  1) Microbiology Lab Safety ProceduresGBSN -  Microbiology (Lab  1) Microbiology Lab Safety Procedures
GBSN - Microbiology (Lab 1) Microbiology Lab Safety Procedures
 
word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings o...
word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings o...word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings o...
word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings o...
 
Jet reorientation in central galaxies of clusters and groups: insights from V...
Jet reorientation in central galaxies of clusters and groups: insights from V...Jet reorientation in central galaxies of clusters and groups: insights from V...
Jet reorientation in central galaxies of clusters and groups: insights from V...
 
RNA INTERFERENCE: UNRAVELING GENETIC SILENCING
RNA INTERFERENCE: UNRAVELING GENETIC SILENCINGRNA INTERFERENCE: UNRAVELING GENETIC SILENCING
RNA INTERFERENCE: UNRAVELING GENETIC SILENCING
 
GBSN - Microbiology (Lab 2) Compound Microscope
GBSN - Microbiology (Lab 2) Compound MicroscopeGBSN - Microbiology (Lab 2) Compound Microscope
GBSN - Microbiology (Lab 2) Compound Microscope
 

Unit viii searching and hashing

  • 1. Unit – VIII Searching and Hashing Prepared By: Dabbal Singh Mahara 1
  • 2. 2 • Introduction • Sequential search • Binary search • Comparison and efficiency of searching • Hashing  probing (Linear and Quadratic) Contents
  • 3. 3 Introduction • Searching is a process of finding an element within the list of elements stored in any order or randomly. • Searching is divided into two categories Linear and Binary search. • Linear search  Small arrays  Unsorted arrays • Binary search  Large arrays  Sorted arrays
  • 4. 4 • In linear search, access each element of an array one by one sequentially and see whether it is desired element or not. A search will be unsuccessful if all the elements are accessed and the desired element is not found. • In brief, Simply search for the given element left to right and return the index of the element, if found. Otherwise return “Not Found”. Algorithm: LinearSearch(A, n,key) { for(i=0;i<n;i++) { if(A[i] == key) return i; } return -1; //-1 indicates unsuccessful search } Analysis: Time complexity = O(n) Linear Search
  • 5. Binary Search
      • Binary search is an extremely efficient algorithm for sorted arrays.
      • It finds the given item using very few comparisons, because every comparison discards half of the remaining search area.
      • To use binary search, the array elements must first be sorted (ascending order is assumed here). The logic behind the technique is:
          i. Find the middle element of the array.
          ii. Compare the middle element with the desired item.
          iii. There are three cases:
              a. If the middle element is the desired item, the search is successful.
              b. If the desired item is less than the middle element, search only the first half of the array.
              c. If the desired item is greater than the middle element, search only the second half of the array.
      • Repeat the same process until the element is found or the search area is exhausted.
      • Every step of this algorithm reduces the search area.
  • 6. Iterative Algorithm
          BinarySearch(A, l, r, key)
          {
              while (l <= r)
              {
                  m = (l + r) / 2;      // integer division
                  if (key == A[m])
                      return m;         // search successful
                  else if (key < A[m])
                      r = m - 1;        // continue in the left half
                  else
                      l = m + 1;        // continue in the right half
              }
              return -1;                // l > r: unsuccessful search
          }
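A runnable sketch of the iterative algorithm follows. It assumes the function returns the index of the key (or -1) instead of printing a message, so that the loop stops as soon as the key is found; the name `binary_search` is an illustrative choice. The array is the one used in the tracing slides.

```python
def binary_search(a, key):
    """Iterative binary search on a list sorted in ascending order.

    Returns the index of key, or -1 if key is not present.
    """
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2      # integer division
        if key == a[mid]:
            return mid            # search successful
        if key < a[mid]:
            hi = mid - 1          # continue in the left half
        else:
            lo = mid + 1          # continue in the right half
    return -1                     # lo > hi: unsuccessful search


a = [2, 5, 7, 9, 18, 45, 53, 59, 67, 72, 88, 95, 101, 104]
print(binary_search(a, 67))   # 8
print(binary_search(a, 50))   # -1
```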
  • 7. Algorithm: Recursive
          BinarySearch(A, l, r, key)
          {
              if (l > r)                // empty search area
                  return -1;            // unsuccessful search
              m = (l + r) / 2;          // integer division
              if (key == A[m])
                  return m;             // successful search
              else if (key < A[m])
                  return BinarySearch(A, l, m - 1, key);
              else
                  return BinarySearch(A, m + 1, r, key);
          }
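The recursive formulation can be sketched in Python as follows. This version assumes an empty-range base case (`lo > hi`) and returns an index rather than printing, which keeps the recursion safe when the key is smaller or larger than every element; the function name is illustrative.

```python
def binary_search_recursive(a, lo, hi, key):
    """Recursive binary search on a[lo..hi], sorted in ascending order.

    Returns the index of key, or -1 if key is not in the range.
    """
    if lo > hi:                       # empty search area
        return -1
    mid = (lo + hi) // 2              # integer division
    if key == a[mid]:
        return mid
    if key < a[mid]:
        return binary_search_recursive(a, lo, mid - 1, key)
    return binary_search_recursive(a, mid + 1, hi, key)


a = [2, 5, 7, 9, 18, 45, 53, 59, 67, 72, 88, 95, 101, 104]
print(binary_search_recursive(a, 0, len(a) - 1, 2))     # 0
print(binary_search_recursive(a, 0, len(a) - 1, 250))   # -1
```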
  • 8. Binary Search Example (example tracing)
      • Ex. Binary search for 33 in the sorted array
          6 13 14 25 33 43 51 53 64 72 84 93 95 96 97
        where low and high mark the two ends of the current search range.
  • 9. Running example: take the input array a[]
          index:   0   1   2   3   4   5   6   7   8   9  10  11  12  13
          a[]:     2   5   7   9  18  45  53  59  67  72  88  95 101 104
      Search for key = 2:
          l    r    mid   Remarks
          0    13   6     key < a[6], i.e. 2 < 53
          0    5    2     key < a[2], i.e. 2 < 7
          0    1    0     key == a[0], i.e. 2 == 2
      Therefore, the key is found at index 0. Search successful!
      Exercise: trace the binary search algorithm for the keys: i. 67  ii. 50  iii. 250
  • 10. Search for key = 67
      Input array a[]:
          index:   0   1   2   3   4   5   6   7   8   9  10  11  12  13
          a[]:     2   5   7   9  18  45  53  59  67  72  88  95 101 104
          l    r    mid   Remarks
          0    13   6     key > a[6], i.e. 67 > 53
          7    13   10    key < a[10], i.e. 67 < 88
          7    9    8     key == a[8], i.e. 67 == 67
      Therefore, the key is found at index 8. Search successful!
  • 11. Search for key = 50
      Given input array a[]:
          index:   0   1   2   3   4   5   6   7   8   9  10  11  12  13
          a[]:     2   5   7   9  18  45  53  59  67  72  88  95 101 104
          l    r    mid   Remarks
          0    13   6     key < a[6], i.e. 50 < 53
          0    5    2     key > a[2], i.e. 50 > 7
          3    5    4     key > a[4], i.e. 50 > 18
          5    5    5     key > a[5], i.e. 50 > 45
          6    5    -     l > r, terminate
      Therefore, the key is not found in the array. Search unsuccessful!
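The traces above can be reproduced mechanically. The sketch below records the (l, r, mid) triple of every step; the helper name `trace_binary_search` and its return shape are assumptions made for illustration.

```python
def trace_binary_search(a, key):
    """Return (index, steps); steps lists the (l, r, mid) values of each probe."""
    steps = []
    l, r = 0, len(a) - 1
    while l <= r:
        mid = (l + r) // 2
        steps.append((l, r, mid))
        if key == a[mid]:
            return mid, steps      # found at index mid
        if key < a[mid]:
            r = mid - 1
        else:
            l = mid + 1
    return -1, steps               # l > r: key not in the array


a = [2, 5, 7, 9, 18, 45, 53, 59, 67, 72, 88, 95, 101, 104]
for key in (67, 50, 250):
    index, steps = trace_binary_search(a, key)
    print(f"key={key} index={index}")
    for l, r, mid in steps:
        print(f"  l={l} r={r} mid={mid} a[mid]={a[mid]}")
```

Running it for key 250 gives a way to check a hand trace of the remaining exercise.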
  • 12. Efficiency
      • Each step does a constant amount of work and halves the search area, so the running time of the algorithm is: T(n) = T(n/2) + O(1) = O(log n)
      • Best case: the key is at the middle position and is found on the first comparison, i.e. O(1) time.
      • Worst case: the key is absent, or is found only after the search area has shrunk to a single element, so the running time is O(log n).
      • Average case: the running time is also O(log n).
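The recurrence can be unrolled to see where the logarithm comes from. The derivation below is a sketch that assumes n is a power of two (n = 2^k) and writes the constant work per step as c; both are simplifying assumptions, not part of the slides.

```latex
\begin{aligned}
T(n) &= T(n/2) + c \\
     &= T(n/4) + 2c \\
     &= \dots \\
     &= T(n/2^{k}) + kc \\
     &= T(1) + c \log_2 n \qquad (n = 2^{k}) \\
     &= O(\log n)
\end{aligned}
```

For the 14-element array used in the traces, this predicts at most 4 probes, which matches the unsuccessful search for key 50.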
  • 13. Introduction to Hashing
      • Suppose that we want to store 10,000 student records (each with a 5-digit ID) in a given container.
          - A linked list implementation would take O(n) time per access.
          - A height-balanced tree would give O(log n) access time.
          - An array of size 100,000 indexed directly by ID would give O(1) access time, but would waste a lot of space.
      • Is there some way to get O(1) access without wasting a lot of space?
      • The answer is hashing.
  • 14. Introduction to Hashing
      • Hashing is a technique for performing insertions, deletions and finds in constant average time, O(1).
      • The technique is to compute the location of the desired record from its key, so that it can be retrieved in a single access or comparison.
      • This data structure, however, is not efficient for operations that require ordering information among the elements, such as findMin, findMax and printing the entire table in sorted order.
      Applications
      • Database systems
      • Symbol tables for compilers
      • Data dictionaries
      • Browser caches
  • 15. Hash Table
      • The ideal hash table structure is an array of some fixed size, containing the items.
      • A stored item needs to have a data member, called the key, that is used to compute the index value for the item.
          - The key could be an integer, a string, etc., e.g. a name or ID that is part of a large employee structure.
      • The size of the array is TableSize.
      • The items stored in the hash table are indexed by values from 0 to TableSize – 1.
      • Each key is mapped into some number in the range 0 to TableSize – 1.
      • The mapping is called a hash function.
  • 16. Example Hash Function
      [Figure: the items (key, value) mary 28200, dave 27500, phil 31250 and john 25000 are mapped by a hash function on their keys into a hash table with slots 0 to 9.]
  • 17. Hash Functions (cont’d)
      • A hash function, h, is a function that transforms a key from a set, K, into an index in a table of size n: h: K -> {0, 1, ..., n-2, n-1}
      • A key can be a number, a string, a record, etc.
      • The size of the set of keys, |K|, is usually very large relative to the table size.
      • It is therefore possible for different keys to hash to the same array location. This situation is called a collision, and the colliding keys are called synonyms.
      • A common hash function is h(x) = x mod SIZE.
      • If key = 27 and SIZE = 10, then the hash address = 27 % 10 = 7.
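The modulo hash function and the notion of synonyms can be illustrated with a few lines of Python. The names `hash_mod` and `synonyms` are hypothetical helpers chosen for this sketch, and the sample keys are arbitrary.

```python
def hash_mod(key, size):
    """h(x) = x mod SIZE for a non-negative integer key."""
    return key % size


def synonyms(keys, size):
    """Group keys by hash address; any group with more than one key collides."""
    groups = {}
    for key in keys:
        groups.setdefault(hash_mod(key, size), []).append(key)
    return groups


print(hash_mod(27, 10))                 # 7
print(synonyms([27, 37, 18, 47], 10))   # {7: [27, 37, 47], 8: [18]}
```

Here 27, 37 and 47 are synonyms under SIZE = 10 because they all hash to address 7.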
  • 18. A good hash function should:
      • Minimize collisions.
      • Be easy and quick to compute.
      • Distribute key values evenly in the hash table.
      • Use all the information provided in the key.
  • 19. Load Factor of a Hash Table
      • Load factor of a hash table T: α = n/m
          - n = number of elements stored in the table
          - m = number of slots in the table
      • α encodes the average number of elements stored in a chain.
      • α can be <, = or > 1.
      [Figure: table T with slots 0 to m - 1, each slot pointing to a chain.]
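As a quick numeric illustration of the load factor, the sketch below computes n/m for a few table configurations. The function name `load_factor` and the sample sizes are assumptions for illustration only.

```python
def load_factor(n, m):
    """Load factor = n / m: n stored elements, m slots in the table."""
    if m <= 0:
        raise ValueError("the table must have at least one slot")
    return n / m


print(load_factor(6, 10))    # 0.6: fewer elements than slots
print(load_factor(10, 10))   # 1.0: one element per slot on average
print(load_factor(25, 10))   # 2.5: possible with chaining, chains average 2.5 elements
```

A value above 1 is only possible when several elements can share a slot, as in separate chaining.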
  • 20. Collision Resolution
      • If, when an element is inserted, it hashes to the same value as an already inserted element, then we have a collision and need to resolve it, i.e. for two distinct keys k1 and k2, H(k1) = H(k2) = β.
      • There are several methods for dealing with this:
          - Separate chaining
          - Open addressing
              • Linear probing
              • Quadratic probing
              • Double hashing
  • 21. Separate Chaining
      • The idea is to keep a list of all elements that hash to the same value.
          - The array elements are pointers to the first nodes of the lists.
          - A new item is inserted at the front of its list.
      • Advantages:
          - Better space utilization for large items.
          - Simple collision handling: search the linked list.
          - Overflow: we can store more items than the hash table size.
          - Deletion is quick and easy: delete from the linked list.
  • 22. Example
      Keys: 0, 1, 4, 9, 16, 25, 36, 49, 64, 81
      hash(key) = key % 10
      Resulting table (each slot lists its chain from front to back):
          0: 0
          1: 81 -> 1
          2:
          3:
          4: 64 -> 4
          5: 25
          6: 36 -> 16
          7:
          8:
          9: 49 -> 9
      Exercise: represent the keys {89, 18, 49, 58, 69, 78} in a hash table using separate chaining.
  • 23. Operations
      • Initialization: all entries are set to NULL.
      • Find:
          - Locate the cell using the hash function.
          - Do a sequential search on the linked list in that cell.
      • Insertion:
          - Locate the cell using the hash function.
          - If the item does not already exist, insert it as the first item in the list.
      • Deletion:
          - Locate the cell using the hash function.
          - Delete the item from the linked list.
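The separate chaining operations can be sketched as a small Python class. Two simplifications are assumed here: Python lists stand in for the linked lists, and each slot starts as an empty list rather than NULL. The class name `ChainedHashTable` and its method names are illustrative.

```python
class ChainedHashTable:
    """Separate chaining with hash(key) = key % size (integer keys only)."""

    def __init__(self, size=10):
        self.size = size
        # Assumption: an empty Python list plays the role of a NULL chain.
        self.table = [[] for _ in range(size)]

    def _index(self, key):
        return key % self.size

    def find(self, key):
        # Locate the cell, then search its chain sequentially.
        return key in self.table[self._index(key)]

    def insert(self, key):
        chain = self.table[self._index(key)]
        if key not in chain:
            chain.insert(0, key)   # new items go to the front of the chain

    def delete(self, key):
        chain = self.table[self._index(key)]
        if key in chain:
            chain.remove(key)
            return True
        return False


t = ChainedHashTable(10)
for key in (0, 1, 4, 9, 16, 25, 36, 49, 64, 81):
    t.insert(key)
for slot, chain in enumerate(t.table):
    print(slot, chain)
```

With the keys 0, 1, 4, ..., 81 this reproduces the chains of the example slide (for instance slot 1 holds 81 then 1). Feeding it the exercise keys {89, 18, 49, 58, 69, 78} is one way to check a hand-drawn answer.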
  • 24. Collision Resolution with Open Addressing
      • Separate chaining has the disadvantage of using linked lists.
          - It requires the implementation of a second data structure.
      • In an open addressing hashing system, all the data go inside the table.
          - Thus, a bigger table is needed.
          - Generally the load factor should be below 0.5.
      • If a collision occurs, alternative cells are tried until an empty cell is found.
  • 25. Open Addressing
      • More formally:
          - Cells h_0(x), h_1(x), h_2(x), … are tried in succession, where h_i(x) = (hash(x) + f(i)) mod TableSize, with f(0) = 0.
          - The function f is the collision resolution strategy.
      • There are three common collision resolution strategies:
          - Linear probing
          - Quadratic probing
          - Double hashing
  • 26. Linear Probing
      • In linear probing, collisions are resolved by sequentially scanning the array (with wraparound) until an empty cell is found.
      • h_i(x) = (hash(x) + f(i)) mod TableSize
          - i.e. f is a linear function of i, typically f(i) = i.
      • Example: insert keys {89, 18, 49, 58, 69, 78} with the hash function h(x) = x mod 10 using linear probing. Use table size 10.
          - When x = 89: h(89) = 89 % 10 = 9; insert key 89 at location 9.
          - When x = 18: h(18) = 18 % 10 = 8; insert key 18 at location 8.
  • 27. Linear Probing example (cont’d)
          - When x = 49: h(49) = 49 % 10 = 9 (collision); the next vacant location after 9 is 0 (wraparound), so insert key 49 at location 0.
          - When x = 58: h(58) = 58 % 10 = 8 (collision); the next vacant location after 8 is 1 (since 9 and 0 already contain values), so insert key 58 at location 1.
          - When x = 69: h(69) = 69 % 10 = 9 (collision); the next vacant location after 9 is 2 (since 0 and 1 already contain values), so insert key 69 at location 2.
          - When x = 78: h(78) = 78 % 10 = 8 (collision); the next vacant location is 3 (since 9, 0, 1 and 2 contain values), so insert key 78 at location 3.
      Fig. Hash table with keys using linear probing (- marks an empty slot):
          index:   0   1   2   3   4   5   6   7   8   9
          key:    49  58  69  78   -   -   -   -  18  89
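The linear probing insertions above can be replayed with a short sketch, assuming f(i) = i, integer keys, and `None` as the marker for an empty slot. The function name `insert_linear_probing` is illustrative.

```python
def insert_linear_probing(table, key):
    """Insert key using h_i(x) = (x % size + i) % size; return the slot used."""
    size = len(table)
    home = key % size
    for i in range(size):               # at most one full pass over the table
        slot = (home + i) % size        # wraparound via modulo
        if table[slot] is None:         # None marks an empty cell (assumption)
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")


table = [None] * 10
for key in (89, 18, 49, 58, 69, 78):
    print(key, "->", insert_linear_probing(table, key))
print(table)   # [49, 58, 69, 78, None, None, None, None, 18, 89]
```

The final table shows the cluster forming in slots 8, 9, 0, 1, 2, 3: every later key that hashes to 8 or 9 has to walk past the whole block.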
  • 28. Disadvantage of linear probing: the primary clustering problem
      • As long as the table is big enough, a free cell can always be found, but the time to do so can get quite large.
      • Worse, even if the table is relatively empty, blocks of occupied cells start forming.
      • This effect is known as primary clustering.
      • Any key that hashes into the cluster will require several attempts to resolve the collision, and then it will add to the cluster.
  • 29. Quadratic Probing
      • Quadratic probing is a collision resolution method that eliminates the primary clustering problem of linear probing.
      • Compute: hash value = h(x) = x % table size
      • When a collision occurs, quadratic probing works as follows: try (hash value + 1^2) % table size.
      • If a collision occurs again, rehash: (hash value + 2^2) % table size.
      • If a collision occurs again, rehash: (hash value + 3^2) % table size.
      • In general, on the i-th collision: h_i(x) = (hash value + i^2) % table size
  • 30. Example: insert keys {89, 18, 49, 58, 69, 78} into a hash table of size 10 using quadratic probing.
      Solution:
          - When x = 89: h(89) = 89 % 10 = 9; insert key 89 at location 9.
          - When x = 18: h(18) = 18 % 10 = 8; insert key 18 at location 8.
          - When x = 49: h(49) = 49 % 10 = 9 (collision), so compute h_1(49) = (9 + 1^2) % 10 = 0; insert key 49 at location 0.
          - When x = 58: h(58) = 58 % 10 = 8 (collision), so compute h_1(58) = (8 + 1^2) % 10 = 9; a collision occurs again, so compute h_2(58) = (8 + 2^2) % 10 = 2; insert key 58 at location 2.
      Fig. Hash table with keys using quadratic probing, after all six insertions (- marks an empty slot):
          index:   0   1   2   3   4   5   6   7   8   9
          key:    49   -  58  69   -   -   -  78  18  89
  • 31. Quadratic Probing example (cont’d)
          - When x = 69: h(69) = 69 % 10 = 9 (collision), so compute h_1(69) = (9 + 1^2) % 10 = 0; a collision occurs again, so compute h_2(69) = (9 + 2^2) % 10 = 3; insert key 69 at location 3.
          - When x = 78: h(78) = 78 % 10 = 8 (collision), so compute h_1(78) = (8 + 1^2) % 10 = 9; a collision occurs again, so compute h_2(78) = (8 + 2^2) % 10 = 2; a collision occurs again, so compute h_3(78) = (8 + 3^2) % 10 = 7; insert key 78 at location 7.
      Quadratic Probing Problem
      • Although quadratic probing eliminates primary clustering, elements that hash to the same location will probe the same alternative cells. This is known as secondary clustering.
      • In the above example, keys 58 and 78 both follow the path 8, 9, 2, 7, … (58 stops at 2; 78 continues to 7).
      • Techniques that eliminate secondary clustering are available; the most popular is double hashing.
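The quadratic probing example can be checked with the following sketch, which assumes f(i) = i^2, integer keys, and `None` for an empty slot. The function name and the cap of `size` probes are illustrative choices; quadratic probing is not guaranteed to visit every slot, so the sketch gives up after that many attempts.

```python
def insert_quadratic_probing(table, key):
    """Insert key using h_i(x) = (x % size + i*i) % size; return the slot used."""
    size = len(table)
    home = key % size
    for i in range(size):                 # assumption: give up after size probes
        slot = (home + i * i) % size
        if table[slot] is None:           # None marks an empty cell (assumption)
            table[slot] = key
            return slot
    raise RuntimeError("no free slot found along the probe sequence")


table = [None] * 10
for key in (89, 18, 49, 58, 69, 78):
    print(key, "->", insert_quadratic_probing(table, key))
print(table)   # [49, None, 58, 69, None, None, None, 78, 18, 89]
```

Keys 58 and 78 both start at slot 8 and visit 9 and then 2, which is the secondary clustering described in the slide.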
  • 32. Double Hashing
      • One way to eliminate both types of clustering is double hashing.
      • It involves two hash functions, h1(x) and h2(x). The primary hash function h1(x) is used first to determine the position of the key; if that position is occupied, h2(x) is used to compute the step to the next position.
      • Example:
          h1(x) = x % TABLESIZE
          h2(x) = R – (x % R), where R is a prime less than the table size
          h_i(x) = (h1(x) + i · h2(x)) % TABLESIZE
      • Example: insert keys {89, 18, 49, 58, 69, 78} into a hash table of size 10 using double hashing (with R = 7).
      Solution:
          - When x = 89: h1(89) = 89 % 10 = 9; insert key 89 at location 9.
          - When x = 18: h1(18) = 18 % 10 = 8; insert key 18 at location 8.
  • 33. Double Hashing example (cont’d)
          - When x = 49: h1(49) = 49 % 10 = 9 (collision), so the first probe is (9 + 1 · (7 – 49 % 7)) % 10 = (9 + (7 – 0)) % 10 = 6; insert key 49 at location 6.
          - When x = 58: h1(58) = 58 % 10 = 8 (collision), so the first probe is (8 + 1 · (7 – 58 % 7)) % 10 = (8 + (7 – 2)) % 10 = 3; insert key 58 at location 3.
      Exercise: compute the locations for keys 69 and 78.
      Fig. Hash table with keys using double hashing, before inserting 69 and 78 (- marks an empty slot):
          index:   0   1   2   3   4   5   6   7   8   9
          key:     -   -   -  58   -   -  49   -  18  89
      Limitation: computing the second hash function takes extra time.
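A sketch of double hashing with h1(x) = x % size and h2(x) = R - (x % R) follows, using R = 7 as in the worked steps. The function name, the `r` parameter and the use of `None` for empty slots are assumptions for illustration. Running it also gives the slots for 69 and 78, so it can serve as a check on the exercise.

```python
def insert_double_hashing(table, key, r=7):
    """Insert key using h_i(x) = (h1(x) + i * h2(x)) % size; return the slot used.

    h1(x) = x % size, h2(x) = r - (x % r), with r a prime below the table size.
    """
    size = len(table)
    h1 = key % size
    h2 = r - (key % r)                    # always in 1..r, never 0
    for i in range(size):                 # assumption: give up after size probes
        slot = (h1 + i * h2) % size
        if table[slot] is None:           # None marks an empty cell (assumption)
            table[slot] = key
            return slot
    raise RuntimeError("no free slot found along the probe sequence")


table = [None] * 10
for key in (89, 18, 49, 58):
    print(key, "->", insert_double_hashing(table, key))
print(table)   # [None, None, None, 58, None, None, 49, None, 18, 89]

for key in (69, 78):                      # the two exercise keys
    print(key, "->", insert_double_hashing(table, key))
```

Unlike quadratic probing, keys 58 and 78 share the home slot 8 but use different step sizes (5 and 6), so they follow different probe paths.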