-V.Dinesh
III/IV CSE-B
ANITS
Hashing
Outline
• 11.1 Introduction
– 11.1.1 What is Hashing
– 11.1.2 Collisions
• 11.2 A Simple Hashing Algorithm
• 11.3 Hashing Functions and Record Distributions
– 11.3.1 Distributing Records among Addresses
– 11.3.2 Some Other Hashing Methods
– 11.3.3 Predicting the Distribution of Records
– 11.3.4 Predicting Collisions for a Full File
11.1.1 What is Hashing
• Hashing transforms a key, such as a string of characters, into a
usually shorter, fixed-length value that represents the original
key.
• Hashing is used to index and retrieve items in a database.
• It is fast: when collisions are rare, searching for an element
takes O(1) time on average.
How is hashing done?
• It uses a hash function, which
– takes a key (K) as its argument, and
– returns an address (the home address).
• Hash table
– It is a data structure similar to an array.
– The key is placed at its home address in the hash table.
Hash Function
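/* Multiply the codes of the first two characters and keep the last
   three decimal digits. Assumes the key has at least two characters. */
int MyHash(const char *key)
{
    return (key[0] * key[1]) % 1000;
}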
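Searching, Insertion and Deletion
Let us consider a hash table H[1000]. Here we assume it is declared
as char *H[1000], with NULL marking an empty slot.
• Inserting:
– H[MyHash(key)] = key;
• Deleting:
– H[MyHash(key)] = NULL;
• Searching:
– compare key with the key stored at H[MyHash(key)].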
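The three operations above can be sketched in C as follows. This is only a minimal illustration, not the author's implementation: the helper names insert_key, search_key and delete_key are hypothetical, the table is assumed to hold pointers to keys with NULL marking an empty slot, and a slot that is already occupied is simply reported rather than resolved.

```c
#include <stddef.h>
#include <string.h>

#define TABLE_SIZE 1000

/* Table of key pointers; NULL marks an empty slot (assumption). */
static const char *H[TABLE_SIZE];

/* The slide's hash: product of the first two characters, mod 1000.
   Assumes the key has at least two characters. */
static int MyHash(const char *key)
{
    return (key[0] * key[1]) % TABLE_SIZE;
}

/* Store key at its home address. Returns 0 on success, or -1 if the
   slot is already occupied (this sketch does not resolve that case). */
static int insert_key(const char *key)
{
    int a = MyHash(key);
    if (H[a] != NULL)
        return -1;
    H[a] = key;
    return 0;
}

/* Return 1 if key is stored at its home address, otherwise 0. */
static int search_key(const char *key)
{
    int a = MyHash(key);
    return H[a] != NULL && strcmp(H[a], key) == 0;
}

/* Empty the slot only if it really holds this key. */
static void delete_key(const char *key)
{
    if (search_key(key))
        H[MyHash(key)] = NULL;
}
```

Comparing the stored key during a search matters because two different keys can share a home address; indexing the table alone cannot tell them apart.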
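Insertion
[Figure: the key “DINESH” is passed to MyHash(Key), and DINESH is
stored in the table slot at its home address, 964
(68 × 73 = 4964, and 4964 % 1000 = 964).]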
Name     Home Address
DINESH   964
BALL     290
TREE     888
11.1.2 Collisions
• Now consider the key “IDIOT”. Its home address is 964
(73 × 68 = 4964), the same as the home address of “DINESH”.
• When two different keys hash to the same address, the result is
called a collision.
• Keys that hash to the same address are called synonyms.
– E.g., here the keys “DINESH” and “IDIOT” are synonyms.
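As a small check of the definition above, the sketch below tests whether two keys are synonyms under the slide's MyHash function. The helper name are_synonyms is hypothetical and is introduced only for illustration.

```c
#include <string.h>

/* The slide's hash: product of the first two characters, mod 1000. */
static int MyHash(const char *key)
{
    return (key[0] * key[1]) % 1000;
}

/* Two distinct keys are synonyms when they share a home address. */
static int are_synonyms(const char *k1, const char *k2)
{
    return strcmp(k1, k2) != 0 && MyHash(k1) == MyHash(k2);
}
```

With this function, are_synonyms("DINESH", "IDIOT") returns 1, because both keys start with the letters D and I in some order and the product of the two character codes is the same. are_synonyms("DINESH", "BALL") returns 0.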
Collisions (contd.)
• Collisions cause problems because we cannot store more than one
key at one address.
• Ideally, we would design a hashing algorithm that produces no
collisions at all.
• Such an algorithm is called a perfect hashing algorithm.
In practice, such an algorithm is very hard to find.
Avoidance of collisions
• Collisions can be reduced in three ways:
– Spread out the records.
– Use extra memory.
– Put more than one record at a single address.
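The third option, putting more than one record at a single address, is often called a bucket. The sketch below is one possible illustration and not something the slides specify: the bucket capacity of 3, the struct layout and the name bucket_insert are all assumptions.

```c
#include <stddef.h>

#define NUM_ADDRESSES 1000
#define BUCKET_SIZE   3   /* hypothetical capacity per address */

/* Each address holds a small bucket of keys instead of a single key. */
struct bucket {
    const char *keys[BUCKET_SIZE];
    int count;
};

static struct bucket table[NUM_ADDRESSES];

/* The slide's hash: product of the first two characters, mod 1000. */
static int MyHash(const char *key)
{
    return (key[0] * key[1]) % NUM_ADDRESSES;
}

/* Add key to the bucket at its home address. Returns the home address
   on success, or -1 if that bucket is already full. */
static int bucket_insert(const char *key)
{
    int a = MyHash(key);
    struct bucket *b = &table[a];
    if (b->count == BUCKET_SIZE)
        return -1;
    b->keys[b->count++] = key;
    return a;
}
```

With buckets, the synonyms “DINESH” and “IDIOT” can both be stored at address 964; an overflow occurs only once a bucket fills up.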
11.2 A Simple Hashing Algorithm
• It consists of three steps.
– Step 1: Represent the key in numerical form.
– Step 2: Fold and add.
– Step 3: Divide by the size of the address space and use the
remainder as the address.
Represent the key in a Numerical Form
• If the key is already a number, we can skip this step.
• If it is a string, use the ASCII code of each character.
– Let us consider the key “LOWELL”, padded with blanks to 12 characters.
L  O  W  E  L  L  (six blank spaces)
76 79 87 69 76 76 32 32 32 32 32 32
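Step 1 can be sketched as a small padding routine. This is an illustrative helper, not part of the slides: the name pad_key and the choice to truncate longer keys are assumptions. The padded buffer then holds the character codes shown in the table above.

```c
#include <string.h>

#define KEY_LEN 12

/* Copy key into out and pad on the right with blanks (ASCII 32) so that
   exactly KEY_LEN characters are filled. Longer keys are truncated
   (an assumption). The buffer is not NUL-terminated. */
static void pad_key(const char *key, char out[KEY_LEN])
{
    size_t n = strlen(key);
    if (n > KEY_LEN)
        n = KEY_LEN;
    memcpy(out, key, n);
    memset(out + n, ' ', KEY_LEN - n);
}
```

After pad_key("LOWELL", buf), the elements of buf read as integers are 76 79 87 69 76 76 followed by six 32s.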
2. Fold and Add
• Folding means chopping the number into pieces; the pieces are then
added together.
L O W E L L
7679 | 8769 | 7676 | 3232 | 3232 | 3232
Each piece combines the codes of two adjacent characters. The pieces
are added together in the next substep.
Fold and add (contd.)
• While adding, we have to make sure that the sum does not go
beyond the range of the data type.
• Let the limit be 32767 (the largest int on a 16-bit compiler). The
sum must never exceed this limit.
• So after each addition, take the sum modulo a prime number such as
19937. Why 19937? The largest piece for an uppercase key is 9090
(“ZZ”), and 19936 + 9090 = 29026, which is still below 32767.
Adding…
L O W E L L
7679 | 8769 | 7676 | 3232 | 3232 | 3232
7679 + 8769 = 16448 → 16448 % 19937 = 16448
16448 + 7676 = 24124 → 24124 % 19937 = 4187
4187 + 3232 = 7419 → 7419 % 19937 = 7419
7419 + 3232 = 10651 → 10651 % 19937 = 10651
10651 + 3232 = 13883 → 13883 % 19937 = 13883
Finally, sum = 13883.
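The running sums can be reproduced with a short routine that records every intermediate value. The name fold_trace and the trace array are hypothetical; the key is assumed to be already padded with blanks to 12 characters.

```c
#define KEY_LEN 12
#define MAX_INTERMEDIATE 19937

/* Fold a 12-character, blank-padded key into two-character pieces and
   add them, reducing modulo 19937 after every addition. The running
   sum after each piece is stored in trace. Returns the final sum. */
static int fold_trace(const char key[KEY_LEN], int trace[KEY_LEN / 2])
{
    int sum = 0;
    for (int j = 0; j < KEY_LEN; j += 2) {
        int piece = 100 * key[j] + key[j + 1];   /* 'L','O' gives 7679 */
        sum = (sum + piece) % MAX_INTERMEDIATE;
        trace[j / 2] = sum;
    }
    return sum;
}
```

For the key "LOWELL      " (six trailing blanks), the trace is 7679, 16448, 4187, 7419, 10651, 13883, and the function returns 13883.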
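Divide by the size of the address space
• a = s mod n
– where a = home address,
– s = sum from step 2,
– n = number of addresses in the file.
• The divisor does not have to be exactly the number of addresses.
A prime number close to n is usually chosen, because dividing by a
prime tends to spread the remainders more evenly.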
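Hash Function
/* key holds exactly 12 characters, padded on the right with blanks. */
int Hash(const char key[12], int maxAddress)
{
    int sum = 0;
    /* Fold two characters at a time; keep the running sum below 19937. */
    for (int j = 0; j < 12; j += 2)
        sum = (sum + 100 * key[j] + key[j + 1]) % 19937;
    return sum % maxAddress;   /* home address in 0..maxAddress-1 */
}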
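A usage sketch for the function above, putting the three steps together. The wrapper hash_string is hypothetical and pads shorter keys with blanks before hashing; the address count of 101 used in the note below is only an illustrative prime, not a value taken from the slides.

```c
/* The three-step hash from the slide, repeated so this sketch is
   self-contained. key must hold exactly 12 blank-padded characters. */
static int Hash(const char key[12], int maxAddress)
{
    int sum = 0;
    for (int j = 0; j < 12; j += 2)
        sum = (sum + 100 * key[j] + key[j + 1]) % 19937;
    return sum % maxAddress;
}

/* Pad an ordinary C string with blanks to 12 characters, then hash it.
   Keys longer than 12 characters are truncated (an assumption). */
static int hash_string(const char *key, int maxAddress)
{
    char padded[12];
    int i = 0;
    while (i < 12 && key[i] != '\0') {
        padded[i] = key[i];
        i++;
    }
    while (i < 12)
        padded[i++] = ' ';
    return Hash(padded, maxAddress);
}
```

With 101 addresses, hash_string("LOWELL", 101) returns 46, since the folded sum is 13883 and 13883 mod 101 = 46.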
11.3.1 Distributing Records among Addresses
[Figure: three panels, each mapping records A to G onto addresses 1 to 10.
Panel titles: Uniform distribution, All Synonyms, A few Synonyms.]
11.3.2 Some other Hashing Methods
• Examine keys for a pattern
• Fold parts of the key
• Divide the key by a number
• Square the key and take the middle digits
• Radix transformation
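As an illustration of one method from the list, here is a minimal sketch of "square the key and take the middle" for numeric keys. The function name mid_square, the parameter digits (how many middle decimal digits to keep) and the rounding rule when the middle cannot be centred exactly are all assumptions; the key is assumed small enough that its square fits in 64 bits.

```c
/* Square the key and return the middle `digits` decimal digits of the
   square. */
static unsigned long mid_square(unsigned long key, int digits)
{
    unsigned long long sq = (unsigned long long)key * key;

    /* Count the decimal digits of the square. */
    int len = 0;
    for (unsigned long long t = sq; t > 0; t /= 10)
        len++;

    /* Drop the low-order digits to the right of the middle field. */
    int drop = (len - digits) / 2;
    for (int i = 0; i < drop; i++)
        sq /= 10;

    /* Keep only `digits` digits. */
    unsigned long long mod = 1;
    for (int i = 0; i < digits; i++)
        mod *= 10;
    return (unsigned long)(sq % mod);
}
```

For example, mid_square(453, 2) returns 52, because 453 squared is 205209 and its middle two digits are 52. The result would still be reduced modulo the number of addresses to obtain a home address.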
11.3.3 Predicting the Distribution of Records
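• We cannot know in advance exactly how the records will be
distributed, but for a random hash function we can predict the
distribution.
• One prediction method uses the Poisson distribution. Let r be the
number of records and N the number of addresses. For a given address,
let B mean “a record hashes to this address” (probability b) and A
mean “it does not” (probability a).
– p(one particular sequence of x B’s and (r-x) A’s) = a^(r-x) * b^x
– The number of ways that x B’s and (r-x) A’s can be arranged is
C = r! / ((r-x)! * x!)
• Substituting a = 1 - 1/N and b = 1/N in p(x) = C * a^(r-x) * b^x,
and approximating for large N and r, gives the Poisson function
p(x) = ((r/N)^x * e^(-r/N)) / x!
• In general, if there are N addresses, then the expected number of
addresses with x records assigned to them is
N * p(x)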
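Written out in one line, the step from the exact (binomial) probability to the Poisson approximation looks as follows. The approximation is a standard result for large N and r; the symbols are the ones used on the slide.

```latex
p(x) \;=\; \binom{r}{x}\left(1-\frac{1}{N}\right)^{r-x}\left(\frac{1}{N}\right)^{x}
\;\approx\; \frac{(r/N)^{x}\, e^{-r/N}}{x!}
```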
11.3.4 Predicting Collisions for a Full File
• Let a file contain 10000 records in 10000 addresses.
• Here r = 10000 and N = 10000, so r/N = 1.
Substituting in p(x) = ((r/N)^x * e^(-r/N)) / x!:
p(0) = (1^0 * e^-1) / 0! = 0.3679
The number of addresses with no records assigned is
N * p(0) = 10000 * 0.3679 = 3679
• Similarly, the numbers of addresses with one, two, and three records
assigned are, respectively,
• 10000 * p(1) = 3679
• 10000 * p(2) = 1839
• 10000 * p(3) = 613
• So, if each address can hold only one record, the 1839 addresses
with two records produce 1839 overflow records, and the 613 addresses
with three records produce 2 * 613 = 1226 overflow records.
