Data Structure
Unit-I Part C
Hashing List Searches
Basic Concepts
 In a hashed search, the key, through an algorithmic function, determines the location of
the data.
 We use a hashing algorithm to transform the key into the index that contains the data we
need to locate.
 Another way to describe hashing is as a key-to-address transformation in which the keys
map to addresses in a list.
 Hashing is a key-to-address mapping process.
 The address produced by the hashing algorithm is known as the home address.
 We call the set of keys that hash to the same location in our list synonyms.
 A collision occurs when a hashing algorithm produces an address for an
insertion key and that address is already occupied.
 The memory that contains all of the home addresses is known as the prime
area.
 Each calculation of an address and test for success is known as a probe.
Hashing Methods
There are eight hashing methods:
 Direct method
 Subtraction method
 Modulo-division
 Midsquare
 Digit extraction
 Rotation
 Folding
 Pseudorandom generation
Direct Method:
 In direct hashing the key is the address without any algorithmic manipulation.
 Direct hashing is limited, because the list must contain an element for every possible key, but it can be very powerful because it guarantees that there are no synonyms and therefore no collisions.
Subtraction Method
 Sometimes keys are consecutive but do not start from 1.
 Example:
 A company may have only 100 employees, but the employee numbers start from
1001 and go to 1100.
 In this case we use subtraction hashing, a very simple hashing function that
subtracts 1000 from the key to determine the address.
 The direct and subtraction hash functions both guarantee a search effort of one
with no collisions.
 They are one-to-one hashing methods: only one key hashes to each address.
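The two one-to-one methods can be sketched in a few lines. This is a minimal illustration, not code from the slides; the function names and the default base of 1000 (taken from the employee-number example) are assumptions.

```python
def direct_hash(key: int) -> int:
    """Direct hashing: the key itself is the address."""
    return key


def subtraction_hash(key: int, base: int = 1000) -> int:
    """Subtraction hashing: subtract a fixed base from the key."""
    return key - base


# Employee numbers 1001..1100 map one-to-one onto addresses 1..100.
addresses = [subtraction_hash(k) for k in range(1001, 1101)]
print(addresses[0], addresses[-1])     # 1 100
print(len(set(addresses)) == 100)      # True: no two keys share an address
```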
Modulo-Division Method:
 Also known as division remainder, the modulo-division method divides the key by
the array size and uses the remainder for the address.
 This method gives us the simple hashing algorithm shown below in which listSize is
the number of elements in the array:
 Address = key MODULO listSize
 Example:
 Given a list size of 19 and addresses numbered from 1, we add 1 to the remainder:
 Keys are: 137456, 214562, 140145
 137456 % 19 + 1 = 11
 214562 % 19 + 1 = 15
 140145 % 19 + 1 = 2
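The example above can be checked with a short sketch. The function name is hypothetical, and the `+ 1` follows the worked example, which numbers addresses from 1 rather than 0.

```python
def modulo_division_hash(key: int, list_size: int) -> int:
    """Divide the key by the list size and use the remainder, offset by 1."""
    return key % list_size + 1


for key in (137456, 214562, 140145):
    print(key, "->", modulo_division_hash(key, 19))
# 137456 -> 11
# 214562 -> 15
# 140145 -> 2
```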
Digit-Extraction Method:
 In digit extraction, selected digits are extracted from the key and used as the address.
 Example:
 Using our six-digit employee number to hash to a three-digit address (000-999),
we could select the first, third, and fourth digits (from the left) and use them as the address.
 379452 -> 394
 121267 -> 112
 378845 -> 388
 160252 -> 102
 045128 -> 051
Mid Square Method
 In mid square hashing the key is squared and the address is selected from the
middle of the square number.
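 The limitation is the size of the key: the square of a large key may not fit in the available word size, so only part of the key may be squared.
 Example:
 9452^2 = 89340304: address is 3403
 Squaring only the first three digits of each six-digit key and taking the third, fourth, and fifth digits of the product:
 379452: 379 * 379 = 143641 -> 364
 121267: 121 * 121 = 014641 -> 464
 378845: 378 * 378 = 142884 -> 288
 160252: 160 * 160 = 025600 -> 560
 045128: 045 * 045 = 002025 -> 202
 The same digits must be selected from every product.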
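The six-digit examples can be reproduced with the sketch below. It assumes, as the examples appear to, that only the first three digits of the key are squared, that the product is padded to six digits, and that the third to fifth digits form the address. The function name is hypothetical.

```python
def midsquare_hash(key: str) -> str:
    """Square the first three digits and keep digits 3-5 of the product."""
    part = int(key[:3])                  # e.g. "379452" -> 379
    square = str(part * part).zfill(6)   # pad so positions line up: 014641
    return square[2:5]                   # third, fourth, and fifth digits


for key in ("379452", "121267", "378845", "160252", "045128"):
    print(key, "->", midsquare_hash(key))
# 379452 -> 364
# 121267 -> 464
# 378845 -> 288
# 160252 -> 560
# 045128 -> 202
```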
Folding Method
Two folding methods are used:
 Fold shift
 Fold boundary
Fold Shift
 In fold shift the key value is divided into parts whose size matches the size of the
required address.
 Then the left and right parts are shifted and added with the middle part.
Fold boundary
 In fold boundary the left and right numbers are folded on a fixed boundary between them
and the center number.
 The two outside values are thus reversed.
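Both folding variants can be sketched as follows. The key 123456789 and the three-digit address size are illustrative choices, not values from the slides, and dropping any carry beyond the address size is an assumption of this sketch.

```python
def _parts(key: str, size: int):
    """Split the key into consecutive pieces of the address size."""
    return [key[i:i + size] for i in range(0, len(key), size)]


def fold_shift_hash(key: str, size: int = 3) -> int:
    """Add the parts as they are; drop any carry beyond the address size."""
    return sum(int(p) for p in _parts(key, size)) % 10 ** size


def fold_boundary_hash(key: str, size: int = 3) -> int:
    """Reverse the two outside parts, then add; drop any carry."""
    parts = _parts(key, size)
    parts[0] = parts[0][::-1]      # left part folded over its boundary
    parts[-1] = parts[-1][::-1]    # right part folded over its boundary
    return sum(int(p) for p in parts) % 10 ** size


print(fold_shift_hash("123456789"))      # 123 + 456 + 789 = 1368 -> 368
print(fold_boundary_hash("123456789"))   # 321 + 456 + 987 = 1764 -> 764
```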
Rotation Method
 Rotation method is generally not used by itself but rather is incorporated in
combination with other hashing methods.
 It is most useful when keys are assigned serially.
 A simple hashing algorithm tends to create synonyms when hashing keys that are identical except for the last character.
 Rotating the last character to the front of the key minimizes this effect.
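A small sketch of the rotation step. The serial keys 600101 to 600103 are made up for illustration; after rotation they differ in the leading digit, so a later step such as folding sees more varied input.

```python
def rotate_last_to_front(key: str) -> str:
    """Move the last character of the key to the front."""
    return key[-1] + key[:-1]


for key in ("600101", "600102", "600103"):
    print(key, "->", rotate_last_to_front(key))
# 600101 -> 160010
# 600102 -> 260010
# 600103 -> 360010
```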
Pseudorandom method
 A common random-number generator is shown below.
y = ax + c
 To use the pseudorandom-number generator as a hashing method, we set x to the
key, multiply it by the coefficient a, and then add the constant c.
 The result is then divided by the list size, with the remainder being the hashed
address.
Example:
y = ((17 * 121267) + 7) modulo 307
y = (2061539 + 7) modulo 307
y = 2061546 modulo 307
y = 41
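The calculation above can be verified with a short sketch. The function name is hypothetical; the coefficient 17, constant 7, and list size 307 come from the example.

```python
def pseudorandom_hash(key: int, list_size: int, a: int = 17, c: int = 7) -> int:
    """Compute y = (a * key + c) modulo list_size."""
    return (a * key + c) % list_size


print(pseudorandom_hash(121267, 307))   # 41
```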
Hashing algorithm
 Although the hashing methods may work well when we hash a key to an address in an array, hashing to large files is generally more complex.
 Suppose we have an alphanumeric key consisting of up to 30 bytes that we need to hash into a 32-bit address.
 Step 1: Convert the alphanumeric key into a number by adding the American Standard Code for Information Interchange (ASCII) value of each character to an accumulator that will become the address.
 Step 2: As each character is added, we rotate the bits in the address to maximize the
distribution of the values.
 Step 3: After the characters in the key have been completely hashed, we take the
absolute value of the address and then map it into the address range for the file.
Analysis
First:
 The rotation can often be accomplished by an assembly language instruction.
 If the algorithm is written in a high-level language, then the rotation is accomplished by a series of bitwise operations.
 For our purposes, it is sufficient that the 12 bits at the end of the address are shifted to be the 12 bits at the beginning of the address and the bits at the beginning are shifted to occupy the bit locations at the right.
Second:
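 This algorithm actually uses three of the hashing methods.
 First, we use fold shift when we add the individual characters to the accumulator.
 Second, we use rotation when we rotate the address after each addition.
 Finally, we use modulo division when we map the hashed address into the range of available addresses.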
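The three steps can be sketched as below. This is one possible reading of the algorithm, not the original code: the function names, the 0-based result range, and the test key are assumptions. The 12-bit rotation follows the description in the analysis.

```python
MASK32 = 0xFFFFFFFF  # keep the accumulator within 32 bits


def rotate_right_12(value: int) -> int:
    """Move the 12 low-order bits of a 32-bit value to the high-order end."""
    return ((value >> 12) | (value << 20)) & MASK32


def hash_alphanumeric_key(key: str, max_addr: int) -> int:
    """Hash an ASCII key into the range 0..max_addr-1."""
    addr = 0
    for ch in key:
        addr = (addr + ord(ch)) & MASK32   # Step 1: add the character code
        addr = rotate_right_12(addr)       # Step 2: rotate the bits
    # Step 3: the masked value is never negative, so no absolute value is
    # needed here; with signed 32-bit integers it would be.
    return addr % max_addr


print(hex(rotate_right_12(0x00000ABC)))    # 0xabc00000
print(hash_alphanumeric_key("Harry Eagle", 307))
```

Masking to 32 bits stands in for the fixed-width integer a language such as C would provide.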
Collision Resolution
 With the exception of the direct and subtraction methods, none of the hashing methods is a one-to-one mapping.
 Thus, when we hash a new key to an address, we may create a collision.
 A collision occurs when a hashing algorithm produces an address for an insertion
key and that address is already occupied.
 There are several methods for handling collisions, each of them independent of
the hashing algorithm.
Concepts
 The load factor of a hashed list is the number of filled elements in the list divided by the number of physical elements allocated for the list, expressed as a percentage.
 Traditionally, load factor is assigned the symbol alpha (α).
 The formula, in which k represents the number of filled elements in the list and n represents the total number of elements allocated to the list, is
 α = (k / n) * 100
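As a quick illustration of the formula, the sketch below computes the load factor for a list with 75 of 100 elements filled. The numbers and the function name are made up for the example.

```python
def load_factor(filled: int, allocated: int) -> float:
    """Return the load factor as a percentage: (k / n) * 100."""
    return filled / allocated * 100


print(load_factor(75, 100))   # 75.0
```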
Computer scientists have identified two distinct types of clusters.
 (i) Primary clustering occurs when data cluster around a home address.
Primary clustering is easy to identify.
 (ii) Secondary clustering occurs when data become grouped along a collision path throughout a list. This type of clustering is not easy to identify.
 There are two different approaches to resolving collisions:
 Open addressing
 Linked lists.
Open Addressing
 The first collision resolution method, open addressing, resolves collisions in the prime area, that is, the area that contains all of the home addresses.
 When a collision occurs, the prime area addresses are searched for an open or unoccupied element where the new data can be placed.
Linear Probe
 In a linear probe, which is the simplest method, when data cannot be stored in the home address we resolve the collision by adding 1 to the current address.
 If the new address is also filled, we add another 1 to the address, repeating until we find an empty location.
 Linear probes have two advantages:
 First: they are quite simple to implement.
 Second: data tend to remain near their home address.
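A minimal sketch of a linear probe insert. The table layout (a Python list with `None` for empty elements), the wrap-around at the end of the list, and the occupied addresses 1 and 2 are assumptions made for illustration.

```python
def linear_probe_insert(table, home, key):
    """Insert key at the first free address at or after home.

    Returns (address, number_of_probes).
    """
    size = len(table)
    for step in range(size):
        addr = (home + step) % size    # add 1 each time, wrapping around
        if table[addr] is None:
            table[addr] = key
            return addr, step + 1
    raise RuntimeError("hash list is full")


table = [None] * 307
table[1] = "occupied"
table[2] = "occupied"
print(linear_probe_insert(table, 1, 166702))   # (3, 3)
```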
Quadratic Probe
 Primary clustering, although not necessarily secondary clustering, can be
eliminated by adding a value other than 1 to the current address.
 One easily implemented method is to use the quadratic probe, in which the increment is the collision probe number squared.
 Disadvantage:
 The time required to square the probe number.
 We can eliminate the multiplication, however, by using an increment factor that increases by 2 with each probe.
 Adding the increment factor to the previous increment gives us the next increment.
 The quadratic probe has one limitation:
 It is not possible to generate a new address for every element in the list.
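The increment trick can be sketched as follows. The sketch assumes each increment is added to the current address and that addresses wrap modulo the list size; some texts add the square to the home address instead. Names are hypothetical.

```python
def quadratic_increments(count: int):
    """Yield 1, 4, 9, ... using additions only."""
    increment, factor = 1, 3
    for _ in range(count):
        yield increment
        increment += factor   # previous increment + increment factor
        factor += 2           # the factor grows by 2 with each probe


def quadratic_probe_sequence(home: int, list_size: int, count: int):
    """Return the first `count` probe addresses after a collision at home."""
    addr, sequence = home, []
    for inc in quadratic_increments(count):
        addr = (addr + inc) % list_size
        sequence.append(addr)
    return sequence


print(list(quadratic_increments(4)))          # [1, 4, 9, 16]
print(quadratic_probe_sequence(1, 307, 4))    # [2, 6, 15, 31]
```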
Pseudorandom Collision Resolution
 The last two open addressing methods, pseudorandom collision resolution and key offset, are collectively known as double hashing.
 In each method, rather than using an arithmetic probe function, the address is rehashed.
 Pseudorandom collision resolution uses a pseudorandom number to resolve the collision.
 We saw the pseudorandom-number generator as a hashing method; we now use it as a collision resolution method. In this case, rather than use the key as a factor in the random-number calculation, we use the collision address.
 For a collision at address 1, we resolve the collision using the following pseudorandom-number generator, where a is 3 and c is 5:
 y = (ax + c) modulo listSize
 = (3 * 1 + 5) modulo 307
 = 8
Key Offset
 Key offset is a double hashing method that produces different collision paths for different keys.
 Whereas the pseudorandom-number generator produces a new address as a function of the previous address only, key offset calculates the new address as a function of the old address and the key.
offset = ⌊key / listSize⌋ (the integer quotient)
address = ((offset + old address) modulo listSize)
Example
 When the key is 166702 and the list size is 307, the modulo-division hashing method generates a home address of 1.
offset = ⌊166702 / 307⌋ = 543
address = ((543 + 001) modulo 307) = 237
 If 237 were also a collision, we would repeat the process to locate the next address:
offset = ⌊166702 / 307⌋ = 543
address = ((543 + 237) modulo 307) = 166
Key      Home address   Key offset   Probe 1   Probe 2
166702   1              543          237       166
572556   1              1865         024       047
067234   1              219          220       132
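The table can be regenerated with the sketch below. The function name is hypothetical; the home address is computed with plain modulo division, as in the example.

```python
def key_offset_probe(key: int, old_addr: int, list_size: int) -> int:
    """Next address = (integer quotient of key / list_size + old address) mod list_size."""
    offset = key // list_size
    return (offset + old_addr) % list_size


LIST_SIZE = 307
for key in (166702, 572556, 67234):
    home = key % LIST_SIZE
    probe1 = key_offset_probe(key, home, LIST_SIZE)
    probe2 = key_offset_probe(key, probe1, LIST_SIZE)
    print(key, home, key // LIST_SIZE, probe1, probe2)
# 166702 1 543 237 166
# 572556 1 1865 24 47
# 67234 1 219 220 132
```

All three keys share home address 1, yet each follows a different collision path because the offset depends on the key.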
Linked list Collision Resolution
 A major disadvantage of open addressing is that each collision resolution increases the probability of future collisions.
 This disadvantage is eliminated by the linked list approach.
 A linked list is an ordered collection of data in which each element contains the location of the next element.
[Figure: linked list collision resolution. A prime area with addresses [000] to [306]; records shown: 379452 Marry Dodd, 070918 Sarah Trapp, 121267 Bryan Devaux, 378845 Patrick Linn, 160252 Tuan Ngo, 045128 Feldman, 166702 Harry Eagle, 572556 Chris Walljasper.]
 The linked list approach uses a separate area to store collisions and chains all synonyms together in a linked list.
 It uses two storage areas: the prime area and the overflow area.
 Each element in the prime area contains an additional field, a link head pointer to a linked list of overflow data in the overflow area.
 When a collision occurs, one element is stored in the prime area and the rest are chained to the corresponding linked list in the overflow area.
 The overflow area is typically implemented as a linked list in dynamic memory.
 The linked list data can be stored in any order, but a LIFO sequence or a key sequence is the most common.
 The LIFO sequence is the fastest for inserts because the linked list does not need to be scanned to insert data.
 The element being inserted into overflow is simply placed at the beginning of the linked list and linked to the node in the prime area.
 In key-sequenced lists, the key in the prime area is the smallest, which provides for faster search retrieval.
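A minimal sketch of linked list collision resolution with LIFO overflow chains. The class and field names are hypothetical, and plain modulo division is assumed as the hash function.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OverflowNode:
    key: int
    next: Optional["OverflowNode"] = None


class ChainedTable:
    def __init__(self, size: int):
        self.size = size
        self.prime = [None] * size   # prime area: one key per home address
        self.heads = [None] * size   # link head pointers into the overflow area

    def insert(self, key: int) -> None:
        addr = key % self.size
        if self.prime[addr] is None:
            self.prime[addr] = key
        else:
            # LIFO: the new node goes to the front, so no scan is needed.
            self.heads[addr] = OverflowNode(key, self.heads[addr])

    def chain(self, addr: int):
        """Return the prime-area key followed by its overflow keys."""
        keys = [] if self.prime[addr] is None else [self.prime[addr]]
        node = self.heads[addr]
        while node is not None:
            keys.append(node.key)
            node = node.next
        return keys


table = ChainedTable(307)
for key in (166702, 572556, 67234):   # all three hash to address 1
    table.insert(key)
print(table.chain(1))   # [166702, 67234, 572556]
```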
Bucket Hashing
 Keys are hashed to buckets, nodes that accommodate multiple data occurrences.
 Because a bucket can hold multiple data, collisions are postponed until the bucket is full.
Example
 Each address is large enough to hold data for three employees.
 A collision will not occur until we try to add a fourth employee to an address.
Two problems
 It uses more space because many of the buckets are empty or partially empty at any given time.
 It does not completely resolve the collision problem.
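The bucket idea can be sketched as below, assuming three slots per bucket as in the example and plain modulo division as the hash function. The class name and the fourth key, 308, are made up for illustration.

```python
class BucketTable:
    def __init__(self, num_buckets: int, bucket_size: int = 3):
        self.buckets = [[] for _ in range(num_buckets)]
        self.bucket_size = bucket_size

    def insert(self, key: int) -> bool:
        """Store key in its bucket; return False if the bucket is full."""
        bucket = self.buckets[key % len(self.buckets)]
        if len(bucket) < self.bucket_size:
            bucket.append(key)
            return True
        return False   # collision: must be resolved some other way


table = BucketTable(307)
for key in (166702, 572556, 67234, 308):   # all four hash to bucket 1
    print(key, table.insert(key))
# 166702 True
# 572556 True
# 67234 True
# 308 False
```

The fourth insert fails, which is the second problem noted above: a full bucket still needs one of the other collision resolution methods.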
[Figure: bucket hashing. Each bucket holds up to three records; the buckets shown are labeled 0, 1, 2, and 307. Records shown: 379452 Marry Dodd, 070918 Sarah Trapp, 166702 Harry Eagle, 367173 Ann Giorgis, 121267 Bryan Devaux, 572556 Chris Walljasper, 045128 Feldman.]
Combination Approaches
 There are several approaches to resolving collisions.
 As we saw with the hashing methods, a complex implementation often uses
multiple steps.
 Example:
 One large database implementation hashes to a bucket.
 If the bucket is full, it uses a set number of linear probes, such as three, to
resolve the collision and then uses a linked list overflow area.

More Related Content

What's hot

ip addressing_&_subnetting_made_easy
 ip addressing_&_subnetting_made_easy ip addressing_&_subnetting_made_easy
ip addressing_&_subnetting_made_easyManjit Singh
 
Logarithmic function, equation and inequality
Logarithmic function, equation and inequalityLogarithmic function, equation and inequality
Logarithmic function, equation and inequalityFelina Victoria
 
Logarithm presentation - By Your Powers Combined
Logarithm presentation - By Your Powers CombinedLogarithm presentation - By Your Powers Combined
Logarithm presentation - By Your Powers CombinedcurtishoierCSUF
 
digital logic circuits, digital component floting and fixed point
 digital logic circuits, digital component floting and fixed point digital logic circuits, digital component floting and fixed point
digital logic circuits, digital component floting and fixed pointRai University
 
Lec7 8 9_10 coding techniques
Lec7 8 9_10 coding techniquesLec7 8 9_10 coding techniques
Lec7 8 9_10 coding techniquesDom Mike
 
Digital Logic Design-Lecture 5
Digital Logic Design-Lecture 5Digital Logic Design-Lecture 5
Digital Logic Design-Lecture 5Samia Sultana
 
Data Reprersentation
Data Reprersentation  Data Reprersentation
Data Reprersentation Kamal Acharya
 
NETWORK LAYER - Logical Addressing
NETWORK LAYER - Logical AddressingNETWORK LAYER - Logical Addressing
NETWORK LAYER - Logical AddressingPankaj Debbarma
 
Chapter 11 - Sorting and Searching
Chapter 11 - Sorting and SearchingChapter 11 - Sorting and Searching
Chapter 11 - Sorting and SearchingEduardo Bergavera
 

What's hot (18)

Digital Logic
Digital LogicDigital Logic
Digital Logic
 
ip addressing_&_subnetting_made_easy
 ip addressing_&_subnetting_made_easy ip addressing_&_subnetting_made_easy
ip addressing_&_subnetting_made_easy
 
Logarithmic function, equation and inequality
Logarithmic function, equation and inequalityLogarithmic function, equation and inequality
Logarithmic function, equation and inequality
 
Logarithm presentation - By Your Powers Combined
Logarithm presentation - By Your Powers CombinedLogarithm presentation - By Your Powers Combined
Logarithm presentation - By Your Powers Combined
 
Compression ii
Compression iiCompression ii
Compression ii
 
digital logic circuits, digital component floting and fixed point
 digital logic circuits, digital component floting and fixed point digital logic circuits, digital component floting and fixed point
digital logic circuits, digital component floting and fixed point
 
Data structure algorithm
Data structure algorithmData structure algorithm
Data structure algorithm
 
Lecft3data
Lecft3dataLecft3data
Lecft3data
 
Lec7 8 9_10 coding techniques
Lec7 8 9_10 coding techniquesLec7 8 9_10 coding techniques
Lec7 8 9_10 coding techniques
 
Lecture3b searching
Lecture3b searchingLecture3b searching
Lecture3b searching
 
Digital Logic Design-Lecture 5
Digital Logic Design-Lecture 5Digital Logic Design-Lecture 5
Digital Logic Design-Lecture 5
 
Data Reprersentation
Data Reprersentation  Data Reprersentation
Data Reprersentation
 
Logarithms
LogarithmsLogarithms
Logarithms
 
NETWORK LAYER - Logical Addressing
NETWORK LAYER - Logical AddressingNETWORK LAYER - Logical Addressing
NETWORK LAYER - Logical Addressing
 
Chapter 11 - Sorting and Searching
Chapter 11 - Sorting and SearchingChapter 11 - Sorting and Searching
Chapter 11 - Sorting and Searching
 
Compression Ii
Compression IiCompression Ii
Compression Ii
 
Coding
CodingCoding
Coding
 
Lecture3a sorting
Lecture3a sortingLecture3a sorting
Lecture3a sorting
 

Similar to Data structure Unit-I Part-C

Hashing.pptx
Hashing.pptxHashing.pptx
Hashing.pptxkratika64
 
Hash in datastructures by using the c language.pptx
Hash in datastructures by using the c language.pptxHash in datastructures by using the c language.pptx
Hash in datastructures by using the c language.pptxmy6305874
 
Data Structures Design Notes.pdf
Data Structures Design Notes.pdfData Structures Design Notes.pdf
Data Structures Design Notes.pdfAmuthachenthiruK
 
358 33 powerpoint-slides_15-hashing-collision_chapter-15
358 33 powerpoint-slides_15-hashing-collision_chapter-15358 33 powerpoint-slides_15-hashing-collision_chapter-15
358 33 powerpoint-slides_15-hashing-collision_chapter-15sumitbardhan
 
DS Unit 1.pptx
DS Unit 1.pptxDS Unit 1.pptx
DS Unit 1.pptxchin463670
 
HASHING IS NOT YASH IT IS HASH.pptx
HASHING IS NOT YASH IT IS HASH.pptxHASHING IS NOT YASH IT IS HASH.pptx
HASHING IS NOT YASH IT IS HASH.pptxJITTAYASHWANTHREDDY
 
Hashing Technique In Data Structures
Hashing Technique In Data StructuresHashing Technique In Data Structures
Hashing Technique In Data StructuresSHAKOOR AB
 
Hashing and File Structures in Data Structure.pdf
Hashing and File Structures in Data Structure.pdfHashing and File Structures in Data Structure.pdf
Hashing and File Structures in Data Structure.pdfJaithoonBibi
 
Hashing techniques, Hashing function,Collision detection techniques
Hashing techniques, Hashing function,Collision detection techniquesHashing techniques, Hashing function,Collision detection techniques
Hashing techniques, Hashing function,Collision detection techniquesssuserec8a711
 
Sienna 9 hashing
Sienna 9 hashingSienna 9 hashing
Sienna 9 hashingchidabdu
 
Hashing notes data structures (HASHING AND HASH FUNCTIONS)
Hashing notes data structures (HASHING AND HASH FUNCTIONS)Hashing notes data structures (HASHING AND HASH FUNCTIONS)
Hashing notes data structures (HASHING AND HASH FUNCTIONS)Kuntal Bhowmick
 
Advance algorithm hashing lec II
Advance algorithm hashing lec IIAdvance algorithm hashing lec II
Advance algorithm hashing lec IISajid Marwat
 

Similar to Data structure Unit-I Part-C (20)

Hashing 1
Hashing 1Hashing 1
Hashing 1
 
Hashing .pptx
Hashing .pptxHashing .pptx
Hashing .pptx
 
Hashing.pptx
Hashing.pptxHashing.pptx
Hashing.pptx
 
Hash in datastructures by using the c language.pptx
Hash in datastructures by using the c language.pptxHash in datastructures by using the c language.pptx
Hash in datastructures by using the c language.pptx
 
Data Structures Design Notes.pdf
Data Structures Design Notes.pdfData Structures Design Notes.pdf
Data Structures Design Notes.pdf
 
Ch17 Hashing
Ch17 HashingCh17 Hashing
Ch17 Hashing
 
358 33 powerpoint-slides_15-hashing-collision_chapter-15
358 33 powerpoint-slides_15-hashing-collision_chapter-15358 33 powerpoint-slides_15-hashing-collision_chapter-15
358 33 powerpoint-slides_15-hashing-collision_chapter-15
 
DS Unit 1.pptx
DS Unit 1.pptxDS Unit 1.pptx
DS Unit 1.pptx
 
HASHING IS NOT YASH IT IS HASH.pptx
HASHING IS NOT YASH IT IS HASH.pptxHASHING IS NOT YASH IT IS HASH.pptx
HASHING IS NOT YASH IT IS HASH.pptx
 
Hashing Technique In Data Structures
Hashing Technique In Data StructuresHashing Technique In Data Structures
Hashing Technique In Data Structures
 
Hashing and File Structures in Data Structure.pdf
Hashing and File Structures in Data Structure.pdfHashing and File Structures in Data Structure.pdf
Hashing and File Structures in Data Structure.pdf
 
Hashing techniques, Hashing function,Collision detection techniques
Hashing techniques, Hashing function,Collision detection techniquesHashing techniques, Hashing function,Collision detection techniques
Hashing techniques, Hashing function,Collision detection techniques
 
Sienna 9 hashing
Sienna 9 hashingSienna 9 hashing
Sienna 9 hashing
 
Chapter 12 ds
Chapter 12 dsChapter 12 ds
Chapter 12 ds
 
Bigdata analytics
Bigdata analyticsBigdata analytics
Bigdata analytics
 
Hashing notes data structures (HASHING AND HASH FUNCTIONS)
Hashing notes data structures (HASHING AND HASH FUNCTIONS)Hashing notes data structures (HASHING AND HASH FUNCTIONS)
Hashing notes data structures (HASHING AND HASH FUNCTIONS)
 
Hashing
HashingHashing
Hashing
 
Advance algorithm hashing lec II
Advance algorithm hashing lec IIAdvance algorithm hashing lec II
Advance algorithm hashing lec II
 
Merge radix-sort-algorithm
Merge radix-sort-algorithmMerge radix-sort-algorithm
Merge radix-sort-algorithm
 
Merge radix-sort-algorithm
Merge radix-sort-algorithmMerge radix-sort-algorithm
Merge radix-sort-algorithm
 

More from SSN College of Engineering, Kalavakkam

More from SSN College of Engineering, Kalavakkam (20)

ECG
ECG ECG
ECG
 
Localization, Classification, and Evaluation.pdf
Localization, Classification, and Evaluation.pdfLocalization, Classification, and Evaluation.pdf
Localization, Classification, and Evaluation.pdf
 
ADBMS 3a
ADBMS   3aADBMS   3a
ADBMS 3a
 
Exercise 5
Exercise   5Exercise   5
Exercise 5
 
ADBMS Unit-II c
ADBMS Unit-II cADBMS Unit-II c
ADBMS Unit-II c
 
ADBMS Unit-II b
ADBMS Unit-II bADBMS Unit-II b
ADBMS Unit-II b
 
Database Management System - 2a
Database Management System - 2aDatabase Management System - 2a
Database Management System - 2a
 
Database Management System
Database Management SystemDatabase Management System
Database Management System
 
Unit III - Inventory Problems
Unit III - Inventory ProblemsUnit III - Inventory Problems
Unit III - Inventory Problems
 
Unit II B - Game Theory
Unit II B - Game TheoryUnit II B - Game Theory
Unit II B - Game Theory
 
Unit II A - Game Theory
Unit II A - Game TheoryUnit II A - Game Theory
Unit II A - Game Theory
 
Unit V - Queuing Theory
Unit V - Queuing TheoryUnit V - Queuing Theory
Unit V - Queuing Theory
 
Unit IV-Project Management
Unit IV-Project ManagementUnit IV-Project Management
Unit IV-Project Management
 
Unit I-B
Unit I-BUnit I-B
Unit I-B
 
Unit I-A
Unit I-AUnit I-A
Unit I-A
 
Web technology Unit-II Part-C
Web technology Unit-II Part-CWeb technology Unit-II Part-C
Web technology Unit-II Part-C
 
Data structure unit I part B
Data structure unit I part BData structure unit I part B
Data structure unit I part B
 
Web technology Unit-II Part A
Web technology Unit-II Part AWeb technology Unit-II Part A
Web technology Unit-II Part A
 
Data structure Unit-I Part A
Data structure Unit-I Part AData structure Unit-I Part A
Data structure Unit-I Part A
 
Web technology Unit-I Part E
Web technology Unit-I   Part EWeb technology Unit-I   Part E
Web technology Unit-I Part E
 

Recently uploaded

ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsanshu789521
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon AUnboundStockton
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementmkooblal
 
Meghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentMeghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentInMediaRes1
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxNirmalaLoungPoorunde1
 
History Class XII Ch. 3 Kinship, Caste and Class (1).pptx
History Class XII Ch. 3 Kinship, Caste and Class (1).pptxHistory Class XII Ch. 3 Kinship, Caste and Class (1).pptx
History Class XII Ch. 3 Kinship, Caste and Class (1).pptxsocialsciencegdgrohi
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxRaymartEstabillo3
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentInMediaRes1
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceSamikshaHamane
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...Marc Dusseiller Dusjagr
 
भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,Virag Sontakke
 

Recently uploaded (20)

ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of management
 
Meghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentMeghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media Component
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
History Class XII Ch. 3 Kinship, Caste and Class (1).pptx
History Class XII Ch. 3 Kinship, Caste and Class (1).pptxHistory Class XII Ch. 3 Kinship, Caste and Class (1).pptx
History Class XII Ch. 3 Kinship, Caste and Class (1).pptx
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
 
9953330565 Low Rate Call Girls In Rohini Delhi NCR
9953330565 Low Rate Call Girls In Rohini  Delhi NCR9953330565 Low Rate Call Girls In Rohini  Delhi NCR
9953330565 Low Rate Call Girls In Rohini Delhi NCR
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in Pharmacovigilance
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
 
ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)
 
भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,
 

Data structure Unit-I Part-C

  • 3. Basic Concepts  In a hashed search, the key, through an algorithmic function, determines the location of the data.  We use a hashing algorithm to transform the key into the index that contains the data we need to locate.  Another way to describe hashing is as a key-to-address transformation in which the keys map to addresses in a list.  Hashing is a key-to address mapping process
  • 4.
  • 5.
  • 6.
  • 7.  The address produced by the hashing algorithm is known as the home address.  We call the set of keys that hash to the same location in our list synonyms.  A collision occurs when a hashing algorithm produces an address for an insertion key and that address is already occupied.  The address produced by the hashing algorithm is known as the home address.  The memory that contains all of the home addresses is known as the prime area.  Each calculation of an address and test for success is known as a probe.
  • 8.
  • 9. Hashing Methods There are eight hashing methods they are:  Direct method  Substraction method  Modulo-division  Midsquare  Digit extraction  Rotation  Folding  Pseudorandom generation
  • 10.
  • 11. Direct Method:  In direct hashing the key is the address without any algorithmic manipulation.  Direct hashing is limited, but it can be very powerful because it guarantees that there are no synonyms and therefore no collision.
  • 12.
  • 13. Subtraction Method  Sometimes keys are consecutive but do not start from 1.  Example:  A company may have only 100 employees, but the employee numbers start from 1001 and go to 1100.  In this case we use subtraction hashing, a very simple hashing function that subtracts 1000 from the key to determine the address.  The direct and subtraction hash functions both guarantee a search effort of one with no collisions.  They are 'one-to-one hashing methods: only one key hashes to each address.
  • 14. Modulo-Division Method: Also known as division remainder, the modulo-division method divides the key by the array size and uses the remainder for the address. This gives us the simple hashing algorithm Address = key MODULO listSize, in which listSize is the number of elements in the array.
  • 15. Example: the keys are 137456, 214562, and 140145, and listSize is 19. Here 1 is added to the remainder, so the addresses run from 1 to 19: 137456 % 19 + 1 = 11; 214562 % 19 + 1 = 15; 140145 % 19 + 1 = 2.
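The modulo-division example can be checked with a short Python sketch. The function name is an assumption; the list size of 19 and the added 1 follow the worked example rather than the bare formula.

```python
def modulo_division_hash(key, list_size=19):
    """Remainder of key / list_size, shifted by 1 so addresses run 1..list_size."""
    return key % list_size + 1


for key in (137456, 214562, 140145):
    print(key, "->", modulo_division_hash(key))
```

A prime list size such as 19 is a common choice for this method, since it tends to spread keys more evenly than a size with many small factors.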
  • 16.
  • 17. Digit-Extraction Method: Using digit extraction, selected digits are extracted from the key and used as the address. Example: to hash our six-digit employee number to a three-digit address (000-999), we could select the first, third, and fourth digits (from the left) and use them as the address: 379452 -> 394; 121267 -> 112; 378845 -> 388; 160252 -> 102; 045128 -> 051.
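A minimal sketch of digit extraction, assuming the keys are held as strings so that a leading zero (as in 045128) is preserved. The function name and the zero-based `positions` tuple are illustrative.

```python
def digit_extraction_hash(key, positions=(0, 2, 3)):
    """Pick the first, third, and fourth digits (zero-based 0, 2, 3) of the key."""
    return "".join(key[p] for p in positions)


for key in ("379452", "121267", "378845", "160252", "045128"):
    print(key, "->", digit_extraction_hash(key))
```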
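  • 18. Midsquare Method: In midsquare hashing, the key is squared and the address is selected from the middle of the squared number. Example: 9452^2 = 89340304, so the address is 3403. The limitation is the size of the key: a six-digit key squares to a product of up to twelve digits, so only part of the key may be squared. Here the first three digits of each six-digit key are squared, and the third, fourth, and fifth digits of the six-digit product are used: 379452: 379 * 379 = 143641 -> 364; 121267: 121 * 121 = 014641 -> 464; 378845: 378 * 378 = 142884 -> 288; 160252: 160 * 160 = 025600 -> 560; 045128: 045 * 045 = 002025 -> 202. The same digits must be selected from the product for every key.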
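The midsquare variation in the example (square the first three digits, keep the third to fifth digits of the zero-padded product) might look like this in Python. The function name and the string representation of keys are assumptions.

```python
def midsquare_hash(key):
    """Square the first three digits and take digits 3-5 of the 6-digit product."""
    product = int(key[:3]) ** 2
    digits = f"{product:06d}"      # pad to six digits, e.g. 14641 -> "014641"
    return digits[2:5]


for key in ("379452", "121267", "378845", "160252", "045128"):
    print(key, "->", midsquare_hash(key))

# Squaring a whole (short) key: the middle four digits of 9452 squared.
print(str(9452 ** 2)[2:6])
```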
  • 19. Folding Method: Two folding methods are used: fold shift and fold boundary. Fold shift: the key value is divided into parts whose size matches the size of the required address; the left and right parts are then shifted and added to the middle part. Fold boundary: the left and right numbers are folded on a fixed boundary between them and the center number, so the digits of the two outside values are reversed before the parts are added.
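Both folding methods can be sketched as follows. The key 123456789, the three-digit address size, and the rule of discarding any carry beyond three digits are assumptions chosen for illustration; the slides do not give a worked example.

```python
def split_parts(key, size):
    """Split the key string into consecutive parts of `size` digits."""
    return [key[i:i + size] for i in range(0, len(key), size)]


def fold_shift(key, size=3):
    """Add the parts as they are and keep the low `size` digits."""
    total = sum(int(part) for part in split_parts(key, size))
    return total % 10 ** size


def fold_boundary(key, size=3):
    """Reverse the digits of the two outside parts, then add and keep `size` digits."""
    parts = split_parts(key, size)
    if len(parts) > 1:
        parts[0] = parts[0][::-1]
        parts[-1] = parts[-1][::-1]
    total = sum(int(part) for part in parts)
    return total % 10 ** size


print(fold_shift("123456789"))     # 123 + 456 + 789 = 1368 -> 368
print(fold_boundary("123456789"))  # 321 + 456 + 987 = 1764 -> 764
```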
  • 20.
  • 21. Rotation Method: Rotation is generally not used by itself but is incorporated in combination with other hashing methods. It is most useful when keys are assigned serially. A simple hashing algorithm tends to create synonyms when the keys being hashed are identical except for the last character. Rotating the last character to the front of the key minimizes this effect.
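A small sketch of rotation, using hypothetical serial keys 600101 to 600103 (not from the slides) to show how moving the last digit to the front makes neighboring keys differ in their leading digit.

```python
def rotate_last_to_front(key):
    """Move the last character of the key to the front."""
    return key[-1] + key[:-1]


for key in ("600101", "600102", "600103"):
    print(key, "->", rotate_last_to_front(key))
```

The rotated keys would then be passed to another method, such as folding, rather than used as addresses directly.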
  • 22.
  • 23. Pseudorandom Method: A common random-number generator is y = ax + c. To use the pseudorandom-number generator as a hashing method, we set x to the key, multiply it by the coefficient a, and then add the constant c. The result is then divided by the list size, with the remainder being the hashed address. Example (a = 17, c = 7, key = 121267, list size = 307): y = ((17 * 121267) + 7) modulo 307 = (2061539 + 7) modulo 307 = 2061546 modulo 307 = 41.
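The pseudorandom example can be reproduced with the sketch below. The function name is an assumption; the coefficient 17, constant 7, and list size 307 are the values used in the example.

```python
def pseudorandom_hash(key, a=17, c=7, list_size=307):
    """Hash with y = (a * key + c) modulo list_size."""
    return (a * key + c) % list_size


print(pseudorandom_hash(121267))
```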
  • 24. Hashing Algorithm: The hashing methods may work well when we hash a key to an address in an array, but hashing to large files is generally more complex. Suppose we have an alphanumeric key consisting of up to 30 bytes that we need to hash into a 32-bit address. Step 1: convert the alphanumeric key into a number by adding the American Standard Code for Information Interchange (ASCII) value of each character to an accumulator that will be the address. Step 2: as each character is added, rotate the bits in the address to maximize the distribution of the values. Step 3: after all the characters in the key have been hashed, take the absolute value of the address and then map it into the address range for the file.
  • 25.
  • 26. Analysis: First, the rotation can often be accomplished by an assembly language instruction. If the algorithm is written in a high-level language, the rotation is accomplished by a series of bitwise operations. For our purposes, it is sufficient that the 12 bits at the end of the address are shifted to be the 12 bits at the beginning of the address and the bits at the beginning are shifted to occupy the bit locations at the right. Second, this algorithm actually uses three of the hashing methods: adding each character to the address is a form of fold shift, rotating the address after each addition is rotation, and finally we use modulo division when we map the hashed address into the range of available addresses.
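The three steps and the 12-bit rotation described above could be sketched in Python as follows. This is one possible reading, not the slides' own code: the names `rotate_right_32`, `to_signed_32`, `hash_key`, and `max_addr` are assumptions, the key is assumed to contain only ASCII characters, and the 32-bit address is emulated with a mask because Python integers are unbounded.

```python
MASK32 = 0xFFFFFFFF


def rotate_right_32(value, bits=12):
    """Rotate a 32-bit value right: the low `bits` bits move to the high end."""
    value &= MASK32
    return ((value >> bits) | (value << (32 - bits))) & MASK32


def to_signed_32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    return value - (1 << 32) if value & 0x80000000 else value


def hash_key(key, max_addr):
    addr = 0
    for ch in key:
        addr = (addr + ord(ch)) & MASK32   # step 1: add the character's ASCII value
        addr = rotate_right_32(addr, 12)   # step 2: rotate after each addition
    addr = abs(to_signed_32(addr))         # step 3: absolute value ...
    return addr % max_addr                 # ... then map into the address range


print(hash_key("Bryan Devaux", 307))
```

The rotation is built from two shifts and a bitwise OR, which is the kind of "series of bitwise operations" a high-level language needs when no rotate instruction is exposed.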
  • 27. Collision Resolution: With the exception of the direct and subtraction methods, none of the hashing methods is a one-to-one mapping. Thus, when we hash a new key to an address, we may create a collision. A collision occurs when a hashing algorithm produces an address for an insertion key and that address is already occupied. There are several methods for handling collisions, each of them independent of the hashing algorithm.
  • 28.
  • 29. Concepts: The load factor of a hashed list is the number of elements in the list divided by the number of physical elements allocated for the list, expressed as a percentage. Traditionally, load factor is assigned the symbol alpha (α). With k representing the number of filled elements in the list and n the total number of elements allocated to the list, the formula is α = (k / n) * 100.
  • 30. Computer scientists have identified two distinct types of clusters. (i) Primary clustering occurs when data cluster around a home address; it is easy to identify. (ii) Secondary clustering occurs when data become grouped along a collision path throughout a list; it is not easy to identify. There are two different approaches to resolving collisions: open addressing and linked lists.
  • 31. Open Addressing: The first collision resolution method, open addressing, resolves collisions in the prime area, that is, the area that contains all of the home addresses. When a collision occurs, the prime area addresses are searched for an open or unoccupied element where the new data can be placed.
  • 32. Linear Probe: In a linear probe, the simplest open addressing method, when data cannot be stored at the home address we resolve the collision by adding 1 to the current address. If that address is also filled, we add another 1 to the address, and repeat until we find an empty location. Linear probes have two advantages. First, they are quite simple to implement. Second, data tend to remain near their home address.
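A minimal linear-probe insert, assuming modulo-division hashing into a 307-element list and wrap-around at the end of the list (both assumptions for this sketch). The three keys are ones that appear elsewhere in these slides and all hash to address 1.

```python
def linear_probe_insert(table, key):
    """Store key at its home address, or at the next free address after it."""
    size = len(table)
    home = key % size
    for step in range(size):
        addr = (home + step) % size    # add 1 per probe, wrapping at the end
        if table[addr] is None:
            table[addr] = key
            return addr
    raise OverflowError("hash table is full")


table = [None] * 307
placed = [linear_probe_insert(table, key) for key in (70918, 166702, 572556)]
print(placed)
```

The synonyms land in adjacent cells next to the home address, which is both the advantage noted above and the source of primary clustering.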
  • 33.
  • 34. Quadratic Probe: Primary clustering, although not necessarily secondary clustering, can be eliminated by adding a value other than 1 to the current address. One easily implemented method is the quadratic probe, in which the increment is the collision probe number squared. A disadvantage is the time required to square the probe number. We can eliminate the multiplication, however, by using an increment factor that increases by 2 with each probe; adding the increment factor to the previous increment gives us the next increment. The quadratic probe has one limitation: it is not possible to generate a new address for every element in the list.
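The multiplication-free form of the quadratic probe can be sketched as below. The home address of 1 and the list size of 100 are assumptions chosen for illustration, and each squared increment is added to the current (collision) address, following the wording of the slide.

```python
def quadratic_probe_sequence(home, list_size, probes):
    """Return the addresses visited by `probes` quadratic probes from `home`."""
    addresses = []
    addr = home
    increment = 1      # 1 squared
    factor = 3         # gap between consecutive squares: 3, 5, 7, ...
    for _ in range(probes):
        addr = (addr + increment) % list_size
        addresses.append(addr)
        increment += factor   # 1 -> 4 -> 9 -> 16 ... without multiplying
        factor += 2
    return addresses


print(quadratic_probe_sequence(1, 100, 7))
```

Because the increments grow quickly, the probes skip over many addresses, which is why not every element in the list can be reached.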
  • 35.
  • 36. Pseudorandom Collision Resolution: The last two open addressing methods, pseudorandom collision resolution and key offset, are collectively known as double hashing. In each method, rather than using an arithmetic probe function, the address is rehashed. Pseudorandom collision resolution uses a pseudorandom number to resolve the collision. The pseudorandom-number generator was introduced as a hashing method; we now use it as a collision resolution method. In this case, rather than using the key as a factor in the random-number calculation, we use the collision address. For a collision at address 1, with a = 3 and c = 5: y = (ax + c) modulo listSize = (3 * 1 + 5) modulo 307 = 8.
  • 37. Key Offset: Key offset is a double hashing method that produces different collision paths for different keys. Whereas the pseudorandom-number generator produces a new address as a function of the previous address alone, key offset calculates the new address as a function of the old address and the key: offset = ⌊key / listSize⌋; address = (offset + old address) modulo listSize. Example: when the key is 166702 and the list size is 307, the modulo-division hashing method generates an address of 1. Then offset = ⌊166702 / 307⌋ = 543 and address = (543 + 001) modulo 307 = 237.
  • 38. Key Offset: If 237 were also a collision, we would repeat the process to locate the next address: offset = ⌊166702 / 307⌋ = 543; address = (543 + 237) modulo 307 = 166.
        Key      Home address   Key offset   Probe 1   Probe 2
        166702   1              543          237       166
        572556   1              1865         024       047
        067234   1              219          220       132
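The two double hashing methods can be compared in a short sketch. The function names are assumptions; a = 3, c = 5, and the list size of 307 follow the examples in these slides.

```python
def pseudorandom_probe(collision_addr, a=3, c=5, list_size=307):
    """Rehash the collision address with y = (a * x + c) modulo list_size."""
    return (a * collision_addr + c) % list_size


def key_offset_probe(key, old_addr, list_size=307):
    """New address from the old address plus the quotient key // list_size."""
    offset = key // list_size
    return (offset + old_addr) % list_size


print(pseudorandom_probe(1))

# All three keys share home address 1 but follow different collision paths.
for key in (166702, 572556, 67234):
    probe1 = key_offset_probe(key, key % 307)
    probe2 = key_offset_probe(key, probe1)
    print(key, key % 307, key // 307, probe1, probe2)
```

With the pseudorandom probe every key colliding at address 1 moves to the same next address, whereas key offset sends each synonym down its own path.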
  • 39. Linked List Collision Resolution: A major disadvantage of open addressing is that each collision resolution increases the probability of future collisions. This disadvantage is eliminated by the linked list approach. A linked list is an ordered collection of data in which each element contains the location of the next element.
  • 40. Linked List Collision Resolution (figure): prime area addresses [000] to [008], [305], and [306]. Records shown: 379452 Marry Dodd; 070918 Sarah Trapp; 121267 Bryan Devaux; 378845 Patrick Linn; 160252 Tuan Ngo; 045128 Feldman; 166702 Harry Eagle; 572556 Chris Walljasper.
  • 41. Linked List Collision Resolution: This method uses a separate area to store collisions and chains all synonyms together in a linked list. There are two storage areas: the prime area and the overflow area. Each element in the prime area contains an additional field, a link head pointer to a linked list of overflow data in the overflow area. When a collision occurs, one element is stored in the prime area and chained to its corresponding linked list in the overflow area. The overflow area is typically implemented as a linked list in dynamic memory.
  • 42. Linked List Collision Resolution: The linked list data can be stored in any order, but a LIFO sequence or a key sequence is most common. The LIFO sequence is the fastest for inserts because the linked list need not be scanned to insert data: the element being inserted into overflow is placed at the beginning of the linked list and linked to the node in the prime area. In key-sequenced lists, the key in the prime area is the smallest, which provides for faster search retrieval.
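A minimal sketch of linked list collision resolution with LIFO insertion. The `Node` class, the function names, and the use of modulo-division hashing into a 307-element prime area are assumptions for illustration; the three records are synonyms taken from the slides.

```python
class Node:
    def __init__(self, key, data, next_node=None):
        self.key = key
        self.data = data
        self.next = next_node   # link to the next synonym in the overflow area


def chained_insert(prime_area, key, data):
    home = key % len(prime_area)
    head = prime_area[home]
    if head is None:
        prime_area[home] = Node(key, data)       # first key stays in the prime area
    else:
        head.next = Node(key, data, head.next)   # LIFO: new node goes first in overflow


def chained_search(prime_area, key):
    node = prime_area[key % len(prime_area)]
    while node is not None:
        if node.key == key:
            return node.data
        node = node.next
    return None


prime_area = [None] * 307
for key, name in [(70918, "Sarah Trapp"),
                  (166702, "Harry Eagle"),
                  (572556, "Chris Walljasper")]:
    chained_insert(prime_area, key, name)

print(chained_search(prime_area, 166702))
```

Inserting at the head of the overflow list avoids scanning it, at the cost of leaving the synonyms unordered.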
  • 43. Bucket Hashing: Keys are hashed to buckets, nodes that accommodate multiple data occurrences. Because a bucket can hold multiple data, collisions are postponed until the bucket is full. Example: if each address is large enough to hold data for three employees, a collision will not occur until we try to add a fourth employee to an address. There are two problems. First, bucket hashing uses more space because many of the buckets are empty or partially empty at any time. Second, it does not completely resolve the collision problem.
  • 44. Bucket Hashing (figure): buckets labeled [000] Bucket 0, [001] Bucket 1, [002] Bucket 2, and [003] Bucket 307. Records shown: 379452 Marry Dodd; 070918 Sarah Trapp; 166702 Harry Eagle; 367173 Ann Giorgis; 121267 Bryan Devaux; 572556 Chris Walljasper; 045128 Feldman.
  • 45. Combination Approaches: There are several approaches to resolving collisions. As we saw with the hashing methods, a complex implementation often uses multiple steps. Example: one large database implementation hashes to a bucket. If the bucket is full, it uses a set number of linear probes, such as three, to resolve the collision and then uses a linked list overflow area.
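The combination described above (bucket, then a fixed number of linear probes, then overflow) might be sketched like this. The bucket capacity of three and the limit of three probes come from the slides; the function name, modulo-division hashing, and the plain Python list standing in for the linked list overflow area are assumptions.

```python
BUCKET_CAPACITY = 3
MAX_PROBES = 3


def combined_insert(buckets, overflow, key):
    """Return the bucket number used, or None if the key went to overflow."""
    size = len(buckets)
    home = key % size
    # Try the home bucket, then up to MAX_PROBES neighboring buckets.
    for step in range(MAX_PROBES + 1):
        addr = (home + step) % size
        if len(buckets[addr]) < BUCKET_CAPACITY:
            buckets[addr].append(key)
            return addr
    overflow.append(key)   # stand-in for chaining into the overflow area
    return None


buckets = [[] for _ in range(307)]
overflow = []
used = [combined_insert(buckets, overflow, key)
        for key in (70918, 166702, 367173, 572556)]
print(used)
```

The first three synonyms fill bucket 1, and the fourth is placed in bucket 2 by a single linear probe, so no overflow is needed until four consecutive buckets are full.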