
Before this lecture, students should have seen other forms of a Dictionary, where a collection of data is stored, and each data item has a key associated with it.

What we do now is use an array to implement the dictionary. The array is an array of records. In this example, we could store up to 701 records in the array.

Oftentimes, the empty spots are identified by a special key. For example, if all our identification numbers are positive, then we could use 0 as the Number that indicates an empty spot.

With this drawing, locations [0], [4], [6], and maybe some others would all have Number=0.

a. Take the key mod 701 (which could be anywhere from 0 to 700).

So, quick, what is (580,625,685 mod 701) ?
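
The arithmetic can be checked directly. The following is a minimal sketch, assuming the 701-slot table described in these notes; the names `TABLE_SIZE` and `hash_value` are illustrative and do not come from the slides.

```python
TABLE_SIZE = 701  # number of slots in the example array


def hash_value(key: int) -> int:
    """Map a non-negative key to an index in the range 0..TABLE_SIZE-1."""
    return key % TABLE_SIZE


print(hash_value(580625685))  # prints 3, the answer given on slide 15
```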

When a collision occurs, the insertion process will move forward through the array until an empty spot is found. Sometimes you will have a second collision...
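
The insertion process described above might look like the sketch below. It assumes that 0 marks an empty spot (as suggested earlier in these notes), that the key being inserted is not already in the table, and that the search wraps around from [700] to [0]; the wrap-around and the names `EMPTY` and `insert` are assumptions, not taken from the slides.

```python
TABLE_SIZE = 701
EMPTY = 0  # assumed sentinel: every real key is a positive number


def insert(table: list[int], key: int) -> int:
    """Store key using linear probing and return the index that was used."""
    index = key % TABLE_SIZE
    for _ in range(TABLE_SIZE):
        if table[index] == EMPTY:
            table[index] = key
            return index
        index = (index + 1) % TABLE_SIZE  # collision: move forward, wrapping at the end
    raise OverflowError("hash table is full")


table = [EMPTY] * TABLE_SIZE
for key in (506643548, 233667136, 281942902, 155778322, 580625685):
    insert(table, key)

# 701466868 hashes to 2, but [2], [3] and [4] are already occupied.
print(insert(table, 701466868))  # prints 5
```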

What happens if a search reaches an empty spot? In that case, it can halt and indicate that the key was not in the hash table.

We might do this by using some other special value for the Number field of the record.

In any case, a search cannot stop when it reaches "a location that used to have something here". A search can only stop when it reaches a true empty spot.
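
A sketch of how searching and deleting could work together under these rules. It assumes two special values for the Number field: `EMPTY` (0) for a spot that has never been used and `DELETED` (-1) for a spot that used to hold a record. Both values, the function names, and the wrap-around at the end of the array are illustrative assumptions.

```python
TABLE_SIZE = 701
EMPTY = 0      # assumed marker: the spot has never held a record
DELETED = -1   # assumed marker: the spot used to hold a record


def find(table, key):
    """Return the index of key, or None if the key is not in the table."""
    index = key % TABLE_SIZE
    for _ in range(TABLE_SIZE):
        if table[index] == EMPTY:   # a true empty spot: the search can stop
            return None
        if table[index] == key:
            return index
        index = (index + 1) % TABLE_SIZE  # DELETED spots are passed over
    return None


def insert(table, key):
    """Linear-probing insert; assumes key is not already in the table."""
    index = key % TABLE_SIZE
    for _ in range(TABLE_SIZE):
        if table[index] in (EMPTY, DELETED):  # a DELETED spot can be reused
            table[index] = key
            return index
        index = (index + 1) % TABLE_SIZE
    raise OverflowError("hash table is full")


def delete(table, key):
    """Mark the spot holding key as DELETED; return False if key is absent."""
    index = find(table, key)
    if index is None:
        return False
    table[index] = DELETED
    return True


table = [EMPTY] * TABLE_SIZE
for key in (506643548, 233667136, 281942902, 155778322, 580625685, 701466868):
    insert(table, key)

delete(table, 506643548)        # spot [4] becomes DELETED, not EMPTY
print(find(table, 701466868))   # prints 5: the search passes over [4]
print(find(table, 506643548))   # prints None
```

Had spot [4] been reset to `EMPTY`, the search for 701466868 would have stopped there and wrongly reported that the key is missing.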

- 1. SEMINAR ON HASHING. Presented to: Ms. Manisha. Presented by: Ruchika (6), Ritika (11), MCA 1. S.S.D Women's Institute of Technology, Bathinda (affiliated to Punjabi University, Patiala)
- 2. CONTENTS: What is hashing? Inserting, deleting and searching a record. Hash functions and their methods. Collision. Two ways of collision resolution: 1. open addressing, 2. chaining.
- 3. Hashing: Hashing is a common approach to the storing/searching problem. A collection of data is stored, and each data item has a key associated with it.
- 4. What is a Hash Table? The simplest kind of hash table is an array of records. This example has 701 records. [Diagram: an array of records indexed [0] to [700]]
- 5. What is a Hash Table? Each record has a special field, called its key. In this example, the key is a long integer field called Number. [Diagram: the record at [4] has Number 506643548]
- 6. What is a Hash Table? The number might be a person's identification number, and the rest of the record has information about the person. [Diagram: the record at [4] has Number 506643548]
- 7. What is a Hash Table? When a hash table is in use, some spots contain valid records, and other spots are "empty". [Diagram: records with Number 506643548, 233667136, 281942902 and 155778322 occupy some of the spots]
- 8. The general idea of using the key to determine the address of a record is an excellent idea, but it must be modified so that a great deal of space is not wasted. This modification takes the form of a function H from the set K of keys into the set L of memory addresses: H : K → L (hash function). Chop is a technique in which the pieces of the key K are combined to form the hash address H(k). [Diagram: key, key1, key2, key1.1, key1.2]
- 9. Methods of Hash Functions. Division method: Choose a number m larger than the number n of keys in K. The hash function H is defined by H(k) = k (mod m) or H(k) = k (mod m) + 1, where k (mod m) denotes the remainder when k is divided by m. The second formula is used when we want the hash addresses to range from 1 to m rather than 0 to m-1.
- 10. Example of division method: Each of 68 employees is assigned a unique 4-digit employee number. Suppose L consists of 100 two-digit addresses: 00, 01, 02, ..., 99. We apply the division method to the employee numbers 3205, 7148, 2345. 1. Choose a prime number m close to 99, such as m = 97. Then H(3205) = 4, H(7148) = 67, H(2345) = 17. 2. With the second formula, H(3205) = 4 + 1 = 5, H(7148) = 67 + 1 = 68, H(2345) = 17 + 1 = 18.
- 11. Mid square method: The key k is squared. Then the hash function H is defined by H(k) = l, where l is obtained by deleting digits from both ends of k*k. Example: for k = 3205, k*k = 10 272 025 and H(k) = 72; for k = 7148, k*k = 51 093 904 and H(k) = 93; for k = 2345, k*k = 5 499 025 and H(k) = 99. In each case the fourth and fifth digits of k*k, counting from the right, are kept.
- 12. Folding method: The key k is partitioned into a number of parts, k1, k2, ..., kr. The parts are added together, ignoring the last carry: H(k) = k1 + k2 + ... + kr. Example: H(3205) = 32 + 05 = 37; H(7148) = 71 + 48 = 119, which gives 19 (the leading digit 1 is ignored); H(2345) = 23 + 45 = 68.
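
The three hash function methods on slides 9 to 12 can be reproduced with a short sketch. The function names are illustrative. The mid-square version keeps the fourth and fifth digits from the right, which is an inference from the slide's worked values rather than something the slide states, and the folding version assumes a 4-digit key split into two 2-digit parts.

```python
def division_hash(key: int, m: int = 97, one_based: bool = False) -> int:
    """H(k) = k mod m, or k mod m + 1 when addresses should start at 1."""
    h = key % m
    return h + 1 if one_based else h


def mid_square_hash(key: int) -> int:
    """Square the key and keep the 4th and 5th digits counting from the right."""
    return (key * key // 1000) % 100


def folding_hash(key: int) -> int:
    """Split a 4-digit key into two 2-digit parts, add them, drop the carry."""
    high, low = divmod(key, 100)
    return (high + low) % 100


for key in (3205, 7148, 2345):
    print(key, division_hash(key), division_hash(key, one_based=True),
          mid_square_hash(key), folding_hash(key))
# 3205 4 5 72 37
# 7148 67 68 93 19
# 2345 17 18 99 68
```
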
- 13. Inserting a New Record: In order to insert a new record, the key must somehow be converted to an array index. The index is called the hash value of the key. [Diagram: the table and a new record with Number 580625685]
- 14. Inserting a New Record: A typical way to create a hash value is (Number mod 701). What is (580625685 mod 701)?
- 15. Inserting a New Record: A typical way to create a hash value is (Number mod 701). (580625685 mod 701) is 3.
- 16. Inserting a New Record: The hash value is used for the location of the new record. [Diagram: the record with Number 580625685 is headed for [3]]
- 17. Inserting a New Record: The hash value is used for the location of the new record. [Diagram: the record with Number 580625685 is now in the table]
- 18. Collisions: Here is another new record to insert, Number 701466868, with a hash value of 2.
- 19. Collisions: This is called a collision, because there is already another valid record at [2]. When a collision occurs, move forward until you find an empty spot.
- 22. Collisions: The new record goes in the empty spot.
- 23. Searching for a Key: The data that's attached to a key can be found fairly quickly. [Diagram: a search for Number 701466868]
- 24. Searching for a Key: Calculate the hash value. Check that location of the array for the key. [Diagram: the hash value is [2]; the record there answers "Not me."]
- 25. Searching for a Key: Keep moving forward until you find the key, or you reach an empty spot. [Diagram: the next records answer "Not me."]
- 27. Searching for a Key: Keep moving forward until you find the key, or you reach an empty spot. [Diagram: the matching record answers "Yes!"]
- 28. Searching for a Key: When the item is found, the information can be copied to the necessary location.
- 29. Deleting a Record: Records may also be deleted from a hash table. [Diagram: one record is labelled "Please delete me."]
- 30. Deleting a Record: Records may also be deleted from a hash table. But the location must not be left as an ordinary "empty spot" since that could interfere with searches. [Diagram: the record with Number 506643548 is gone]
- 31. Deleting a Record: Records may also be deleted from a hash table. But the location must not be left as an ordinary "empty spot". The location must be marked in some special way so that a search can tell that the spot used to have something in it.
- 32. COLLISION RESOLUTION: There are two general ways of resolving collisions. The particular procedure that one chooses depends on many factors. One important factor is the ratio of the number n of keys in K (which is the number of records in F) to the number m of hash addresses in L. This ratio, λ = n/m, is called the load factor. Here n is the number of filled records and m is the total number of memory locations.
- 33. The efficiency of a hash function with a collision resolution procedure is measured by the average number of probes (key comparisons) needed to find the location of the record with a given key k. The efficiency depends mainly on the load factor λ. Two quantities are of interest: S(λ) = average number of probes for a successful search; U(λ) = average number of probes for an unsuccessful search.
- 34. Open addressing: If the home slot for the record that is being inserted is already occupied, then simply choose a different location within the table. But how do we choose this alternate location? The technique must be reproducible, and on average be cheap.
- 35. Linear Probing: Linear probing involves simply walking down the table until an empty slot is found.
- 36. Drawbacks of linear probing: The major drawback of linear probing is that, as the table becomes about half full, there is a tendency toward clustering. The sequential searches needed to find an empty position become longer and longer. Example: if a new insertion hashes to location b, then it will go there, but if it hashes to location a, then it will also go into b. [Diagram: adjacent locations a, b, c, d, e]
- 37. The problem of clustering is essentially one of instability. If a few keys happen randomly to be near each other, then it becomes more and more likely that other keys will join them, and the distribution will become progressively more unbalanced.
- 38. Avoid the problem of clustering. Quadratic probing: Suppose a record R with key k has the hash address H(k) = h. Then, instead of searching the locations with addresses h, h+1, h+2, ..., we linearly search the locations with addresses h, h+1, h+4, h+9, h+16, ..., h+i^2, ... If the number m of locations in the table T is a prime number, then the above sequence will access half of the locations in T.
- 39. Double Hashing: A second hash function H' is used for resolving a collision. Suppose a record R with key k has the hash addresses H(k) = h and H'(k) = h' ≠ m. Then we linearly search the locations with addresses h, h+h', h+2h', h+3h', ... If m is a prime number, then the above sequence will access all the locations in the table T.
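
The claims about how much of the table each probe sequence reaches can be checked on a small prime table size. This is a sketch under assumptions: addresses are reduced mod m so that they stay inside the table (the slides list them unreduced), and `step` stands in for h' with 1 ≤ step < m. The function names and the choice m = 11 are illustrative.

```python
def quadratic_probes(h: int, m: int) -> list[int]:
    """Addresses h, h+1, h+4, h+9, ... reduced mod m, for i = 0..m-1."""
    return [(h + i * i) % m for i in range(m)]


def double_hash_probes(h: int, step: int, m: int) -> list[int]:
    """Addresses h, h+h', h+2h', ... reduced mod m; step plays the role of h'."""
    return [(h + i * step) % m for i in range(m)]


m = 11  # a small prime table size, chosen only for illustration
print(len(set(quadratic_probes(3, m))))       # prints 6: about half of the 11 slots
print(len(set(double_hash_probes(3, 4, m))))  # prints 11: every slot is reached
```
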
- 40. Chaining: Design the table so that each slot is actually a container that can hold multiple records. Here, the "chains" are linked lists which could hold any number of colliding records. Alternatively, each table slot could be large enough to store several records directly. In that case the slot may overflow, requiring a fallback…
- 41. EXAMPLE: LIST (indices 1 to 11): 8, 3, 0, 5, 7, 0, 0, 2, 0, 0, 6. Records (index: INFO, LINK): 1: A, 0; 2: B, 0; 3: C, 0; 4: D, 0; 5: E, 1; 6: X, 4; 7: Y, 0; 8: Z, 0.
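
Chaining as described on slide 40 could be sketched as follows. For brevity each chain is a Python list rather than the linked list the slide mentions. The class name, method names, and the reuse of the 701-slot table with (Number mod 701) are illustrative assumptions.

```python
class ChainedHashTable:
    """Hash table in which every slot holds a chain of colliding keys."""

    def __init__(self, size: int = 701) -> None:
        self.size = size
        self.slots: list[list[int]] = [[] for _ in range(size)]

    def insert(self, key: int) -> None:
        chain = self.slots[key % self.size]
        if key not in chain:      # ignore duplicate keys
            chain.append(key)

    def contains(self, key: int) -> bool:
        return key in self.slots[key % self.size]

    def delete(self, key: int) -> bool:
        chain = self.slots[key % self.size]
        if key in chain:
            chain.remove(key)     # no special "deleted" marker is needed
            return True
        return False


table = ChainedHashTable()
table.insert(233667136)   # hash value 2
table.insert(701466868)   # hash value 2 as well: joins the same chain
print(table.slots[2])     # prints [233667136, 701466868]
```

Because colliding keys share a chain instead of spilling into neighbouring slots, deleting a record here does not interfere with later searches.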
