- 1. Introduction to Hashing Techniques: Objectives. In this lesson, you will learn to:
  - Randomly access data by using a hash index
  - Implement a hashing function
  - Define different hashing techniques
  - Define collision and how collisions are handled
  - Create a hash index and use it to randomly access data from a file, using the key field

  Introduction to Hashing Techniques/Lesson 8/Slide 1 of 21 ©NIIT
- 2. Hashing: Hashing means converting a key to an address in order to retrieve a record. Given a key, the offset of the record can be calculated with the following formula: offset = key * record length. This assumes keys are numbered from 0; if keys start at 1, the offset is (key - 1) * record length.
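The offset formula can be sketched in a few lines. This is a minimal illustration, not code from the lesson: the function name `record_offset` is hypothetical, and the 12-byte record length is borrowed from the city/population record used later in the lesson.

```python
RECORD_LENGTH = 12  # assumed: 10-byte city + 2-byte population


def record_offset(key: int, record_length: int = RECORD_LENGTH) -> int:
    """Return the byte offset of the record whose zero-based key is `key`."""
    if key < 0:
        raise ValueError("key must be non-negative")
    return key * record_length


print(record_offset(0))  # 0
print(record_offset(3))  # 36
```

With keys numbered from 1, the same function would be called as `record_offset(key - 1)`.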
- 3. Hashing Functions: Given a key, the hash function converts it into a hash value (location) within the range 1 to n, where n is the size of the storage (address) space that has been allocated for the records. The record is then retrieved from the location generated. Division, in which the hash value is derived from the remainder of the key divided by n, is one of the most commonly used hashing functions.
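A division-based hashing function might look like the following sketch. The name `division_hash` is an assumption for illustration; adding 1 to the remainder shifts the result from 0..n-1 into the range 1..n used on the slide.

```python
def division_hash(key: int, n: int) -> int:
    """Map an integer key to a location in the range 1..n using the remainder."""
    if n <= 0:
        raise ValueError("n must be positive")
    return key % n + 1


print(division_hash(1234, 100))  # 35
print(division_hash(200, 100))   # 1
```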
- 4. Hashing Techniques: Two hashing techniques commonly employed are:
  - Hash indexes
  - Hash tables

  The goal of both techniques is the same: to identify the location of a data record in a file by using a key-to-address transformation.
- 5. Hash Indexes: An index created by placing keys in locations calculated using a hashing function is called a hash index file. It contains a key-offset pair corresponding to each record in the data file. The technique employing a hash index uses the following two files:
  - A file containing the data records
  - A hash index file
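The two-file arrangement can be sketched as follows. Several things here are assumptions made for illustration: an `io.BytesIO` buffer stands in for the data file, the index is held in memory as a list rather than written to a second file, `slot_for` is a hypothetical hashing function based on the first letter of the key, and collisions are rejected rather than resolved. The sample cities and populations are made up.

```python
import io
import struct

RECORD = struct.Struct("<10sH")  # assumed layout: 10-byte city, 2-byte population
INDEX_SLOTS = 26                 # assumed index size, one slot per letter


def slot_for(key: str) -> int:
    """Hypothetical hashing function: zero-based position of the first letter."""
    return ord(key[0].upper()) - ord("A")


def add_record(data, index, city: str, population: int) -> None:
    """Append the record to the data file and store (key, offset) in the index."""
    slot = slot_for(city)
    if index[slot] is not None:
        raise KeyError(f"collision at slot {slot}")  # not handled in this sketch
    offset = data.seek(0, io.SEEK_END)
    data.write(RECORD.pack(city.encode("ascii"), population))
    index[slot] = (city, offset)


def find_record(data, index, city: str):
    """Look up the offset in the index, then read the record at that offset."""
    entry = index[slot_for(city)]
    if entry is None or entry[0] != city:
        return None
    data.seek(entry[1])
    raw_city, population = RECORD.unpack(data.read(RECORD.size))
    return raw_city.rstrip(b"\x00").decode("ascii"), population


data = io.BytesIO()
index = [None] * INDEX_SLOTS
add_record(data, index, "Delhi", 9000)  # made-up sample values
add_record(data, index, "Agra", 1200)
print(find_record(data, index, "Agra"))  # ('Agra', 1200)
```

Note that the data records are stored in arrival order; only the index entries are placed by hash value.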
- 6. Hash Tables: Hash tables make use of the data file only. The technique involves calculating a location based on the value of a key. In this method, the whole record is inserted at the calculated position in the data file, which is itself the hash table.
- 7. Collisions: An attempt to store two keys at the same position is known as a collision. Collisions can occur irrespective of the hashing function used, because the number of possible keys is usually larger than the number of available positions.
- 8. Collision Processing: Rehashing. This method involves using a secondary hash function, called a rehash function, on the hash value of the key. The rehash function is applied successively until an empty position is found.
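Rehashing can be sketched as below. The lesson does not say which rehash function to use, so this example assumes the simplest one, stepping to the next position with wrap-around; `primary_hash`, `rehash`, `insert` and `search` are hypothetical names.

```python
def primary_hash(key: int, n: int) -> int:
    """Assumed primary hashing function: remainder of the key divided by n."""
    return key % n


def rehash(position: int, n: int) -> int:
    """Assumed rehash function: move to the next position, wrapping around."""
    return (position + 1) % n


def insert(table: list, key: int) -> int:
    """Store the key, rehashing until an empty position is found."""
    n = len(table)
    position = primary_hash(key, n)
    for _ in range(n):
        if table[position] is None or table[position] == key:
            table[position] = key
            return position
        position = rehash(position, n)
    raise OverflowError("table is full")


def search(table: list, key: int):
    """Follow the same sequence of positions that insert would have used."""
    n = len(table)
    position = primary_hash(key, n)
    for _ in range(n):
        if table[position] is None:
            return None
        if table[position] == key:
            return position
        position = rehash(position, n)
    return None


table = [None] * 7
print(insert(table, 10))  # 3
print(insert(table, 17))  # 4 (position 3 is taken, so the key is rehashed)
print(search(table, 17))  # 4
```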
- 9. Collision Processing (Contd.): Chaining. This method uses links (pointers) to resolve hash clashes. Two chaining techniques are:
  - Coalesced chaining
  - Separate chaining
- 10. Coalesced Chaining: Records that collide are linked into a chain, so a lookup follows links from the home position instead of probing repeatedly for the same hash value. It requires the storage area to be divided into two parts:
  - A prime hash area
  - An overflow area
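One way to picture the prime area and the overflow area is the sketch below. It is an illustration under stated assumptions rather than the lesson's implementation: keys are integers hashed by remainder into the prime area, colliding keys take the next unused overflow slot, and the class name `CoalescedTable` and its methods are hypothetical. Deletion is not covered.

```python
class CoalescedTable:
    """Prime hash area followed by an overflow area, joined by links."""

    def __init__(self, prime_size: int, overflow_size: int):
        total = prime_size + overflow_size
        self.prime_size = prime_size
        self.keys = [None] * total
        self.links = [-1] * total      # -1 marks the end of a chain
        self.next_free = prime_size    # next unused slot in the overflow area

    def insert(self, key: int) -> int:
        pos = key % self.prime_size
        if self.keys[pos] is None:
            self.keys[pos] = key
            return pos
        # Walk to the end of the chain that starts at the home position.
        while True:
            if self.keys[pos] == key:
                return pos
            if self.links[pos] == -1:
                break
            pos = self.links[pos]
        if self.next_free >= len(self.keys):
            raise OverflowError("overflow area is full")
        new_pos = self.next_free
        self.next_free += 1
        self.keys[new_pos] = key
        self.links[pos] = new_pos
        return new_pos

    def search(self, key: int):
        pos = key % self.prime_size
        while pos != -1 and self.keys[pos] is not None:
            if self.keys[pos] == key:
                return pos
            pos = self.links[pos]
        return None


t = CoalescedTable(prime_size=5, overflow_size=3)
print(t.insert(7))   # 2 (home position in the prime area)
print(t.insert(12))  # 5 (first overflow slot, linked from position 2)
print(t.search(12))  # 5
```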
- 11. Separate Chaining: In this method, an array of header nodes is used. Each element in the array is a pointer, which stores the address of a distinct linked list. Each linked list is a list of records whose keys have the same hash value. When a record has to be retrieved, the hashing function converts the given key to yield a position (subscript) in the array.
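A minimal sketch of separate chaining follows, assuming integer keys hashed by remainder. The names `Node` and `ChainedTable` are hypothetical; the `headers` list plays the role of the array of header nodes.

```python
class Node:
    """One record in a linked list of records sharing a hash value."""

    def __init__(self, key, record, next_node=None):
        self.key = key
        self.record = record
        self.next = next_node


class ChainedTable:
    def __init__(self, size: int):
        self.headers = [None] * size  # each element points to a linked list

    def _position(self, key: int) -> int:
        return key % len(self.headers)

    def insert(self, key: int, record) -> None:
        pos = self._position(key)
        node = self.headers[pos]
        while node is not None:
            if node.key == key:       # key already present: replace its record
                node.record = record
                return
            node = node.next
        # Link the new node in at the front of the list.
        self.headers[pos] = Node(key, record, self.headers[pos])

    def find(self, key: int):
        node = self.headers[self._position(key)]
        while node is not None:
            if node.key == key:
                return node.record
            node = node.next
        return None


chained = ChainedTable(10)
chained.insert(15, "first")
chained.insert(25, "second")  # same position as 15, so it joins the same list
print(chained.find(25))       # second
```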
- 12. Bucket Hashing: The hashing of a key yields the position of a storage area in which several key entries can be stored. This storage area is called a bucket. The file is divided into a number of such buckets, and each bucket has enough space to store multiple values. When a record has to be retrieved, its key is hashed to give a bucket offset. The bucket is then read into internal memory and searched sequentially.
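The read-then-scan behaviour of bucket hashing can be sketched as follows. The entry layout, the bucket sizes, the use of key 0 as an "empty" marker and all function names are assumptions made for this illustration; an `io.BytesIO` buffer stands in for the file on disk.

```python
import io
import struct

ENTRY = struct.Struct("<II")  # assumed entry layout: key, value (key 0 = empty)
ENTRIES_PER_BUCKET = 4        # assumed bucket capacity
BUCKET_COUNT = 8              # assumed number of buckets in the file
BUCKET_SIZE = ENTRY.size * ENTRIES_PER_BUCKET


def create_file():
    """Create file space: every bucket starts out filled with empty entries."""
    return io.BytesIO(bytes(BUCKET_SIZE * BUCKET_COUNT))


def bucket_offset(key: int) -> int:
    return (key % BUCKET_COUNT) * BUCKET_SIZE


def read_bucket(f, key: int):
    """Read the whole bucket into memory as a list of (key, value) entries."""
    f.seek(bucket_offset(key))
    raw = f.read(BUCKET_SIZE)
    return [ENTRY.unpack_from(raw, i * ENTRY.size) for i in range(ENTRIES_PER_BUCKET)]


def insert(f, key: int, value: int) -> None:
    if key <= 0:
        raise ValueError("keys must be positive; 0 marks an empty entry")
    for i, (stored_key, _) in enumerate(read_bucket(f, key)):
        if stored_key in (0, key):
            f.seek(bucket_offset(key) + i * ENTRY.size)
            f.write(ENTRY.pack(key, value))
            return
    raise OverflowError("bucket is full")


def find(f, key: int):
    if key <= 0:
        return None
    for stored_key, value in read_bucket(f, key):  # sequential search in memory
        if stored_key == key:
            return value
    return None


bucket_file = create_file()
insert(bucket_file, 3, 30)
insert(bucket_file, 11, 110)  # 11 hashes to the same bucket as 3
print(find(bucket_file, 11))  # 110
```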
- 13. Hash Indexes vs. Hash Tables: The choice of hashing method depends on the following factors:
  - Data organization
  - Access speed
  - Disk space requirement
- 14. An Example to Illustrate the Use of a Hash Table: The assumptions made in this example are:
  - The key is an alphanumeric field, the first byte of which is a letter.
  - Only one key exists for a particular hash value.
  - The problem of collisions is not being addressed.

  The file structure assumed is:
  - City: length 10, type String
  - Population: length 2, type Integer
- 15. An Example to Illustrate the Use of a Hash Table (Contd.): The hashing algorithm used is as follows: the first letter of the alphabetic key is extracted, and the position of this letter in the alphabet is used as the hash value. If the first letter in the key is C, the hash value is 3. The number of positions (or buckets) that a key might hash to is therefore 26, the number of letters in the alphabet. The processing required to create the hash table involves the following steps:
  - Creating file space
- 16. An Example to Illustrate the Use of a Hash Table (Contd.):
  - Accepting data
  - Finding the correct bucket
  - Writing to the hash table
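The steps of the example (creating file space, accepting data, finding the correct bucket, writing to the hash table) can be sketched as below. This is one possible reading of the example, not the lesson's own solution: an `io.BytesIO` buffer stands in for the file on disk, the little-endian `struct` layout and all function names are assumptions, and the sample record is made up. As in the example, collisions are not addressed, so a second city with the same first letter would overwrite the first.

```python
import io
import struct

RECORD = struct.Struct("<10sH")  # city: 10 bytes, population: 2-byte integer
BUCKETS = 26                     # one position per letter of the alphabet


def hash_value(city: str) -> int:
    """Position of the first letter in the alphabet (C gives 3)."""
    first = city[:1].upper()
    if not ("A" <= first <= "Z"):
        raise ValueError("key must start with a letter")
    return ord(first) - ord("A") + 1


def create_file_space():
    """Step 1: reserve room for 26 empty records."""
    return io.BytesIO(bytes(RECORD.size * BUCKETS))


def write_record(table, city: str, population: int) -> None:
    """Steps 3 and 4: find the correct bucket, then write the whole record."""
    table.seek((hash_value(city) - 1) * RECORD.size)
    table.write(RECORD.pack(city.encode("ascii"), population))


def read_record(table, key: str):
    """Direct access: hash the key, seek to the bucket, read one record."""
    table.seek((hash_value(key) - 1) * RECORD.size)
    raw_city, population = RECORD.unpack(table.read(RECORD.size))
    city = raw_city.rstrip(b"\x00").decode("ascii")
    return (city, population) if city == key else None


table = create_file_space()
write_record(table, "Chennai", 4200)  # step 2: accepted data (made-up values)
print(hash_value("Chennai"))          # 3
print(read_record(table, "Chennai"))  # ('Chennai', 4200)
```

The offset uses `hash_value - 1` because the hash values start at 1 while file offsets start at 0.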
- 17. Problem Statement 8.D.1: Create a hash table for records having the structure given below:
  - City: size 10
  - Population: size 2

  The hashing algorithm used is as follows: the first letter of the alphabetic key is extracted, and the position of this letter in the alphabet is used as the hash value. If the first letter in the key is C, the hash value is 3. The number of positions (or buckets) that a key might hash to is therefore 26, the number of letters in the alphabet.
- 18. Problem Statement 8.D.1 (Contd.): Only one key exists for a particular hash value. In other words, there is only one record per bucket. The problem of collisions is not being addressed. The key is an alphanumeric field, the first byte of which is a letter.
- 19. Summary: In this lesson, you learned that:
  - Hashing is a technique used to access data stored in files.
  - Using hashing techniques, it is possible to calculate the position of a record in a data file from its key field value.
  - The main purpose of hashing is to eliminate unnecessary searching by using direct access to retrieve a record. This is done by transforming the key to yield the offset of the record.
  - An algorithm called a hashing function is used to perform the key-to-address transformation.
- 20. Summary (Contd.): It often happens that more than one key hashes to the same hash value, resulting in a collision. As a result, an attempt is made to store two records in one location. Collisions are processed by using various algorithms, three of which are:
  - Rehashing
  - Linked list collision processing
  - Bucket hashing
- 21. Summary (Contd.): Several hashing strategies can be employed, two of which are:
  - Hash indexing
  - Hash tables