Ds 8
Published in: Technology, Education

  • Lower Bound and Upper Bound to denote the first element number and the last element number respectively
  • Transcript

    • 1. Introduction to Hashing Techniques: Objectives
      In this lesson, you will learn to:
        • Randomly access data by using a hash index
        • Implement a hashing function
        • Define different hashing techniques
        • Define collisions and how they are handled
        • Create a hash index and use it to randomly access data from a file by using the key field
      Introduction to Hashing Techniques/Lesson 8/Slide 1 of 21 ©NIIT
    • 2. Hashing
        • Hashing means converting a key to an address in order to retrieve a record.
        • Given a numeric key and fixed-length records, the offset of the record can be calculated with the following formula: offset = key * record length
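The offset formula can be sketched as follows. This is a minimal illustration, assuming numeric keys that start at 0 and fixed-length records; the names `RECORD_LENGTH` and `record_offset` are hypothetical, and the 12-byte length is borrowed from the city/population record used in the example later in this lesson.

```python
RECORD_LENGTH = 12  # assumed: 10-byte city field + 2-byte population field


def record_offset(key: int, record_length: int = RECORD_LENGTH) -> int:
    """Return the byte offset of the record for a 0-based numeric key."""
    if key < 0:
        raise ValueError("key must be non-negative")
    return key * record_length


print(record_offset(3))  # 36
```

With this convention, the record for key 0 starts at the beginning of the file and each later key is one record length further in.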
    • 3. Hashing Functions
        • Given a key, the hash function converts it into a hash value (location) within the range 1 to n, where n is the size of the storage (address) space that has been allocated for the records.
        • The record is then retrieved from the location generated.
        • Division (taking the remainder after dividing the key by n) is one of the most commonly used hashing functions.
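A division-based hash function might look like the sketch below. It assumes non-negative integer keys and adds 1 to the remainder so that the result falls in the range 1 to n described above; the name `division_hash` is illustrative only.

```python
def division_hash(key: int, n: int) -> int:
    """Map a non-negative integer key to a location in the range 1..n."""
    if n <= 0:
        raise ValueError("n must be positive")
    return key % n + 1


print(division_hash(25, 7))  # 5, because 25 % 7 == 4
```

Choosing n as a prime number is a common way to spread keys more evenly, although the slides do not prescribe a particular value.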
    • 4. Hashing Techniques
        • Two hashing techniques commonly employed are:
            • Hash indexes
            • Hash tables
        • The goal of both techniques is the same: to identify the location of a data record in a file by using a key-to-address transformation.
    • 5. Hash Indexes
        • An index created by placing keys in locations calculated using a hashing function is called a hash index file.
        • It contains a key-offset pair corresponding to each record in the data file.
        • The hash index technique uses two files:
            • A file containing the data records
            • A hash index file
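The two-file arrangement can be sketched in memory as shown below. This is only an illustration under several assumptions: an `io.BytesIO` buffer stands in for the data file, a Python list stands in for the hash index file, the record layout (10-byte city, 2-byte unsigned little-endian population) and the first-letter hash are borrowed from the example later in the lesson, and collisions are ignored. All names and the sample values are made up.

```python
import io
import struct

RECORD = struct.Struct("<10sH")  # assumed layout: 10-byte city, 2-byte population
SLOTS = 26                       # assumed index size: one slot per letter

data_file = io.BytesIO()         # stands in for the data file
hash_index = [None] * SLOTS      # stands in for the hash index file


def slot_for(key: str) -> int:
    """Illustrative hash: 0-based alphabet position of the first letter."""
    return ord(key[0].upper()) - ord("A")


def add_record(city: str, population: int) -> None:
    offset = data_file.seek(0, io.SEEK_END)      # append to the data file
    data_file.write(RECORD.pack(city.encode("ascii"), population))
    hash_index[slot_for(city)] = (city, offset)  # store the key-offset pair


def find_record(city: str):
    entry = hash_index[slot_for(city)]
    if entry is None or entry[0] != city:
        return None
    data_file.seek(entry[1])                     # jump straight to the record
    raw_city, population = RECORD.unpack(data_file.read(RECORD.size))
    return raw_city.rstrip(b"\x00").decode("ascii"), population


add_record("Delhi", 950)     # made-up sample values
add_record("Chennai", 430)
print(find_record("Chennai"))  # ('Chennai', 430)
```

The data records are stored in arrival order; only the index is organized by hash value, which is what distinguishes this technique from a hash table.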
    • 6. Hash Tables
        • Hash tables make use of the data file only.
        • The technique involves calculating a location based on the value of a key.
        • In this method, the whole record is inserted at the calculated position in the data file, which is itself the hash table.
    • 7. Collisions
        • An attempt to store two keys at the same position is known as a collision.
        • Collisions can occur irrespective of the hashing function used.
    • 8. Collision Processing: Rehashing
        • This method involves using a secondary hash function, called a rehash function, on the hash value of the key.
        • The rehash function is applied successively until an empty position is found.
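Rehashing can be sketched as follows. The slides do not specify a rehash function, so this example assumes the simplest one, which steps to the next position and wraps around (often called linear probing); `TABLE_SIZE`, `primary_hash`, `rehash`, and `insert` are hypothetical names.

```python
TABLE_SIZE = 11  # assumed table size


def primary_hash(key: int) -> int:
    return key % TABLE_SIZE


def rehash(position: int) -> int:
    """Assumed rehash function: move to the next position, wrapping around."""
    return (position + 1) % TABLE_SIZE


def insert(table: list, key: int) -> int:
    """Store key in the first empty position found and return that position."""
    position = primary_hash(key)
    for _ in range(TABLE_SIZE):
        if table[position] is None:
            table[position] = key
            return position
        position = rehash(position)  # collision: apply the rehash function again
    raise OverflowError("table is full")


table = [None] * TABLE_SIZE
positions = [insert(table, key) for key in (15, 26, 37)]
print(positions)  # [4, 5, 6]
```

All three keys hash to position 4, so the second and third are placed by applying the rehash function once and twice respectively.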
    • 9. Collision Processing (Contd.): Chaining
        • This method uses links (pointers) to resolve hash clashes.
        • Two chaining techniques are:
            • Coalesced chaining
            • Separate chaining
    • 10. Coalesced Chaining
        • It completely eliminates the possibility that more than one collision will occur, even for the same hash value.
        • It requires the storage area to be divided into two parts:
            • A prime hash area
            • An overflow area
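One way to picture the prime hash area and the overflow area is the sketch below. It is an assumption-laden illustration rather than the lesson's own implementation: keys are integers, the prime area is addressed by `key % prime_size`, colliding keys go into the next free overflow slot, and each slot carries a link to the next slot in its chain. The class name `OverflowChainedTable` is hypothetical.

```python
class OverflowChainedTable:
    """Prime hash area followed by an overflow area, joined by links."""

    def __init__(self, prime_size: int, overflow_size: int) -> None:
        self.prime_size = prime_size
        size = prime_size + overflow_size
        self.keys = [None] * size
        self.links = [-1] * size        # -1 marks the end of a chain
        self.next_free = prime_size     # first unused overflow slot

    def insert(self, key: int) -> int:
        pos = key % self.prime_size
        if self.keys[pos] is None:      # home position in the prime area is free
            self.keys[pos] = key
            return pos
        if self.next_free >= len(self.keys):
            raise OverflowError("overflow area is full")
        while self.links[pos] != -1:    # walk to the end of the chain
            pos = self.links[pos]
        new_pos = self.next_free
        self.next_free += 1
        self.keys[new_pos] = key
        self.links[pos] = new_pos       # link the new overflow slot into the chain
        return new_pos

    def search(self, key: int) -> int:
        pos = key % self.prime_size
        while pos != -1 and self.keys[pos] is not None:
            if self.keys[pos] == key:
                return pos
            pos = self.links[pos]
        return -1


table = OverflowChainedTable(prime_size=7, overflow_size=5)
placed = [table.insert(key) for key in (10, 17, 24, 4)]
print(placed, table.search(24), table.search(99))  # [3, 7, 8, 4] 8 -1
```

Because colliding keys never occupy another key's home position in the prime area, a key that hashes to a free position can always be stored there directly.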
    • 11. Separate Chaining
        • In this method, an array of header nodes is used.
        • Each element in the array is a pointer that stores the address of a distinct linked list.
        • Each linked list is a list of records whose keys have the same hash value.
        • When a record has to be retrieved, the hashing function converts the given key to yield a position (subscript) in the array.
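Separate chaining can be sketched as shown below, assuming integer keys, a division hash, and insertion at the head of each list. The names `Node`, `headers`, `insert`, and `find` are illustrative.

```python
class Node:
    """One record in a linked list of keys that share a hash value."""

    def __init__(self, key, value, next_node=None):
        self.key = key
        self.value = value
        self.next = next_node


TABLE_SIZE = 5                 # assumed number of header nodes
headers = [None] * TABLE_SIZE  # each element points to a distinct linked list


def position(key: int) -> int:
    return key % TABLE_SIZE


def insert(key: int, value) -> None:
    pos = position(key)
    headers[pos] = Node(key, value, headers[pos])  # push onto the front of the list


def find(key: int):
    node = headers[position(key)]  # hash the key to get the array subscript
    while node is not None:        # then search that list sequentially
        if node.key == key:
            return node.value
        node = node.next
    return None


insert(7, "record for key 7")
insert(12, "record for key 12")  # 12 % 5 == 7 % 5, so both share one list
print(find(12), "|", find(3))    # record for key 12 | None
```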
    • 12. Bucket Hashing
        • The hashing of a key yields the position of a storage area in which several key entries can be stored. This storage area is called a bucket.
        • The file is divided into a number of such buckets. Each bucket has enough space to store multiple values.
        • When a record has to be retrieved, its key is hashed to give a bucket offset. The bucket is then read into internal memory and searched sequentially.
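Bucket hashing can be sketched as follows. The example makes several assumptions that are not in the slides: an `io.BytesIO` buffer stands in for the file, each entry is a pair of 4-byte unsigned integers (key, value), a key of 0 marks an empty slot, and the bucket is chosen with a division hash. All names and sizes are hypothetical.

```python
import io
import struct

ENTRY = struct.Struct("<II")  # assumed entry layout: key, value (key 0 = empty)
ENTRIES_PER_BUCKET = 4
BUCKET_COUNT = 8
BUCKET_SIZE = ENTRY.size * ENTRIES_PER_BUCKET

storage = io.BytesIO(bytes(BUCKET_SIZE * BUCKET_COUNT))  # zero-filled "file"


def bucket_offset(key: int) -> int:
    return (key % BUCKET_COUNT) * BUCKET_SIZE


def read_bucket(key: int) -> bytes:
    storage.seek(bucket_offset(key))
    return storage.read(BUCKET_SIZE)  # whole bucket is read into memory


def insert(key: int, value: int) -> None:
    bucket = read_bucket(key)
    for slot in range(ENTRIES_PER_BUCKET):
        stored_key, _ = ENTRY.unpack_from(bucket, slot * ENTRY.size)
        if stored_key == 0:  # first empty slot in the bucket
            storage.seek(bucket_offset(key) + slot * ENTRY.size)
            storage.write(ENTRY.pack(key, value))
            return
    raise OverflowError("bucket is full")


def lookup(key: int):
    for stored_key, value in ENTRY.iter_unpack(read_bucket(key)):
        if stored_key == key:  # sequential search inside the bucket
            return value
    return None


for key, value in [(9, 100), (17, 200), (25, 300)]:  # all hash to bucket 1
    insert(key, value)
print(lookup(17), lookup(33))  # 200 None
```

A full bucket raises an error here; a real file would need some overflow policy, which the slides do not describe.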
    • 13. Hash Indexes vs. Hash Tables
        • The choice of hashing method depends on the following factors:
            • Data organization
            • Access speed
            • Disk space requirement
    • 14. An Example to Illustrate the Use of a Hash Table
      The assumptions made in this example are:
        • The key is an alphanumeric field, the first byte of which is a letter.
        • Only one key exists for a particular hash value.
        • The problem of collisions is not being addressed.
        • The file structure assumed is shown below:

          Field        Length   Type
          City         10       String
          Population   2        Integer
    • 15. An Example to Illustrate the Use of a Hash Table (Contd.)
        • The hashing algorithm used is as follows: the first letter of the alphabetic key is extracted, and the position of this letter in the alphabet is used as the hash value. If the first letter in the key is C, then the hash value is 3. Thus, the number of positions (or buckets) that a key might hash to is 26, which is the number of letters in the alphabet.
        • The processing required to create the hash table involves the following steps:
            • Creating file space
    • 16. An Example to Illustrate the Use of a Hash Table (Contd.)
      The steps to create the hash table continue:
        • Accepting data
        • Finding the correct bucket
        • Writing to the hash table
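The first-letter hashing algorithm used in this example, and the bucket offset it implies, can be sketched as below. The 12-byte record length follows from the 10-byte city and 2-byte population fields; the function names and the validation are illustrative additions.

```python
RECORD_LENGTH = 12  # 10-byte city + 2-byte population
BUCKETS = 26        # one bucket per letter of the alphabet


def letter_hash(key: str) -> int:
    """Return the 1-based alphabet position of the key's first letter."""
    first = key[:1].upper()
    if not ("A" <= first <= "Z"):
        raise ValueError("key must start with a letter A-Z")
    return ord(first) - ord("A") + 1


def bucket_offset(key: str) -> int:
    """Byte offset of the key's bucket, with one record per bucket."""
    return (letter_hash(key) - 1) * RECORD_LENGTH


print(letter_hash("C"), bucket_offset("Chennai"))  # 3 24
```

Any two keys that start with the same letter map to the same bucket, which is why the example explicitly sets collisions aside.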
    • 17. Problem Statement 8.D.1
        • Create a hash table for records having the structure given below:

          Field        Size
          City         10
          Population   2

        • The hashing algorithm used is as follows: the first letter of the alphabetic key is extracted, and its position in the alphabet is used as the hash value. If the first letter in the key is C, then the hash value is 3. Thus, the number of positions (or buckets) that a key might hash to is 26, which is the number of letters in the alphabet.
    • 18. Problem Statement 8.D.1 (Contd.)
        • Only one key exists for a particular hash value. In other words, there is only one record per bucket.
        • The problem of collisions is not being addressed.
        • The key is an alphanumeric field, the first byte of which is a letter.
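A possible solution to the problem statement is sketched below, following the steps of creating file space, accepting data, finding the correct bucket, and writing to the hash table. It is not the lesson's reference solution. It assumes a little-endian unsigned 2-byte population, zero-filled empty buckets, ASCII city names of at most 10 bytes, and a temporary file named `city.dat`; the sample records are made up.

```python
import os
import struct
import tempfile

RECORD = struct.Struct("<10sH")  # assumed layout: 10-byte city, 2-byte population
BUCKETS = 26


def letter_hash(key: str) -> int:
    """1-based alphabet position of the first letter, so C gives 3."""
    return ord(key[0].upper()) - ord("A") + 1


def create_table(path: str) -> None:
    """Create file space: 26 empty, zero-filled buckets."""
    with open(path, "wb") as f:
        f.write(b"\x00" * (RECORD.size * BUCKETS))


def store(path: str, city: str, population: int) -> None:
    """Find the correct bucket and write the whole record into it."""
    with open(path, "r+b") as f:
        f.seek((letter_hash(city) - 1) * RECORD.size)
        f.write(RECORD.pack(city.encode("ascii"), population))


def fetch(path: str, city: str):
    """Read the record back by hashing its key; None if the bucket holds no match."""
    with open(path, "rb") as f:
        f.seek((letter_hash(city) - 1) * RECORD.size)
        raw_city, population = RECORD.unpack(f.read(RECORD.size))
    name = raw_city.rstrip(b"\x00").decode("ascii")
    return (name, population) if name == city else None


with tempfile.TemporaryDirectory() as tmp:
    table = os.path.join(tmp, "city.dat")
    create_table(table)
    for city, population in [("Cairo", 120), ("Berlin", 75)]:  # accepted data
        store(table, city, population)
    results = (fetch(table, "Cairo"), fetch(table, "Madrid"), os.path.getsize(table))

print(results)  # (('Cairo', 120), None, 312)
```

Because there is only one record per bucket, storing a second city that starts with the same letter would overwrite the first, which matches the stated assumption that collisions are not addressed.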
    • 19. Summary
      In this lesson, you learned that:
        • Hashing is a technique used to access data stored in files.
        • Using hashing techniques, it is possible to calculate the position of a record in a data file from its key field value.
        • The main purpose of hashing is to eliminate unnecessary searching by using direct access to retrieve a record. This is done by transforming the key to yield the offset of the record.
        • An algorithm called a hashing function is used to perform the key-to-address transformation.
    • 20. Summary (Contd.)
        • It often happens that more than one key hashes to the same hash value, resulting in a collision. As a result, an attempt is made to store two records in one location.
        • Collisions are processed by using various algorithms, three of which are:
            • Rehashing
            • Linked list collision processing
            • Bucket hashing
    • 21. Summary (Contd.)
        • Several hashing strategies can be employed, two of which are:
            • Hash indexing
            • Hash tables