MELJUN CORTES Jedi slides data st-chapter10-hashing
Upcoming SlideShare
Loading in...5
×

Like this? Share it with your network

Share

MELJUN CORTES Jedi slides data st-chapter10-hashing

  • 517 views
Uploaded on

MELJUN CORTES Jedi slides data st-chapter10-hashing

MELJUN CORTES Jedi slides data st-chapter10-hashing

  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Be the first to comment
    Be the first to like this
No Downloads

Views

Total Views
517
On Slideshare
517
From Embeds
0
Number of Embeds
0

Actions

Shares
Downloads
16
Comments
0
Likes
0

Embeds 0

No embeds

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
    No notes for slide

Transcript

  • 1. 10 Hash Table and Hashing Techniques Data Structures – Hashing 1
  • 2. ObjectivesAt the end of the lesson, the student should be able to:● Define hashing and explain how hashing works● Implement simple hashing techniques● Discuss how collisions are avoided/minimized by using collision resolution techniques● Explain the concepts behind dynamic files and hashing Data Structures – Hashing 2
  • 3. Introduction● Hashing – Application of a mathematical function (hash function) to the key values that results in mapping the possible range of key values into a smaller range of relative addresses – No obvious connection between the key and the address generated● Collision – when two or more input keys hash to the same address● a good hash function performs fast computation and produces less (or no) collisions Data Structures – Hashing 3
  • 4. Simple Hash Techniques● Prime Number Division Method h(k) = k mod n, k -integer key value and n - prime number – Division with a prime number will not result to much collision – Example: n = 13 Key Value k Hash Value h(k) Key Value k Hash Value h(k) 125 8 234 0 845 0 431 2 444 2 947 11 256 9 981 6 345 7 792 12 745 4 459 4 902 5 725 10 569 10 652 2 254 7 421 5 382 5 458 3 Data Structures – Hashing 4
  • 5. Simple Hash Techniques● Folding – Key value is split into two or more parts and then added, ANDed, or XORed to get a hash address – If the resulting address has more digits than the highest address in the file, the excess high-order digits are truncated – folding in half, folding in thirds, folding alternate digits – boundary folding, shift folding – Useful for converting keys with large number of digits to a smaller number of digits Data Structures – Hashing 5
  • 6. Collision Resolution Techniques● To avoid collision: – Spread out records - finding a hashing algorithm that distributes the records evenly – Use extra memory - wastes space – Use buckets - put more than one record at a single address● Chaining – m linked lists are maintained, one for each possible address in the hash table Data Structures – Hashing 6
  • 7. Collision Resolution Techniques● Chaining Key Value k Hash Value h(k) Key Value k Hash Value h(k) 125 8 234 0 845 0 431 2 444 2 947 11 256 9 981 6 345 7 792 12 745 4 459 4 902 5 725 10 569 10 652 2 254 7 421 5 382 5 458 3 Data Structures – Hashing 7
  • 8. Collision Resolution Techniques● Chaining KEY LINK KEY LINK 0 Λ 0 845 234 Λ 1 Λ 1 Λ 2 Λ 2 444 431 652 Λ 3 Λ 3 458 Λ 4 Λ 4 745 459 Λ 5 Λ 5 902 382 421 Λ 6 Λ 6 981 Λ 7 Λ 7 345 254 Λ 8 Λ 8 125 Λ 9 Λ 9 256 Λ 10 Λ 10 569 725 Λ 11 Λ 11 947 Λ 12 Λ 12 792 Λ Initial Table After Insertions Data Structures – Hashing 8
  • 9. Collision Resolution Techniques● Use of Buckets – Divides the hash table into m groups of records where each group contains exactly b records, having each address regarded as a bucket KEY1 KEY2 KEY3 0 845 234 1 2 444 431 652 3 458 4 745 459 5 902 382 421 6 981 7 345 254 8 125 9 256 10 569 725 11 947 12 792 Data Structures – Hashing 9
  • 10. Collision Resolution Techniques● Use of Buckets – Wastes space – Not free from further overflowing – Problem when keys are more than the static number of slots in each address Data Structures – Hashing 10
  • 11. Collision Resolution Techniques● Open Addressing (Probing) – When the address produced by a hash function h(k) has no more space for insertion, an empty slot other than h(k) in the hash table is located – Two techniques: ● Linear Probing ● Double Hashing Data Structures – Hashing 11
  • 12. Collision Resolution Techniques● Open Addressing (Probing): Linear Probing – File is scanned sequentially as a circular file – Colliding key is stored in the nearest available space to the address – Slots in a hash table may contain only one key or could be a bucket KEY1 KEY2 Key Value k Hash Value h(k) Key Value k Hash Value h(k) 0 845 234 125 8 0 1 234 444 431 2 845 0 431 2 3 652 458 444 2 947 11 4 745 459 902 382 256 9 981 6 5 981 421 345 6 7 792 12 7 345 254 745 4 459 4 8 125 902 5 725 10 9 256 569 725 569 10 652 2 10 947 11 254 7 421 5 12 792 Data Structures – Hashing 12
  • 13. Collision Resolution Techniques● Open Addressing (Probing): Double Hashing – Makes use of a second hash function, say h2(k), whenever theres a collision 1. Use the primary hash function h1(k) to determine the position i at which to place the value. 2. If there is a collision, use the rehash function rh(i, k)successively until an empty slot is found: rh(i, k) = ( i + h2(k)) mod m, m is the number of addresses Data Structures – Hashing 13
  • 14. Collision Resolution Techniques● Open Addressing (Probing): Double Hashing – Example: h2(k)= k mod 11 Key Value k Hash Value h(k) Key Value k Hash Value h(k) 125 8 234 0 845 0 431 2 444 2 947 11 256 9 981 6 345 7 792 12 745 4 459 4 902 5 725 10 569 10 652 2 254 7 421 5 382 5 458 3 Data Structures – Hashing 14
  • 15. Dynamic Files & Hashing Extendible Hashing● Uses a self-adjusting structure with unlimited bucket size● Trie: build an index based on the binary number representation of the hash value BUCKET ADDRESS A , B 10 C 11 Data Structures – Hashing 15
  • 16. Dynamic Files & Hashing Extendible Hashing● Bucket has structure (DEPTH, DATA) where DEPTH is the number of digits considered in addressing● Hash table is doubled in size every time it is expanded● Buddy buckets – two buckets that have the same depth and same initial bit strings● Collapse buddy buckets when deletion of keys leads to empty buckets Data Structures – Hashing 16
  • 17. Dynamic Files & Hashing Dynamic Hashing● Similar to extensible hashing except for the way they dynamically change the size Data Structures – Hashing 17
  • 18. Summary● Hashing maps the possible range of key values into a smaller range of relative addresses.● Simple hash techniques include the prime number division method and folding.● Chaining makes use of m linked lists, one for each possible address in the hash table.● In the use of buckets method, collision happens when a bucket overflows.● Linear probing and double hashing are two open addressing techniques.● Dynamic hash tables are useful if data to be stored is dynamic in nature. Two dynamic hashing techniques are extendible hashing and dynamic hashing. Data Structures – Hashing 18