Algorithm chapter 7

Algorithm chapter 7 Presentation Transcript

  • Space and Time Tradeoffs (Hashing) 1
  • Space and Time Tradeoffs: Space and time tradeoffs in algorithm design are a well-known issue. Example: when a function must be evaluated at many points, its values can be precomputed and stored in a table. One type of technique is to use extra space to facilitate faster and/or more flexible access to the data. This approach is called prestructuring. We illustrate this approach with hashing. 2
  • Hashing: A dictionary is a set that supports the operations of searching, insertion, and deletion. Each element in the set contains a key and satellite data (the remainder of the record). The keys are unique, but the satellite data are not. A hash table is an effective data structure for implementing dictionaries. Hashing is based on the idea of distributing keys among a one-dimensional array. 3
  • Direct-address Tables: Suppose that an application needs a dynamic set in which each element has a key drawn from the universe U = {0, 1, …, m-1}, where m is not too large. Denote the direct-address table by T[0..m-1], in which each position, or slot, corresponds to a key in the universe U. Operations, each O(1): DIRECT-ADDRESS-SEARCH(T, k): return T[k]. DIRECT-ADDRESS-INSERT(T, x): T[key[x]] ← x. DIRECT-ADDRESS-DELETE(T, x): T[key[x]] ← NIL. 4
  • Hash Tables: A hash table is used when the set K of keys stored in the dictionary is much smaller than the universe U = {0, 1, …, n-1} of all possible keys. An example is the key space of character strings. A hash table requires much less storage than a direct-address table, while the search cost is still O(1) on average. (Figure: an example of a hash table.) Direct addressing vs. hashing: with direct addressing, an element with key k is stored in slot k; with hashing, an element with key k is stored in slot h(k), where h is the hash function. 5
  • Hash Tables: A hash function assigns an integer between 0 and m-1, called the hash address, to a key. An example hash function for integer keys: h(K) = K mod m. Character keys: use ord(K), the position of the key in the alphabet. Character string keys c_0 c_1 … c_{s-1}: $h(K) = \left(\sum_{i=0}^{s-1} \mathrm{ord}(c_i)\right) \bmod m$, or, for a constant C, $h(K) = \left(\mathrm{ord}(c_{s-1})\,C^{s-1} + \mathrm{ord}(c_{s-2})\,C^{s-2} + \dots + \mathrm{ord}(c_0)\,C^{0}\right) \bmod m$. Exercise: let m = 13 and calculate the hash addresses of the following strings: A, FOOL, AND, HIS, MONEY, ARE, SOON, PARTED. 6
  • Hash Function: A hash function needs to satisfy two requirements. It must distribute keys among the cells of the hash table as evenly as possible (for this reason m is usually chosen to be prime), and it has to be easy to compute. 7
  • Collision and Resolution: A collision occurs when two keys hash to the same slot. There are two ways to resolve collisions: open hashing (separate chaining) and closed hashing (open addressing). 8
  • Open Hashing (Separate Chaining): Put all the elements that hash to the same slot in a linked list. Dictionary operations: CHAINED-HASH-SEARCH(T, k): search for an element with key k in list T[h(k)]. CHAINED-HASH-INSERT(T, x), O(1): insert x at the head of list T[h(key[x])]. CHAINED-HASH-DELETE(T, x): search for x in list T[h(key[x])] and delete it. Exercise. 9
  • Cost of Search: The load factor of the hash table is α = n/m, where n is the number of keys and m is the number of slots in the hash table. If α is too small, space is wasted but search is fast; if α is too large, space is saved but search is slow. Worst case: O(n), when all keys hash to the same slot. Average case (separate chaining): a successful search costs O(1 + α/2) and an unsuccessful search costs O(1 + α). If n is about equal to m, both are O(1). 10
  • Closed Hashing (Open Address Hashing): Open address hashing is a strategy for storing all elements directly in the array of the hash table, rather than using linked lists to accommodate collisions. Assumption: m ≥ n. The idea is that if the hash slot for a certain key is occupied by a different element, then a sequence of alternative locations for the current element is defined. For every key k, a probe sequence <h(k, 0), h(k, 1), …, h(k, m-1)> is generated so that when a collision occurs, we successively examine, or probe, the hash table until we find an empty slot in which to put the key. Probing policies: linear probing, quadratic probing, double hashing. 11
  • Linear Probing: Given an ordinary hash function h', called an auxiliary hash function, the method of linear probing uses the hash function h(k, i) = (h'(k) + i) mod m, for i = 0, 1, …, m-1. Search: compare the given key with the key in each probed position until either the key is found or an empty slot is encountered. Deletion: simply emptying a slot would cut the probe sequence of any key stored beyond it, so later searches could miss keys that are present. The solution is lazy deletion: mark previously occupied locations as "obsolete" to distinguish them from locations that have never been occupied. Advantage and disadvantage: linear probing is easy to implement, but as the load factor approaches 1 it suffers from clustering, where long runs of occupied slots build up and increase the average search time. Exercise. 12
  • Quadratic Probing: Given an ordinary hash function h', called an auxiliary hash function, the method of quadratic probing uses the hash function h(k, i) = (h'(k) + c1·i + c2·i²) mod m, where i = 0, 1, …, m-1 and c1 and c2 ≠ 0 are constants. Advantage and disadvantage: it is easy to implement, but it suffers from a milder form of clustering (secondary clustering): if two keys have the same initial probe position, then their probe sequences are the same. 13
  • Double Hashing: Given two auxiliary hash functions h1 and h2, double hashing uses the hash function h(k, i) = (h1(k) + i·h2(k)) mod m, where i = 0, 1, …, m-1. It is one of the best methods available for open addressing. 14
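
The character-string hash functions from the Hash Tables slide can be sketched in Python as follows. This is a minimal illustration, not the presenter's code: the helper names (`ord_alpha`, `hash_sum`, `hash_poly`) and the constant `C = 27` are assumptions, and keys are assumed to be uppercase A to Z with ord(A) = 1.

```python
def ord_alpha(ch):
    """Position of an uppercase letter in the alphabet: A -> 1, ..., Z -> 26."""
    return ord(ch) - ord("A") + 1


def hash_sum(key, m):
    """(sum of ord(c_i) for i = 0..s-1) mod m."""
    return sum(ord_alpha(c) for c in key) % m


def hash_poly(key, m, C=27):
    """(ord(c_{s-1}) C^(s-1) + ... + ord(c_0) C^0) mod m.

    C = 27 is an assumed constant; the slide does not give a value.
    """
    h = 0
    for i, c in enumerate(key):
        # pow(C, i, m) computes C^i mod m without building huge integers.
        h = (h + ord_alpha(c) * pow(C, i, m)) % m
    return h


# The exercise from the slide: hash addresses for m = 13.
WORDS = ["A", "FOOL", "AND", "HIS", "MONEY", "ARE", "SOON", "PARTED"]
ADDRESSES = {w: hash_sum(w, 13) for w in WORDS}
print(ADDRESSES)
```

With the summation hash and m = 13, the addresses come out as A: 1, FOOL: 9, AND: 6, HIS: 10, MONEY: 7, ARE: 11, SOON: 11, PARTED: 12. ARE and SOON share slot 11, which is exactly the kind of collision the later slides resolve.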
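
The separate chaining operations (CHAINED-HASH-SEARCH, CHAINED-HASH-INSERT, CHAINED-HASH-DELETE) might look like this in Python. It is a sketch under stated assumptions: the class name `ChainedHashTable` is hypothetical, keys are non-negative integers hashed with h(K) = K mod m, each chain is a `collections.deque` standing in for a linked list, and `insert` assumes the key is not already present.

```python
from collections import deque


class ChainedHashTable:
    """Open hashing: every slot holds a chain of (key, value) pairs."""

    def __init__(self, m=13):
        self.m = m                                # number of slots
        self.n = 0                                # number of stored keys
        self.slots = [deque() for _ in range(m)]

    def _h(self, key):
        return key % self.m                       # h(K) = K mod m

    def insert(self, key, value):
        # O(1): put the new element at the head of its chain.
        # Assumes key is not already in the table.
        self.slots[self._h(key)].appendleft((key, value))
        self.n += 1

    def search(self, key):
        # Walk the chain for slot h(key); cost grows with the chain length.
        for k, v in self.slots[self._h(key)]:
            if k == key:
                return v
        return None

    def delete(self, key):
        # Search the chain, then unlink the matching element.
        chain = self.slots[self._h(key)]
        for item in chain:
            if item[0] == key:
                chain.remove(item)
                self.n -= 1
                return True
        return False

    def load_factor(self):
        return self.n / self.m                    # alpha = n / m


table = ChainedHashTable(13)
table.insert(5, "a")
table.insert(18, "b")   # 18 mod 13 = 5, so it collides with key 5
table.insert(31, "c")   # 31 mod 13 = 5 as well
print(table.search(18), table.load_factor())
```

All three keys land in slot 5, so searching that chain costs time proportional to its length, while every other slot stays empty. That is the worst case from the Cost of Search slide in miniature.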
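
The three probing policies and lazy deletion can be sketched with one open-addressing table that takes the probe function h(k, i) as a parameter. All names here are hypothetical, keys are assumed to be distinct non-negative integers, h'(k) = h1(k) = k mod m, and the choice h2(k) = 1 + (k mod (m-1)) is an assumption (a common textbook choice that is never 0), not something the slides specify.

```python
EMPTY = object()     # slot that has never been occupied
DELETED = object()   # "obsolete" marker left behind by lazy deletion


def linear_probe(m):
    return lambda k, i: (k % m + i) % m


def quadratic_probe(m, c1=1, c2=1):
    return lambda k, i: (k % m + c1 * i + c2 * i * i) % m


def double_hash_probe(m):
    # Assumed h2; with m prime, every h2 value is relatively prime to m.
    return lambda k, i: (k % m + i * (1 + k % (m - 1))) % m


class OpenAddressTable:
    """Closed hashing: all keys live in the array itself (requires m >= n)."""

    def __init__(self, m=13, probe=None):
        self.m = m
        self.keys = [EMPTY] * m
        self.probe = probe or linear_probe(m)

    def insert(self, k):
        # Follow the probe sequence until a free (or obsolete) slot appears.
        # Assumes k is not already stored in the table.
        for i in range(self.m):
            j = self.probe(k, i)
            if self.keys[j] is EMPTY or self.keys[j] is DELETED:
                self.keys[j] = k
                return j
        raise OverflowError("hash table is full")

    def search(self, k):
        # Stop at a never-occupied slot; skip over obsolete ones.
        for i in range(self.m):
            j = self.probe(k, i)
            if self.keys[j] is EMPTY:
                return None
            if self.keys[j] is not DELETED and self.keys[j] == k:
                return j
        return None

    def delete(self, k):
        j = self.search(k)
        if j is None:
            return False
        self.keys[j] = DELETED   # lazy deletion keeps probe sequences intact
        return True


lin = OpenAddressTable(13)
lin_slots = [lin.insert(k) for k in (5, 18, 31)]   # all three have h'(k) = 5
lin.delete(18)

dbl = OpenAddressTable(13, double_hash_probe(13))
dbl_slots = [dbl.insert(k) for k in (5, 18, 31)]
print(lin_slots, dbl_slots)
```

With linear probing the three colliding keys fill the run of slots 5, 6, 7, and key 31 is still found after 18 is deleted because the obsolete marker does not stop the search. With the assumed h2, double hashing scatters the same keys to slots 5, 12, and 0, which illustrates why it avoids the clustering described on the Linear Probing slide.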