Data structures and algorithms lab11

DATA STRUCTURES AND ALGORITHMS
LAB 11
Bianca Tesila
FILS, May 2014

OBJECTIVES
• Dictionaries
• Hash tables

DICTIONARIES: WHAT ARE THEY?
• An ADT made of a collection of keys and a collection of values, in which each key has an associated value
• A dictionary is also called an associative array
• Useful for searching

DICTIONARIES: OPTIMAL SEARCH
• The keys must be unique
• The range of the keys must be severely bounded
Otherwise… if the keys are not unique:
• construct a set of m lists (m being the keys count) and store the heads of these lists in the associative array, one head per key

DICTIONARIES: DUPLICATE KEYS
• If we have a high number of duplicates (many elements with the same key), the search time will increase severely
• Solution: define a function h that optimizes the search criterion => resolve key collisions
• We will look up T[h(k)] rather than T[k], where T is our associative array, k is a key and h(k) is a mapping function that turns the key into an index
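The T[h(k)] lookup can be illustrated with a minimal sketch. The table size M, the modulo mapping and the helper names store and load are assumptions made for this example only; the slides do not prescribe them, and collisions are deliberately ignored here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical table size; the slides do not fix a value.
const std::size_t M = 8;

// Mapping function h: reduces an arbitrary non-negative key
// to an index in the range 0..M-1.
std::size_t h(unsigned long k) {
    return k % M;
}

// Stores the value at T[h(k)] instead of T[k], so T needs only M slots.
void store(std::vector<int>& T, unsigned long k, int value) {
    assert(T.size() == M);
    T[h(k)] = value;
}

// Reads the slot that key k maps to.
int load(const std::vector<int>& T, unsigned long k) {
    assert(T.size() == M);
    return T[h(k)];
}
```

With this mapping, the keys 1000003 and 11 both land in slot 3, so the second store would overwrite the first. That is exactly the collision problem the hash table slides address.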

DICTIONARIES: IMPLEMENTATION
• Hash tables
• Self-balancing binary search trees
• Radix trees
• Prefix trees
• Judy arrays

DICTIONARIES: BASIC OPERATIONS
• put(key, value)
    • Inserts the pair (key, value) in the hash table
    • If a pair (key, value’) with the same key already exists, then value’ is replaced by value
    • We say that value is associated with key
• get(key)
    • Returns the value associated with key
    • If no value is associated with key, then an error occurs
• hasKey(key)
    • Returns 1 if key exists in the hash table, and 0 otherwise
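To make the contract of the three operations concrete, here is a minimal sketch that backs them with a plain unsorted vector and linear search instead of a hash table. The class name SimpleDictionary, the string/int types and the choice of std::out_of_range as the "error" are assumptions for illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical name; linear search stands in for a real dictionary backend.
class SimpleDictionary {
public:
    // Inserts (key, value); if key already exists, its old value is replaced.
    void put(const std::string& key, int value) {
        for (auto& entry : entries_) {
            if (entry.first == key) {
                entry.second = value;
                return;
            }
        }
        entries_.emplace_back(key, value);
    }

    // Returns the value associated with key; throws if key is absent
    // (one possible reading of "an error occurs").
    int get(const std::string& key) const {
        for (const auto& entry : entries_) {
            if (entry.first == key) return entry.second;
        }
        throw std::out_of_range("key not found: " + key);
    }

    // Returns 1 if key exists and 0 otherwise.
    int hasKey(const std::string& key) const {
        for (const auto& entry : entries_) {
            if (entry.first == key) return 1;
        }
        return 0;
    }

private:
    std::vector<std::pair<std::string, int>> entries_;
};
```

Every operation here costs O(n) in the number of stored pairs, which is the cost a hash table is meant to avoid.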

HASH-TABLES: INTRODUCTION
• A data structure with an optimized lookup function (average search time is constant, O(1)).
• How? By turning the key into a hash (code), using a hash function.
• The hash function must be chosen wisely in order to minimize the number of collisions (risk: different keys produce the same hash).
• We cannot avoid all collisions: they occur inherently, because the hash length is fixed while the stored objects can have arbitrary length and content.
• In the event of a collision, the values are stored in the same position (the same bucket). In this case, the search is reduced to comparing against the actual entries in the bucket.

HASH TABLE: HASH FUNCTIONS
• Deterministic: if called twice with the same input, they should return the same value
• Low collision rate: buckets stay small
• Good dispersion between “buckets”
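As a rough illustration of these properties, the sketch below contrasts two hash functions for string keys. Both functions, their names and the multiplier 31 are assumptions chosen for the example, not functions taken from the lab.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical polynomial hash: deterministic, and every character
// influences the result, which helps spread keys across buckets.
std::size_t stringHash(const std::string& key, std::size_t bucketCount) {
    assert(bucketCount > 0);
    std::size_t h = 0;
    for (unsigned char c : key) {
        h = (h * 31 + c) % bucketCount;
    }
    return h;
}

// A deliberately poor alternative: only the first character matters,
// so all keys sharing a first letter collide in the same bucket.
std::size_t firstCharHash(const std::string& key, std::size_t bucketCount) {
    assert(bucketCount > 0);
    return key.empty() ? 0 : static_cast<unsigned char>(key[0]) % bucketCount;
}
```

With 101 buckets, stringHash sends "ab" and "ba" to different buckets (75 and 4), while firstCharHash sends "ab" and "ac" to the same one.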

HASH TABLE: IMPLEMENTATION WITH LINKED LISTS
• A hash table implementation that resolves collisions with linked lists is called direct chaining
• For each bucket, we use a linked list: every list is associated with a (hash-coded) key
• Inserting into the hash table means finding the correct index (hashed key) and adding the element to the list that corresponds to it
• Deleting means searching for the element and removing it from the list
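The insert and delete steps above can be sketched as follows, assuming integer elements, std::list as the linked list and a simple modulo hash. The names Buckets, bucketIndex, insertElement and removeElement are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// One linked list per bucket.
using Buckets = std::vector<std::list<int>>;

// Finds the bucket index for an element (simple modulo hash, assumed here).
std::size_t bucketIndex(const Buckets& table, int element) {
    assert(!table.empty());
    // Casting to unsigned keeps the index valid for negative elements.
    return static_cast<unsigned int>(element) % table.size();
}

// Insert: locate the bucket, then append the element to its list.
void insertElement(Buckets& table, int element) {
    table[bucketIndex(table, element)].push_back(element);
}

// Delete: locate the bucket, search its list, and remove the first match.
// Returns true if an element was removed.
bool removeElement(Buckets& table, int element) {
    std::list<int>& chain = table[bucketIndex(table, element)];
    for (auto it = chain.begin(); it != chain.end(); ++it) {
        if (*it == element) {
            chain.erase(it);
            return true;
        }
    }
    return false;
}
```

With four buckets, 5 and 9 share bucket 1; removing 5 only walks that one list and leaves 9 in place.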

HASH TABLE: ADVANTAGES AND DISADVANTAGES
• Advantage: the delete operation is simple, and table resizing can be postponed for a long time because performance is still good even when all positions of the hash table are used.
• Disadvantage: for small amounts of data, the overhead is quite large, and “browsing” the data can be time-consuming (the same disadvantage as in linked lists)

HASH TABLE: EXAMPLE
• hmax is the maximum number of linked lists in our hash table
• the function hash will be passed as an argument (actually, a pointer to the function will be passed)
• the key does not have to be a number (think of a real dictionary!): that is why we use templates

HASH TABLE: ASSIGNMENT
Exercise: Using the previous header, implement the hash table data structure and test it with a custom hash function

HASH TABLE: ASSIGNMENT
Hint:
• Maintain an array H[HMAX] of linked lists
• The info field of each element of a list consists of a struct containing a key and a value
• Each key is mapped to a value hkey = hash(key), such that 0 ≤ hkey ≤ HMAX-1
• hash is called the hash function and hkey is the index of a linked list in H
• put(k, v)
    • Searches for the key k in the list H[hkey = hash(k)]
    • If the key is found, then its value is replaced by v
    • If the key is not found, then the pair (k, v) is inserted in H[hkey]
• get(k)
    • Searches for the key k in H[hkey = hash(k)]
    • If the key is found, then its associated value is returned; otherwise, an error occurs
• hasKey(k)
    • Searches for the key k in H[hkey = hash(k)]
    • If the key is found, then it returns 1; otherwise, it returns 0
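One possible way to follow these hints is sketched below. It is not the header from the lab: the class name Hashtable, the member names, the use of std::list and std::vector, the exception thrown by get and the sample hash function sumOfChars are all assumptions.

```cpp
#include <cassert>
#include <list>
#include <stdexcept>
#include <string>
#include <vector>

// Sketch of a chained hash table with template keys and values.
template <typename K, typename V>
class Hashtable {
public:
    // hmax: number of linked lists; hash: pointer to the user's hash function.
    Hashtable(int hmax, int (*hash)(const K&)) : hmax_(hmax), hash_(hash) {
        assert(hmax > 0 && hash != nullptr);
        H_.resize(hmax);
    }

    // Replaces the value if the key is already in its list, else inserts the pair.
    void put(const K& key, const V& value) {
        for (Entry& e : H_[index(key)]) {
            if (e.key == key) { e.value = value; return; }
        }
        H_[index(key)].push_back(Entry{key, value});
    }

    // Returns the associated value; throws if the key is absent.
    V get(const K& key) const {
        for (const Entry& e : H_[index(key)]) {
            if (e.key == key) return e.value;
        }
        throw std::out_of_range("key not found");
    }

    // Returns 1 if the key is stored and 0 otherwise.
    int hasKey(const K& key) const {
        for (const Entry& e : H_[index(key)]) {
            if (e.key == key) return 1;
        }
        return 0;
    }

private:
    // The info field of each list element: a key and its value.
    struct Entry { K key; V value; };

    // Folds any int returned by the hash function into 0..hmax-1.
    int index(const K& key) const {
        int h = hash_(key) % hmax_;
        return h < 0 ? h + hmax_ : h;
    }

    int hmax_;
    int (*hash_)(const K&);
    std::vector<std::list<Entry>> H_;
};

// Hypothetical custom hash function for string keys: sums the character codes.
int sumOfChars(const std::string& key) {
    int sum = 0;
    for (unsigned char c : key) sum += c;
    return sum;
}
```

A table created as Hashtable<std::string, int> table(7, sumOfChars) places "ab" and "ba" in the same list, because their character sums are equal; put, get and hasKey still tell them apart by comparing the stored keys.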
