This document describes the optimal binary search tree (OBST) algorithm. Given a set of keys and their access frequencies (or probabilities), it finds the BST structure with the smallest possible total search cost. The costs of the different subtree configurations are calculated bottom-up by dynamic programming to find the overall minimum-cost tree. As an example, the optimal BST for 4 keys with frequencies 3, 4, 8, 9 has a cost of 43, with the key at index 2 (frequency 8) as its root.
2. What is the problem?
• Determining the cost and the structure of the Optimal Binary Search Tree
(OBST) for a set of 4 keys to find the smallest possible search time for a
given sequence of accesses.
INDEX       0   1   2   3
NODE       43  13  80  74
FREQUENCY   3   4   8   9
3. Algorithm
• Here, the Optimal Binary Search Tree algorithm is presented.
• First, we build a BST from a given set of n distinct keys <k1,
k2, k3, ... kn>.
• We assume that the probability of accessing key ki is pi.
• Dummy keys (d0, d1, d2, ... dn) are added, because some searches may be
performed for values that are not present in the key set K.
• We assume that each dummy key di is accessed with probability qi.
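The slides do not spell out how pi and qi enter the cost. A minimal sketch, assuming the standard textbook formulation (a search that ends at a node on level d costs d comparisons, and a failed search ends at a dummy-key leaf), is given below. The function name expectedSearchCost and the 1-based layout with an unused p[0] are illustrative assumptions, not part of the original material.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Expected search cost of an optimal BST.
// p[1..n]: probability of searching for key k_i (p[0] is unused padding).
// q[0..n]: probability of a failed search ending at dummy key d_i.
double expectedSearchCost(const std::vector<double>& p,
                          const std::vector<double>& q) {
    assert(p.size() == q.size() && q.size() >= 2);
    const int n = static_cast<int>(q.size()) - 1;
    // e[i][j]: optimal expected cost for keys i..j (e[i][i-1] = empty range).
    // w[i][j]: total probability mass of keys i..j and dummies i-1..j.
    std::vector<std::vector<double>> e(n + 2, std::vector<double>(n + 1, 0.0));
    std::vector<std::vector<double>> w(n + 2, std::vector<double>(n + 1, 0.0));
    for (int i = 1; i <= n + 1; ++i) {
        e[i][i - 1] = q[i - 1];  // an empty range holds only dummy d_{i-1}
        w[i][i - 1] = q[i - 1];
    }
    for (int len = 1; len <= n; ++len) {
        for (int i = 1; i + len - 1 <= n; ++i) {
            const int j = i + len - 1;
            w[i][j] = w[i][j - 1] + p[j] + q[j];
            e[i][j] = std::numeric_limits<double>::infinity();
            // Try every key r in i..j as the root of this subtree.
            for (int r = i; r <= j; ++r) {
                const double t = e[i][r - 1] + e[r + 1][j] + w[i][j];
                if (t < e[i][j]) e[i][j] = t;
            }
        }
    }
    return e[1][n];
}
```

With the illustrative probabilities p = {0, 0.15, 0.10, 0.05, 0.10, 0.20} and q = {0.05, 0.10, 0.05, 0.05, 0.05, 0.10}, which together sum to 1, the function returns 2.75 up to floating-point rounding. The rest of these slides use the simpler frequency-only model, which corresponds to dropping the dummy keys.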
4. Algorithm
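#include <climits>   // INT_MAX
#include <iostream>
using namespace std;

// Sum of freq[i..j] (inclusive).
int sum(int freq[], int i, int j);

// Minimum cost of a BST built from keys i..j (plain recursion).
int optCost(int freq[], int i, int j)
{
    if (j < i) // no elements in this subarray
        return 0;
    if (j == i) // one element in this subarray
        return freq[i];
    // Each key in i..j is counted once at this level,
    // so the sum of the frequencies is added to the best split.
    int fsum = sum(freq, i, j);
    int minCost = INT_MAX;
    // Try each key r as the root and keep the cheapest split.
    for (int r = i; r <= j; ++r)
    {
        int cost = optCost(freq, i, r - 1) +
                   optCost(freq, r + 1, j);
        if (cost < minCost)
            minCost = cost;
    }
    return minCost + fsum;
}

// keys[] is assumed to be sorted; only the frequencies affect the cost.
int optimalSearchTree(int keys[],
                      int freq[], int n) {
    (void)keys; // unused, kept for the interface
    return optCost(freq, 0, n - 1);
}

int sum(int freq[], int i, int j) {
    int s = 0;
    for (int k = i; k <= j; k++)
        s += freq[k];
    return s;
}

int main() {
    int keys[] = {10, 12, 20};
    int freq[] = {34, 8, 50};
    int n = sizeof(keys) / sizeof(keys[0]);
    // Prints: Cost of Optimal BST is 142
    cout << "Cost of Optimal BST is "
         << optimalSearchTree(keys, freq, n);
    return 0;
}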
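The recursive version recomputes the same subproblems many times. A minimal bottom-up sketch of the same recurrence, assuming the frequencies are listed in sorted key order, could look like the following. The name obstCostTable and the prefix-sum array are illustrative choices, not part of the original slides.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <vector>

// Returns the table cost[i][j] = minimum cost of a BST on freq[i..j].
// Entries with i > j are left at 0 and are not meaningful.
std::vector<std::vector<int>> obstCostTable(const std::vector<int>& freq) {
    const int n = static_cast<int>(freq.size());
    assert(n > 0);
    // prefix[k] = freq[0] + ... + freq[k-1], so a range sum is one subtraction.
    std::vector<int> prefix(n + 1, 0);
    for (int k = 0; k < n; ++k) prefix[k + 1] = prefix[k] + freq[k];

    std::vector<std::vector<int>> cost(n, std::vector<int>(n, 0));
    // Fill the table by increasing subtree size len (the "l" of the slides).
    for (int len = 1; len <= n; ++len) {
        for (int i = 0; i + len - 1 < n; ++i) {
            const int j = i + len - 1;
            int best = INT_MAX;
            for (int r = i; r <= j; ++r) {
                const int left = (r > i) ? cost[i][r - 1] : 0;   // empty left subtree
                const int right = (r < j) ? cost[r + 1][j] : 0;  // empty right subtree
                best = std::min(best, left + right);
            }
            cost[i][j] = best + (prefix[j + 1] - prefix[i]);
        }
    }
    return cost;
}
```

For the frequencies {3, 4, 8, 9}, the entry [0][3] of the returned table is 43, and for {34, 8, 50} the entry [0][2] is 142, matching the recursive code. The table costs O(n^3) time and O(n^2) memory, instead of the exponential time of the plain recursion.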
5. Solving the Problem
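• First, for l = 1 (subtrees of a single key),
the cost is the frequency of the key itself:
Cost[0,0] = 3
Cost[1,1] = 4
Cost[2,2] = 8
Cost[3,3] = 9
• Now for l = 2 (subtrees of two keys),
calculating cost = sum of frequencies + cheapest choice of root
(each entry inside min is Cost[i,r-1] + Cost[r+1,j] for one root r):
Cost[0,1] = (3 + 4) + min(4, 3) = 10
Cost[1,2] = (4 + 8) + min(8, 4) = 16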
6. Solving the Problem
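• For l = 2 (continued)
calculating cost = sum of frequencies + cheapest choice of root
(each entry inside min is Cost[i,r-1] + Cost[r+1,j] for one root r):
Cost[2,3] = (8 + 9) + min(9, 8) = 25
• Now for l = 3
calculating cost
Cost[0,2] = (3 + 4 + 8) + min(16, 11, 10) = 25
Cost[1,3] = (4 + 8 + 9) + min(25, 13, 16) = 34
• For l = 4
calculating cost
Cost[0,3] = (3 + 4 + 8 + 9) + min(34, 28, 19, 25) = 43 (minimum total cost, reached with node 2 as root)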
7. Calculating Minimum Cost
From the matrix, Cost[0,3] = 43 is the minimum cost
for all four keys, obtained with node (2) as the root.
Now take the node at index 2 as the root
node, with all the other nodes as its descendants:
node 1 is the left child with node 0 below it, and
node 3 is the right child. This forms the optimal binary
search tree.
Checking the cost as per the formed OBST (frequency x level).
Total cost = 8x1 + 4x2 + 9x2 + 3x3
= 8 + 8 + 18 + 9
= 43
Hence verified.
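The check above can also be done mechanically. The following is a minimal sketch, assuming the frequency-only cost model of these slides: it records the best root for every range, walks that root table to assign a level to each key, and leaves the frequency x level sum to the caller. The names ObstResult, Range and buildObst are hypothetical helpers introduced only for this illustration.

```cpp
#include <cassert>
#include <climits>
#include <vector>

struct ObstResult {
    int cost;                // minimum total cost from the DP table
    int root;                // index of the root of the whole tree
    std::vector<int> level;  // level[k] of key k, with the root on level 1
};

struct Range { int i, j, level; };  // a pending subtree on keys i..j

ObstResult buildObst(const std::vector<int>& freq) {
    const int n = static_cast<int>(freq.size());
    assert(n > 0);
    std::vector<std::vector<int>> cost(n, std::vector<int>(n, 0));
    std::vector<std::vector<int>> root(n, std::vector<int>(n, 0));
    for (int len = 1; len <= n; ++len) {
        for (int i = 0; i + len - 1 < n; ++i) {
            const int j = i + len - 1;
            int fsum = 0;
            for (int k = i; k <= j; ++k) fsum += freq[k];
            cost[i][j] = INT_MAX;
            for (int r = i; r <= j; ++r) {
                const int left = (r > i) ? cost[i][r - 1] : 0;
                const int right = (r < j) ? cost[r + 1][j] : 0;
                if (left + right + fsum < cost[i][j]) {
                    cost[i][j] = left + right + fsum;
                    root[i][j] = r;  // remember which root gave the minimum
                }
            }
        }
    }
    ObstResult res{cost[0][n - 1], root[0][n - 1], std::vector<int>(n, 0)};
    // Walk the root table with an explicit stack to assign levels.
    std::vector<Range> pending;
    pending.push_back(Range{0, n - 1, 1});
    while (!pending.empty()) {
        const Range cur = pending.back();
        pending.pop_back();
        if (cur.i > cur.j) continue;  // empty subtree
        const int r = root[cur.i][cur.j];
        res.level[r] = cur.level;
        pending.push_back(Range{cur.i, r - 1, cur.level + 1});
        pending.push_back(Range{r + 1, cur.j, cur.level + 1});
    }
    return res;
}
```

For the frequencies {3, 4, 8, 9}, this gives root index 2 and levels {3, 2, 1, 2}, so the weighted sum 3x3 + 4x2 + 8x1 + 9x2 equals the table value 43.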