Least Common Ancestors in Constant Time
Motivation: GO Analysis on the Dendrogram
GO Analysis on Dendrogram
• For each GO term
  • Pull out the skeleton subtree for this term
  • Test for significance only at the nodes of this skeleton tree
• How does one pull out the skeleton subtree in time proportional to the size of that subtree?
LCA Queries
• Preprocess the tree in linear time, so that…
• Given two nodes, their least common ancestor can be returned in O(1) time
LCA Queries on a Line Graph
• Rank nodes in order from top to bottom
• Take the min of the two ranks: the node with the smaller rank is the LCA
Linearizing a Tree
• Label each node with its distance from the root
• Euler Tour to linearize nodes in an array (size 2n − 1, so at most 2n)
• LCA of two nodes: find the node with the least label in the range of the array between their occurrences
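The linearization above can be sketched as follows. This is a minimal illustration, not the presenter's code: the `children` dictionary format, the function names, and the five-node example tree are all assumptions made for the sketch, and the range minimum is found by a plain linear scan rather than in O(1).

```python
_DONE = object()  # sentinel: "this node has no more children to visit"

def euler_tour(children, root):
    """Return (tour, depths, first) for a rooted tree.

    children: dict mapping node -> list of children (assumed input format).
    tour[k] is the node visited at step k, depths[k] its distance from the
    root, and first[v] the index of the first occurrence of v in the tour.
    """
    tour, depths, first = [root], [0], {root: 0}
    stack = [(root, 0, iter(children.get(root, ())))]
    while stack:
        node, depth, pending = stack[-1]
        child = next(pending, _DONE)
        if child is _DONE:
            stack.pop()
            if stack:  # step back up to the parent and record it again
                parent, parent_depth, _ = stack[-1]
                tour.append(parent)
                depths.append(parent_depth)
        else:
            first[child] = len(tour)
            tour.append(child)
            depths.append(depth + 1)
            stack.append((child, depth + 1, iter(children.get(child, ()))))
    return tour, depths, first

def lca_by_scan(tour, depths, first, u, v):
    """LCA = node with the least depth label between the first visits of u and v."""
    i, j = sorted((first[u], first[v]))
    k = min(range(i, j + 1), key=depths.__getitem__)  # linear scan, not O(1)
    return tour[k]

# Hypothetical example tree: a has children b and c; b has children d and e.
children = {"a": ["b", "c"], "b": ["d", "e"]}
tour, depths, first = euler_tour(children, "a")
print("".join(tour))  # abdbebaca
print(depths)         # [0, 1, 2, 1, 2, 1, 0, 1, 0]
print(lca_by_scan(tour, depths, first, "d", "e"))  # b
```

Note that adjacent entries of `depths` always differ by exactly 1, which is the property the later slides rely on.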
The Range Minimum Problem
• Given an array of size 2n, preprocess it in linear time so that…
• Given a range, the min item in that range can be returned in O(1) time
Divide and Conquer
• Split into blocks at various levels of granularity
  – For each block, compute all prefix and suffix mins
  – Total space/preprocessing time O(n log n)
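The O(n log n) bound can be checked with a short count. The slide does not say which block sizes are used; the derivation below assumes they are the powers of two 1, 2, 4, ..., and writes N = 2n for the array length (N ≥ 2).

```latex
\begin{align*}
\text{number of levels} &= \lceil \log_2 N \rceil
  \qquad (\text{block size } 2^h,\ h = 0, \dots, \lceil \log_2 N \rceil - 1) \\
\text{entries per level} &= \underbrace{N}_{\text{one prefix min per position}}
  + \underbrace{N}_{\text{one suffix min per position}} = 2N \\
\text{total entries} &= 2N \lceil \log_2 N \rceil
  = 4n \lceil \log_2 (2n) \rceil = O(n \log n)
\end{align*}
```

Each entry is filled in constant time by a single left-to-right or right-to-left sweep over its block, so the preprocessing time matches the space.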
Query Handling
• Given the query range
  – Determine the granularity level at which this range straddles adjacent blocks
    • Find the first (most significant) bit at which the two endpoint indices differ, in O(1) time
  – Look up the appropriate prefix/suffix mins in each block
    • Look up precomputed tables in O(1) time
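A minimal sketch of this scheme, assuming power-of-two block sizes and tables that store indices of minima rather than values (both are choices made for the illustration; the function names are hypothetical). The level is picked from the highest bit in which the two endpoints differ: at that level the endpoints fall in two adjacent blocks, so a suffix min of the left block and a prefix min of the right block cover the range exactly.

```python
def build_levels(a):
    """Prefix/suffix argmin tables for block sizes 1, 2, 4, ... (O(n log n) space)."""
    n = len(a)
    levels = max(1, (n - 1).bit_length())  # any two valid indices differ below this bit
    pre, suf = [], []
    for h in range(levels):
        size = 1 << h
        p = list(range(n))  # p[i]: index of the min of a[block_start .. i]
        s = list(range(n))  # s[i]: index of the min of a[i .. block_end]
        for start in range(0, n, size):
            end = min(start + size, n)  # the last block may be shorter
            for i in range(start + 1, end):
                if a[p[i - 1]] <= a[i]:
                    p[i] = p[i - 1]
            for i in range(end - 2, start - 1, -1):
                if a[s[i + 1]] < a[i]:
                    s[i] = s[i + 1]
        pre.append(p)
        suf.append(s)
    return pre, suf

def range_argmin(a, pre, suf, i, j):
    """Index of the leftmost minimum of a[i..j] (inclusive), using two table lookups."""
    if i > j:
        i, j = j, i
    if i == j:
        return i
    h = (i ^ j).bit_length() - 1          # highest bit where i and j differ
    left, right = suf[h][i], pre[h][j]    # i..end of its block, start of next block..j
    return left if a[left] <= a[right] else right

a = [0, 1, 2, 1, 2, 1, 0, 1, 0]  # depth labels of a small Euler tour
pre, suf = build_levels(a)
print(range_argmin(a, pre, suf, 2, 4))  # 3
```

In Python, `int.bit_length` stands in for the constant-time "first differing bit" operation; in a lower-level language a count-leading-zeros instruction would play the same role.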
Reducing Preprocessing Time
• Consider blocks of size Δ = (log n)/3 (logs base 2)
• Two blocks are said to be equivalent if all within-block queries return the same min-index for the two blocks
• How many equivalence classes are there?
  – Recall the Euler tour: adjacent numbers differ by +1 or -1 (e.g. +1 -1 +1 -1 -1 -1 -1 +1 -1 +1), so a block's class is determined by its sign pattern and there are at most 2^Δ = n^(1/3) classes
  – For each distinct class and each of the possible Δ^2 = (log^2 n)/9 queries, precompute the answer and store it
  – This takes time O(n^(1/3) log^2 n), which is O(n)
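The equivalence-class idea can be sketched as below. The function names and the bitmask encoding are assumptions made for this illustration: a block is summarized by one bit per adjacent difference, and a single answer table is built per bit pattern and shared by every block with that pattern.

```python
def block_signature(block):
    """Encode a +/-1 block as an integer: bit k is 1 when block[k+1] - block[k] == +1."""
    sig = 0
    for k in range(len(block) - 1):
        step = block[k + 1] - block[k]
        assert step in (1, -1), "adjacent Euler-tour depths must differ by exactly 1"
        if step == 1:
            sig |= 1 << k
    return sig

def in_block_table(sig, size):
    """table[i][j] (for i <= j) = offset of the leftmost minimum of positions i..j.

    The table is valid for every block of this size with signature `sig`;
    entries with j < i are unused and left at 0.
    """
    # Rebuild a representative block starting at 0; only its shape matters.
    rep = [0]
    for k in range(size - 1):
        rep.append(rep[-1] + (1 if (sig >> k) & 1 else -1))
    table = [[0] * size for _ in range(size)]
    for i in range(size):
        best = i
        for j in range(i, size):
            if rep[j] < rep[best]:
                best = j
            table[i][j] = best
    return table

def build_all_tables(size):
    """One table per sign pattern: 2^(size-1) tables of size^2 entries each."""
    return [in_block_table(sig, size) for sig in range(1 << (size - 1))]

tables = build_all_tables(4)
# Two blocks with different values but the same +1/-1 shape share one table.
x, y = [3, 4, 3, 2], [7, 8, 7, 6]
print(block_signature(x) == block_signature(y))  # True
print(tables[block_signature(x)][0][3])          # 3
```

A block of size Δ has only Δ − 1 differences, so this sketch builds 2^(Δ−1) tables, which is within the 2^Δ bound on the slide.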
Overall Preprocessing
• Compute the O(n^(1/3) log^2 n) data structure for blocks of size Δ = (log n)/3
  – Within-block queries can be answered using this
• Create a new array of size 2n/Δ = 2n/((log n)/3) = 6n/log n by replacing each block by just its minimum item
• Preprocess this array as before, but now in O(n) time and space, because the size of this array is just O(n/log n)
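The arithmetic behind the final bound can be written out as follows. The symbol m for the length of the reduced array is introduced here for convenience, and logs are taken base 2 (so log 6 < 3).

```latex
\begin{align*}
m &= \frac{2n}{\Delta} = \frac{2n}{(\log n)/3} = \frac{6n}{\log n} \\
m \log m &\le \frac{6n}{\log n} \cdot \log(6n)
  \le \frac{6n}{\log n}\,(\log n + 3)
  = 6n + \frac{18n}{\log n} = O(n) \\
\text{total preprocessing} &=
  \underbrace{O(n^{1/3} \log^2 n)}_{\text{in-block tables}}
  + \underbrace{O(n)}_{\text{block signatures and block minima}}
  + \underbrace{O(m \log m)}_{\text{reduced array}}
  = O(n)
\end{align*}
```

So the O(n log n) structure from the divide-and-conquer step costs only linear time and space once it is applied to the array of block minima rather than to the full Euler tour.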