Time complexity of union find

Dec. 09, 2015
Wei Li
Zehao Cai
Ishan Sharma
Wei/Zehao/Ishan, CSCI 6212/Arora/Fall 2015

Algorithm Definition
A disjoint-set data structure keeps track of a set of elements
partitioned into a number of disjoint (non-overlapping) subsets.
The union-find algorithm supports three operations on a set of elements:
• MAKE-SET(x). Create a new set containing only element x.
• FIND(x). Return a canonical element in the set containing x.
• UNION(x, y). Merge the sets containing x and y.
Implementation: linked list or tree (the tree form is the usual choice).

Before Union: b → h → c, so Find(b) = c; d → f, so Find(d) = f.
After Union: b → h → c → f, so Find(b) = f.
Quick-find & Quick-union
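The three operations and the small example above can be sketched as a plain quick-union tree with no optimizations. This is only an illustrative sketch: the class name `DisjointSet` and the dictionary-based parent map are assumptions, not something the slides prescribe.

```python
class DisjointSet:
    """Naive tree-based disjoint set (quick-union, no optimizations)."""

    def __init__(self):
        # Maps each element to its parent; a root is its own parent.
        self.parent = {}

    def make_set(self, x):
        if x not in self.parent:
            self.parent[x] = x

    def find(self, x):
        # Follow parent links until reaching a root.
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, x, y):
        root_x, root_y = self.find(x), self.find(y)
        if root_x != root_y:
            self.parent[root_x] = root_y  # attach x's tree under y's root


ds = DisjointSet()
for element in "bhcdf":
    ds.make_set(element)
ds.union("b", "h")  # b -> h
ds.union("h", "c")  # b -> h -> c
ds.union("d", "f")  # d -> f
print(ds.find("b"), ds.find("d"))  # c f
ds.union("b", "d")  # root c is attached under root f
print(ds.find("b"))  # f
```

Because nothing keeps the trees shallow here, a long chain of unions can make a single Find walk through every element.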

Definition: The rank of a node x is similar to the height of x.
When performing the operation Union(x, y), where x and y are the roots
of the two trees, we compare rank(x) and rank(y):
• If rank(x) < rank(y), make y the parent of x.
• If rank(x) > rank(y), make x the parent of y.
• If rank(x) = rank(y), make y the parent of x and increase the rank of
y by one.
First Optimization: Union By Rank Heuristic
Note. With union by rank alone (no path compression), rank = height.
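A minimal sketch of the three rank rules, assuming module-level dictionaries `parent` and `rank` (both names are illustrative) and applying the comparison to the roots of the two trees:

```python
parent = {}
rank = {}


def make_set(x):
    parent[x] = x
    rank[x] = 0


def find(x):
    while parent[x] != x:
        x = parent[x]
    return x


def union(x, y):
    root_x, root_y = find(x), find(y)
    if root_x == root_y:
        return
    if rank[root_x] < rank[root_y]:
        parent[root_x] = root_y
    elif rank[root_x] > rank[root_y]:
        parent[root_y] = root_x
    else:
        # Equal ranks: y's root becomes the parent and its rank grows by one.
        parent[root_x] = root_y
        rank[root_y] += 1


for v in range(8):
    make_set(v)
for a, b in [(0, 1), (2, 3), (0, 2), (4, 5), (6, 7), (4, 6), (0, 4)]:
    union(a, b)

root = find(0)
print(root, rank[root])  # 7 3
```

Eight elements merged pairwise end with a root of rank 3, which is the largest rank that eight nodes can produce.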

During the execution of Find(e), e and all intermediate vertices on
the path from e to the root x are made children of x.
Second Optimization: Path Compression
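A two-pass sketch of Find with path compression. The chain e → d → c → b → x is a made-up example, and passing the parent map as an argument is just one possible design:

```python
def find(parent, v):
    # First pass: walk up to locate the root.
    root = v
    while parent[root] != root:
        root = parent[root]
    # Second pass: make every node on the path a child of the root.
    while parent[v] != root:
        next_v = parent[v]
        parent[v] = root
        v = next_v
    return root


# Hypothetical chain e -> d -> c -> b -> x, with x as the root.
parent = {"e": "d", "d": "c", "c": "b", "b": "x", "x": "x"}
print(find(parent, "e"))  # x
print(parent)  # every node now points directly at x
```

After one call, any later Find on e, d, c or b reaches the root in a single step.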

Union by Rank & Path Compression
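Path compression can make a tree shorter without changing any rank, so
rank becomes only an upper bound on height.
This is why we call it “union by rank” rather than “union by height”.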
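The gap between rank and height can be seen on a four-node example. The sketch below combines both heuristics; the class name `UnionFind` and the helper `height` are illustrative assumptions.

```python
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:  # path compression
            next_x = self.parent[x]
            self.parent[x] = root
            x = next_x
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        if self.rank[rx] > self.rank[ry]:
            rx, ry = ry, rx  # make rx the root with the smaller rank
        self.parent[rx] = ry
        if self.rank[rx] == self.rank[ry]:
            self.rank[ry] += 1


def height(uf):
    """Longest parent chain in the forest, measured without compressing."""
    best = 0
    for v in range(len(uf.parent)):
        depth = 0
        while uf.parent[v] != v:
            v = uf.parent[v]
            depth += 1
        best = max(best, depth)
    return best


uf = UnionFind(4)
uf.union(0, 1)
uf.union(2, 3)
uf.union(0, 2)
root = uf.find(3)
print(height(uf), uf.rank[root])  # 2 2
uf.find(0)  # compresses the path 0 -> 1 -> 3
print(height(uf), uf.rank[root])  # 1 2
```

The second Find shortens the tree to height 1 while the root keeps rank 2.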

Algorithm                               Worst-case time
Quick-find                              $mn$
Quick-union                             $mn$
QU + union by rank                      $n + m \log n$
QU + path compression                   $n + m \log n$
QU + union by rank + path compression   $n + m \log^* n$
m union-find operations on a set of n objects.
Time Complexity

Lemma 1: As Find follows the path from a node to the root, the ranks
of the nodes it encounters strictly increase.
Union: the tree whose root has the smaller rank is attached to the root
with the greater rank, never the other way around; when the ranks are
equal, the rank of the new root is increased by one.
Find: all nodes visited along the path are attached to the root,
which has a larger rank than any of its children.

Lemma 2: A node u that is the root of a subtree with
rank r has at least $2^r$ nodes.
Proof: Initially, when each node is the root of its own tree (rank 0,
one node), it's trivially true. Assume that a root u with rank r has at
least $2^r$ nodes. When two trees of rank r are merged by Union by Rank
into a tree of rank r + 1, the new root has at least
$2^r + 2^r = 2^{r+1}$ nodes. Merging trees of unequal rank leaves the
root's rank unchanged and only adds nodes.

Lemma 3: There are at most $n/2^r$ nodes of rank r.
Proof: From Lemma 2, we know that a node u that is the root of a subtree
with rank r has at least $2^r$ nodes. Subtrees rooted at different nodes of
rank r share no nodes, so we get the maximum number of nodes of rank r when
each of them is the root of a tree that has exactly $2^r$ nodes. In this
case, the number of nodes of rank r is $n/2^r$.
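Lemmas 2 and 3 can be checked empirically. The sketch below runs random unions with union by rank only; the function name `run_unions`, the seed and n = 1000 are arbitrary choices, and a passing run is a sanity check rather than a proof.

```python
import random
from collections import Counter


def run_unions(n, seed=0):
    rng = random.Random(seed)
    parent = list(range(n))
    rank = [0] * n
    size = [1] * n  # nodes in the tree rooted here (meaningful for roots)

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for _ in range(2 * n):
        rx, ry = find(rng.randrange(n)), find(rng.randrange(n))
        if rx == ry:
            continue
        if rank[rx] > rank[ry]:
            rx, ry = ry, rx  # rx is now the root with the smaller rank
        parent[rx] = ry
        size[ry] += size[rx]
        if rank[rx] == rank[ry]:
            rank[ry] += 1
        # Lemma 2: a root of rank r has at least 2**r nodes.
        assert size[ry] >= 2 ** rank[ry]
    return rank


n = 1000
rank = run_unions(n)
for r, count in sorted(Counter(rank).items()):
    # Lemma 3: at most n / 2**r nodes have rank r.
    assert count <= n / 2 ** r
    print(f"rank {r}: {count} nodes (bound {n / 2 ** r:.1f})")
```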

We define “bucket” here: a bucket is a set that contains the vertices
whose ranks lie in a particular range.
Proof

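$\log^* n$
Definition: For every non-negative integer n, $\log^* n$ is defined as (with log taken base 2)
$$\log^* n := \begin{cases} 0 & \text{if } n \le 1 \\ 1 + \log^*(\log n) & \text{if } n > 1 \end{cases}$$
We have $\log^* n \le 5$ unless n exceeds the number of atoms in the universe.
$\log^* 2^1 = 1 + \log^* 2^0 = 1$
$\log^* 4 = \log^* 2^2 = 1 + \log^* 2^1 = 2$
$\log^* 16 = \log^* 2^{2^2} = 1 + \log^* 2^2 = 3$
$\log^* 65536 = \log^* 2^{2^{2^2}} = 1 + \log^* 2^{2^2} = 4$
$\log^* 2^{65536} = \log^* 2^{2^{2^{2^2}}} = 1 + \log^* 2^{2^{2^2}} = 5$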
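The recursive definition translates directly into a loop. A minimal sketch, assuming base-2 logarithms as in the worked values above (the function name `log_star` is illustrative):

```python
import math


def log_star(n):
    """Iterated logarithm: how many times log2 is applied until the value is <= 1."""
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count


examples = [("1", 1), ("2", 2), ("4", 4), ("16", 16),
            ("65536", 65536), ("2**65536", 2 ** 65536)]
for label, n in examples:
    print(label, log_star(n))  # 0, 1, 2, 3, 4, 5 respectively
```

The labels are printed instead of the numbers themselves because 2**65536 has close to twenty thousand decimal digits.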

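We can make two observations about the buckets.
The total number of buckets is O($\log^* n$).
Proof: When we go from one bucket to the next, we add one more two
to the power; that is, the bucket after $[B, 2^B - 1]$ is
$[2^B, 2^{2^B} - 1]$. Ranks never exceed $\log n$ (Lemma 2), so about
$\log^* n$ buckets cover every rank.
The number of elements in bucket $[B, 2^B - 1]$ is at most $2n/2^B$
(and trivially at most n).
Proof: By Lemma 3, the number of elements in bucket $[B, 2^B - 1]$ is at
most
$$\frac{n}{2^B} + \frac{n}{2^{B+1}} + \frac{n}{2^{B+2}} + \cdots + \frac{n}{2^{2^B - 1}} \le \frac{n}{2^B}\left(1 + \frac{1}{2} + \frac{1}{4} + \cdots\right) = \frac{2n}{2^B}$$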
Proof
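To make the bucket ranges concrete, the sketch below lists the buckets needed to cover ranks 0 through 16 and sums the Lemma 3 bound $n/2^r$ over each one. The helper name `rank_buckets` and the choice n = 65536 are assumptions made only for illustration.

```python
def rank_buckets(max_rank):
    """Rank intervals [B, 2**B - 1] needed to cover ranks 0..max_rank."""
    buckets = []
    low = 0
    while low <= max_rank:
        high = 2 ** low - 1
        buckets.append((low, high))
        low = high + 1  # the next bucket starts at 2**low
    return buckets


n = 65536
max_rank = n.bit_length() - 1  # floor(log2 n) = 16
print(rank_buckets(max_rank))
# [(0, 0), (1, 1), (2, 3), (4, 15), (16, 65535)]

for low, high in rank_buckets(max_rank):
    # Sum n / 2**r over the ranks of this bucket that can actually occur.
    nodes_bound = sum(n / 2 ** r for r in range(low, min(high, max_rank) + 1))
    assert nodes_bound <= 2 * n / 2 ** low
    print((low, high), nodes_bound)
```

Each bucket's lower end is 2 raised to the previous lower end, which is why only a handful of buckets are ever needed.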

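Let F represent the list of "find" operations performed, and let
T1 = the number of links traversed that point to a root,
T2 = the number of links traversed between nodes in different buckets,
T3 = the number of links traversed between nodes in the same bucket,
each summed over all finds in F.
Then the total cost of m finds is T = T1 + T2 + T3.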
Proof

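T1 = constant cost (1) per find, over m operations: O(m)
T2 = each find crosses at most as many bucket boundaries as there are
buckets: O($m \log^* n$)
T3 = for all buckets (for all nodes in one bucket): each time a node in
bucket $[B, 2^B - 1]$ is charged, path compression gives it a parent of
strictly larger rank, so it is charged at most $2^B - 1 - B$ times before
its parent lies in a higher bucket. Summing the Lemma 3 bound $n/2^r$ over
the ranks in the bucket gives at most $2n/2^B$ nodes per bucket, so
$$T_3 \le \sum_{\text{buckets } [B,\, 2^B - 1]} \left(2^B - 1 - B\right) \frac{2n}{2^B} \le \sum_{\text{buckets}} 2^B \cdot \frac{2n}{2^B} = O(n \log^* n)$$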
Proof
T = T1 + T2 + T3 = O(m) + O($m \log^* n$) + O($n \log^* n$)
$m \ge n$ → T = O($m \log^* n$)


Algorithm & Time Complexity
• Simple data structure, algorithm easy to implement.
• Complex to prove time complexity. (Proved in 1975 by Robert Endre
Tarjan.)
• Time complexity is near linear.
Applications
• Keep track of the connected components of an undirected
graph;
• Find the minimum spanning tree of a graph (as in Kruskal's algorithm).
Conclusions
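As an illustration of the minimum spanning tree application, here is a short sketch of Kruskal's algorithm built on union by rank with path compression. The function name `kruskal`, the `(weight, u, v)` edge format and the sample graph are assumptions for the example only.

```python
def kruskal(num_vertices, edges):
    """edges: iterable of (weight, u, v). Returns (total_weight, chosen_edges)."""
    parent = list(range(num_vertices))
    rank = [0] * num_vertices

    def find(x):
        root = x
        while parent[root] != root:
            root = parent[root]
        while parent[x] != root:  # path compression
            next_x = parent[x]
            parent[x] = root
            x = next_x
        return root

    def union(x, y):
        rx, ry = find(x), find(y)
        if rx == ry:
            return False  # already connected; this edge would close a cycle
        if rank[rx] > rank[ry]:
            rx, ry = ry, rx
        parent[rx] = ry
        if rank[rx] == rank[ry]:
            rank[ry] += 1
        return True

    total, chosen = 0, []
    for weight, u, v in sorted(edges):  # cheapest edges first
        if union(u, v):
            total += weight
            chosen.append((u, v, weight))
    return total, chosen


edges = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (7, 1, 3), (3, 2, 3)]
print(kruskal(4, edges))  # (6, [(0, 1, 1), (1, 2, 2), (2, 3, 3)])
```

The same `find`/`union` pair also answers connectivity queries: two vertices are in the same connected component exactly when their roots are equal.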

