The document discusses algorithms for order statistics: selecting the ith-ranked item from an unsorted collection. It describes an algorithm that chooses its pivot by a median-of-medians approach, which guarantees a good split and yields worst-case linear running time. The algorithm divides the input into groups of 5 elements, finds the median of each group, and uses the median of those medians as the pivot. It then partitions around this pivot and recurses into only one of the resulting subsets. The document analyzes the running time of this algorithm and proves it is O(n).
3. 3
Your To-Do List
• Read [CLRS] 9.
• Assignment 3.
school.edhole.com
4. 4
What are Order Statistics?
Selecting ith-ranked item from a collection.
– First: i = 1
– Last: i = n
– Median(s): i = $\lfloor n/2 \rfloor$, $\lceil n/2 \rceil$
5. 5
Order Statistics Overview
Assume collection is unordered, otherwise trivial.
Can sort first in O(n lg n), but can do better: Θ(n).
6. 6
Order Statistics Overview
Algorithms for i=1 & i=n are easy.
What are they?
Scan data, keeping track of smallest/largest element seen so far.
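The single scan described here can be sketched in a few lines of Python. The function name `smallest_and_largest` is hypothetical and chosen only for illustration; the sketch assumes a non-empty input.

```python
def smallest_and_largest(items):
    """Return (smallest, largest) of a non-empty iterable in one pass."""
    it = iter(items)
    smallest = largest = next(it)  # assumes at least one element
    for x in it:
        if x < smallest:
            smallest = x      # new smallest seen so far
        elif x > largest:
            largest = x       # new largest seen so far
    return smallest, largest
```

Each element is examined once, so the scan takes Θ(n) time for either i = 1 or i = n.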
7. 7
Order Statistics Overview
How can we modify Quicksort to obtain expected-case Θ(n)?
Pivot, partition, but recur only on one set of data. No join.
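A minimal sketch of this idea, assuming a uniformly random pivot (the slide does not say how the pivot is chosen) and a 1-indexed rank i. The name `quickselect` and the handling of duplicate keys are illustrative assumptions, not taken from the slides.

```python
import random

def quickselect(items, i):
    """Return the i-th smallest element (1-indexed) of items."""
    if not 1 <= i <= len(items):
        raise IndexError("rank out of range")
    data = list(items)
    while True:
        pivot = random.choice(data)                 # assumed: random pivot
        lesser = [x for x in data if x < pivot]
        n_equal = sum(1 for x in data if x == pivot)
        if i <= len(lesser):
            data = lesser                           # answer is among the lesser elements
        elif i <= len(lesser) + n_equal:
            return pivot                            # pivot itself has rank i
        else:
            i -= len(lesser) + n_equal              # re-rank within the greater elements
            data = [x for x in data if x > pivot]
```

Only one side of the partition is kept after each step, which is what brings the expected cost down from O(n lg n) to Θ(n). A bad pivot sequence can still cost quadratic time.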
8. 8
Order Statistics
We’ll use this idea.
But, by guaranteeing a good split, can get worst-case Θ(n).
Warning: Non-obvious & unintuitive
algorithm ahead!
Blum, Floyd, Pratt, Rivest, Tarjan (1973)
9. 9
Order Statistics: Algorithm
Select(A,n,i):                                                      T(n)
  Divide input into groups of size 5.                               O(n)
  /* Partition on median-of-medians */
  medians = array of each group’s median.                           O(n)
  pivot = Select(medians, $\lceil n/5 \rceil$, $\lceil n/10 \rceil$)    T($\lceil n/5 \rceil$)
  L,G = partition(A, pivot)                                         O(n)
  /* Find ith element in L, pivot, or G */
  k = # of lesser elements + 1                                      O(1)
  If i=k, return pivot                                              O(1)
  If i<k, return Select(L, k-1, i)                                  T(k-1)
  If i>k, return Select(G, n-k, i-k)                                T(n-k)
The pivot selection: all this to find a good split.
The two recursive calls at the end: only one done.
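The Select procedure can be sketched in Python as follows. This is an illustrative version under stated assumptions, not the slides' own code: it takes a list and a 1-indexed rank, sorts each group of at most 5 to find its median, uses a three-way partition so that duplicate keys are handled (the slides implicitly assume distinct elements), and sorts directly once 5 or fewer elements remain.

```python
def select(a, i):
    """Return the i-th smallest (1-indexed) element of list a."""
    if not 1 <= i <= len(a):
        raise IndexError("rank out of range")
    if len(a) <= 5:
        return sorted(a)[i - 1]                     # assumed base case: sort tiny inputs
    # Divide input into groups of size 5 (the last group may be smaller).
    groups = [a[j:j + 5] for j in range(0, len(a), 5)]
    # Median of each group; sorting at most 5 items is O(1) per group.
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]
    # Pivot is the median of medians, rank ceil(len(medians) / 2).
    pivot = select(medians, (len(medians) + 1) // 2)
    lesser = [x for x in a if x < pivot]
    greater = [x for x in a if x > pivot]
    n_equal = len(a) - len(lesser) - len(greater)   # copies of the pivot value
    if i <= len(lesser):
        return select(lesser, i)                    # recur on L only
    if i <= len(lesser) + n_equal:
        return pivot
    return select(greater, i - len(lesser) - n_equal)  # recur on G only
```

For example, `select(data, (len(data) + 1) // 2)` returns the lower median of a non-empty list `data`.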
10. 10
Order Statistics: Analysis
$T(n) = T(\lceil n/5 \rceil) + T(\max(k-1,\ n-k)) + O(n)$
where k-1 = #less and n-k = #greater.
How to simplify?
11. 11
Order Statistics: Analysis
[Figure: one group of 5 elements, marked as lesser elements, median, and greater elements.]
12. 12
Order Statistics: Analysis
[Figure: all groups of 5 elements (and at most one smaller group), arranged by their medians: lesser medians, median of medians, greater medians.]
13. 13
Order Statistics: Analysis
[Figure: boxes marking the definitely lesser elements and the definitely greater elements.]
14. 14
Order Statistics: Analysis 1
Must recur on all elements outside one of these boxes.
How many?
15. 15
Order Statistics: Analysis 1
$\lfloor \lceil n/5 \rceil / 2 \rfloor$ full groups of 5
$\lceil \lceil n/5 \rceil / 2 \rceil$ partial groups of 2
At most
$5 \lfloor \lceil n/5 \rceil / 2 \rfloor + 2 \lceil \lceil n/5 \rceil / 2 \rceil \le \frac{7n}{10} + 7$
Count elements outside smaller box.
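As a sanity check, the count on this slide can be evaluated numerically. The sketch below assumes the count is 5·⌊⌈n/5⌉/2⌋ + 2·⌈⌈n/5⌉/2⌉ and compares it against the bound 7n/10 + 7 as read off this slide; the function name `outside_smaller_box` is hypothetical.

```python
def outside_smaller_box(n):
    """Upper count of elements outside the smaller box, per Analysis 1."""
    groups = -(-n // 5)            # ceil(n / 5) groups in total
    full = groups // 2             # floor(groups / 2) full groups of 5
    partial = groups - full        # ceil(groups / 2) partial groups of 2
    return 5 * full + 2 * partial

# Compare with 7n/10 + 7 using integer arithmetic (both sides times 10).
assert all(10 * outside_smaller_box(n) <= 7 * n + 70 for n in range(1, 10_001))
```

The check is only evidence over a finite range, not a proof. The algebraic argument is that ⌈n/5⌉ ≤ n/5 + 1 and each inner floor or ceiling adds at most a constant.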
16. 16
Order Statistics: Analysis 2
Equivalently, must recur on all elements not inside one of these boxes.
How many?
17. 17
Order Statistics: Analysis 2
Count elements in ≥1 smaller box & pivot.
$\lceil \lceil n/5 \rceil / 2 \rceil - 1$ groups of 3
At most
$n - \left( 3 \left( \lceil \lceil n/5 \rceil / 2 \rceil - 1 \right) + 1 \right) \le \frac{7n}{10} + 2$
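The step from the count to the bound 7n/10 + 2 can be spelled out. The following is a short worked derivation, assuming only that a ceiling is at least its argument:

```latex
\begin{align*}
\left\lceil \tfrac{1}{2} \left\lceil \tfrac{n}{5} \right\rceil \right\rceil
  &\ge \tfrac{n}{10} \\
3\left( \left\lceil \tfrac{1}{2} \left\lceil \tfrac{n}{5} \right\rceil \right\rceil - 1 \right) + 1
  &\ge \tfrac{3n}{10} - 2 \\
n - \left( 3\left( \left\lceil \tfrac{1}{2} \left\lceil \tfrac{n}{5} \right\rceil \right\rceil - 1 \right) + 1 \right)
  &\le n - \tfrac{3n}{10} + 2 = \tfrac{7n}{10} + 2
\end{align*}
```

The second line counts the elements known to lie in the smaller box together with the pivot, so subtracting it from n bounds the size of the recursive call.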
18. 18
Order Statistics: Analysis
$T(n) = T(\lceil n/5 \rceil) + T\left(\frac{7n}{10} + 2\right) + O(n)$
A very unusual recurrence. How to solve?
19. 19
Order Statistics: Analysis
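Substitution: Prove $\exists c, n_0 > 0$ such that $T(n) \le c \cdot n$, $\forall n \ge n_0$. (Here kn bounds the O(n) term.)
$T(n) \le c \lceil n/5 \rceil + c \left( \frac{7n}{10} + 2 \right) + kn$
$\le c \left( \frac{n}{5} + 1 \right) + c \left( \frac{7n}{10} + 2 \right) + kn$   (overestimate ceiling)
$= \frac{9}{10} cn + 3c + kn$   (algebra)
$\le c \cdot n$   when $0 \le \frac{1}{10} cn - 3c - kn$
For any k, choose c > 10k; then this holds $\forall n \ge n_0$ with $n_0 = 30c/(c - 10k)$.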
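The linear bound can also be observed numerically. The sketch below evaluates the recurrence T(n) = T(⌈n/5⌉) + T(7n/10 + 2) + kn under assumptions that are not in the slides: k = 1, the second argument is rounded down to an integer, and T(n) = kn for n below a cutoff of 50. With these choices, c = 20k works: the base cases satisfy the bound directly, and for n ≥ 60 the inductive step gives T(n) ≤ 19kn + 60k ≤ 20kn.

```python
from functools import lru_cache

K = 1        # assumed constant hidden in the O(n) term
BASE = 50    # assumed cutoff: below it, T(n) = K * n

@lru_cache(maxsize=None)
def T(n):
    """Evaluate the selection recurrence with the assumed base case."""
    if n < BASE:
        return K * n
    return T(-(-n // 5)) + T(7 * n // 10 + 2) + K * n

# Print the ratio T(n) / n for a few sizes; it should stay bounded.
for n in (100, 1_000, 10_000, 100_000):
    print(n, T(n) / n)
```

The printed ratios stay below 20 under these assumptions, which matches the choice c = 20k, n0 = 60 in the substitution argument.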
20. 20
Order Statistics
Why groups of 5?
Sum of the two recurrence size fractions must be < 1 (here 1/5 + 7/10 = 9/10).
Grouping by 5 is smallest size that works.
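The effect of the group size can be tabulated. The sketch below assumes an odd group size g and ignores additive constants: the medians form a 1/g fraction of the input, and at least (g+1)/2 elements from half of the groups are discarded, so at most 1 − (g+1)/(4g) of the input remains. The function name is hypothetical.

```python
from fractions import Fraction

def recurrence_fraction_sum(g):
    """Sum of the two recursive size fractions for odd group size g."""
    medians = Fraction(1, g)              # size of the median-of-medians call
    discarded = Fraction(g + 1, 4 * g)    # fraction guaranteed to be discarded
    return medians + (1 - discarded)

for g in (3, 5, 7):
    print(g, recurrence_fraction_sum(g))
# prints:
# 3 1
# 5 9/10
# 7 6/7
```

For g = 3 the sum is exactly 1, so the substitution argument no longer yields a linear bound; g = 5 is the first odd size for which the sum drops below 1.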