0% found this document useful (0 votes)
71 views25 pages

CS4311 Design and Analysis of Algorithms: Lecture 8: Order Statistics

This document discusses algorithms for finding order statistics like the maximum, minimum, and kth smallest element in an unsorted array. It presents methods for finding the maximum element using either n-1 or log(n) comparisons. It also describes how to find both the maximum and minimum in 3n/2 comparisons by partitioning pairs. The document then introduces a linear-time selection algorithm that recursively partitions the array around a "median of medians" pivot to ensure subproblems are no more than 3/4 the size.

Uploaded by

Dino
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
71 views25 pages

CS4311 Design and Analysis of Algorithms: Lecture 8: Order Statistics

This document discusses algorithms for finding order statistics like the maximum, minimum, and kth smallest element in an unsorted array. It presents methods for finding the maximum element using either n-1 or log(n) comparisons. It also describes how to find both the maximum and minimum in 3n/2 comparisons by partitioning pairs. The document then introduces a linear-time selection algorithm that recursively partitions the array around a "median of medians" pivot to ensure subproblems are no more than 3/4 the size.

Uploaded by

Dino
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 25

CS4311

Design and Analysis of


Algorithms
Lecture 8: Order Statistics

1
About this lecture
•Finding max, min in an unsorted array
(upper bound and lower bound)

•Selecting the kth smallest element in an


unsorted array

kth smallest element  kth order statistics

2
Finding Maximum (Method I)
• Let S denote the input set of n items
• To find the maximum of S, we can:
Step 1: Set max = item 1
Step 2: for k = 2, 3, …, n
if (item k is larger than max)
Update max = item k;
Step 3: return max;
# comparisons = n –1
3
Finding Maximum (Method II)
• Define a function Find-Max as follows:
Find-Max(R, k) /* R is a set with k items */
1. if (k 2) return maximum of R;
2. Partition items of R into bk/2c pairs;
3. Delete smaller item from R in each pair;
4. return Find-Max(R, k - bk/2c);
Calling Find-Max(S,n) gives the maximum of S

4
Finding Maximum (Method II)
Let T(n) = # comparisons for Find-Max with
problem size n

So, T(n) = T(n -bn/2c) + bn/2c for n ¸ 3


T(2) = 1
Solving the recurrence (by substitution),
we get T(n) = n - 1

5
Lower Bound
Question: Can we find the maximum using
fewer than n –1 comparisons?
Answer: No ! To ensure that an item x is
not the maximum, there must be at least
one comparison in which x is the smaller
of the compared items
So, we need to ensure n-1 items not max
 at least n –1 comparisons are needed

6
Finding Both Max and Min
Can we find both max and min quickly?
Solution 1:
First, find max with n –1 comparisons
Then, find min with n –1 comparisons
 Total = 2n –2 comparisons

Is there a better solution ??

7
Finding Both Max and Min
Solution 2: (if n is even)
First, partition items into n/2 pairs;

Next, compare items within each pair;


= larger = smaller
8
Finding Both Max and Min
Then, max = Find-Max in larger items
min = Find-Min in smaller items

Find-Max Find-Min
# comparisons = 3n/2 –2
9
Finding Both Max and Min
Solution 2: (if n is odd)
We find max and min of first n - 1 items;
if (last item is larger than max)
Update max = last item;
if (last item is smaller than min )
Update min = last item;

# comparisons = 3(n-1)/2
10
Finding Both Max and Min
Conclusion:
To find both max and min:
if n is odd: 3(n-1)/2 comparisons
if n is even: 3n/2 –2 comparisons

Combining: at most 3bn/2c comparisons

 better than finding max and min separately

11
Lower Bound
Textbook Ex 9.1-2 (Very challenging):
• Show that we need at least
d3n/2e–2 comparisons
to find both max and min in worst-case

Hint: Consider how many numbers may be


max or min (or both). Investigate how a
comparison affects these counts
12
Selection in Linear Time
• In next slides, we describe a recursive call
Select(S,k)
which supports finding the kth smallest
element in S
• Recursion is used for two purposes:
(1) selecting a good pivot (as in Quicksort)
(2) solving a smaller sub-problem

13
Select(S, k)
/* First,find a good pivot */
1. Partition S into d|S|/5e groups, each
group has five items (one group may
have fewer items);
2. Sort each group separately;
3. Collect median of each group into S’;
4. Find median m of S’
:
m = Select(S’
,dd|S|/5e/2e);

14
4. Let q = # items of S smaller than m;
5. If (k == q + 1)
return m;
/* Partition with pivot */
6. Else partition S into X and Y
X = {items smaller than m}
Y = {items larger than m}
/* Next,form a sub-problem */
7. If (k q + 1)
return Select(X, k)
8. Else
return Select(Y, k– (q+1));
15
Selection in Linear Time
Questions:

1. Why is the previous algorithm correct?


(Prove by Induction)

2. What is its running time?

16
Running Time
• In our selection algorithm, we chose m,
which is the median of medians, to be a
pivot and partition S into two sets X and Y
• In fact, if we choose any other item as the
pivot, the algorithm is still correct
• Why don’ t we just pick an arbitrary pivot
so that we can save some time ??

17
Running Time
• A closer look reviews that the worst-case
running time depends on |X| and |Y|
• Precisely, if T(|S|) denote the worst-case
running time of the algorithm on S, then

T(|S|) = T(d|S|/5e) + (|S|)

+ max {T(|X|),T(|Y|) }

18
Running Time
• Later, we show that if we choose m, the
“median of medians”, as the pivot,

both |X| and |Y| will be at most 3|S|/4

• Consequently,
T(n) = T(dn/5e) + (n) + T(3n/4)

 T(n) = (n) (obtained by substitution)

19
Median of Medians
• Let’
s begin with dn/5e sorted groups, each
has 5 items (one group may have fewer)

= larger = median = smaller

20
Median of Medians
• Then, we obtain the median of medians, m

Groups with median Groups with median


smaller than m =m larger than m 21
Median of Medians
Then, we know that all items marked with
X have value at most m
X X
X
X X X X
X X
X X
X

Groups with median


smaller than m =m X=“
value m”
22
Median of Medians
The number of items with value at most m
is at least
3(ddn/5e/2e–1) - 2

one group may have


each full group has min # of only 1 ‘
crossed’
item
3‘crossed’ items groups

 number of items: at least 3n/10 –5


23
Median of Medians
Previous page implies that at most
7n/10 + 5 items
are greater than m

 For large enough n (say, n 100)


7n/10 + 5  3n/4
 |Y| is at most 3n/4 for large enough n

24
Median of Medians
Similarly, we can show that at most
7n/10 + 5 items are smaller than m
 |X| is at most 3n/4 for large enough n

Conclusion:
The “ median of medians”helps us control
the worst-case size of the sub-problem
 without it, the algorithm runs in (n2)
time in the worst-case
25

You might also like