This document discusses the course Design and Analysis of Algorithms. It covers key topics like algorithms, analyzing algorithm complexity, sorting algorithms, and asymptotic analysis. The goal of analysis of algorithms is to compare how running time increases with input size. Running time is expressed as a function of input size n. Asymptotic analysis compares these functions to determine algorithms' rates of growth for large n. Common notations like O, Ω, Θ are used to classify algorithms by their asymptotic running time.


Design and Analysis of Algorithms

• KCS-503
• Internal Marks: 50
• External Marks: 100

1
Unit-1
• Introduction
• Algorithms
• Analyzing Algorithms
• Complexity of Algorithms
• Growth of Functions
• Performance Measurements
• Sorting And Order Statistics
• Shell Sort, Quick Sort, Merge Sort, Heap Sort
• Comparison of sorting algorithms
• Sorting in Linear Time
2
Algorithm
• An algorithm is a set of rules to be followed when solving a specific
problem. It may be defined as a finite sequence of instructions
with the following five basic properties.
1. Input: A number of quantities are provided to the algorithm
before it begins. These quantities are the inputs processed by the
algorithm.
2. Definiteness: The processing rules specified in the
algorithm must be precise and unambiguous.
3. Effectiveness: Each instruction must be basic enough that
it can be carried out in a finite amount of time.

3
Algorithm…..
4. Finiteness: The total time to carry out all the
steps in the algorithm must be finite.
5. Output: An algorithm must produce one or more
outputs.

[Diagram: the five properties, input, definiteness, effectiveness,
finiteness, and output, surrounding the algorithm]

4
Pseudo code Convention
We can describe an algorithm in many ways, for example in a natural language
such as English, but then we must make sure that the resulting instructions
are definite.
We may use a graphical representation, but flowcharts work well only if the
algorithm is small and simple.
So there is a need for a convention that is language- and machine-independent,
and that is the pseudocode convention.
An algorithm written in pseudocode is easy to understand.

Ex. Sum(A, N)                Cost    No. of times
1. s = 0                     c1      1
2. for i = 1 to N            c2      N+1
3.     do s = s + A[i]       c3      N
4. return s                  c4      1

Total time = c1·1 + c2·(N+1) + c3·N + c4·1
           = (c2 + c3)·N + (c1 + c2 + c4) = aN + b
5
Analysis of Algorithms
• An algorithm is a finite set of precise instructions
for performing a computation or for solving a
problem.
• What is the goal of analysis of algorithms?
– To compare algorithms mainly in terms of running
time but also in terms of other factors (e.g., memory
requirements, programmer's effort etc.)
• What do we mean by running time analysis?
– Determine how running time increases as the size
of the problem increases.

6
Types of Analysis
• Worst case
– Provides an upper bound on running time
– An absolute guarantee that the algorithm would not run longer,
no matter what the inputs are
• Best case
– Provides a lower bound on running time
– Input is the one for which the algorithm runs the fastest

Lower Bound ≤ Running Time ≤ Upper Bound


• Average case
– Provides a prediction about the running time
– Assumes that the input is random
7
How do we compare algorithms?
• We need to define a number of objective
measures.

(1) Compare execution times?


Not good: times are specific to a particular
computer !!

(2) Count the number of statements executed?


Not good: the number of statements varies with
the programming language as well as with the
style of the individual programmer.
8
Ideal Solution

• Express running time as a function of the
input size n (i.e., f(n)).
• Compare different functions corresponding
to running times.
• Such an analysis is independent of
machine speed, programming style, etc.

9
Example
• Associate a "cost" with each statement.
• Find the "total cost“ by finding the total number of times
each statement is executed.
Algorithm 1                     Cost
arr[0] = 0;                     c1
arr[1] = 0;                     c1
arr[2] = 0;                     c1
...
arr[N-1] = 0;                   c1
-----------
c1 + c1 + ... + c1 = c1 × N

Algorithm 2                     Cost
for (i = 0; i < N; i++)         c2
    arr[i] = 0;                 c1
-------------
(N+1) × c2 + N × c1 = (c2 + c1) × N + c2

10
Asymptotic Analysis
• To compare two algorithms with running
times f(n) and g(n), we need a rough
measure that characterizes how fast
each function grows.
• Compare functions in the limit, that is,
asymptotically!
(i.e., for large values of n)

11
Rate of Growth
• Consider the example of buying elephants and
goldfish:
Cost: cost_of_elephants + cost_of_goldfish
Cost ~ cost_of_elephants (approximation)
• The low-order terms in a function are relatively
insignificant for large n:
n⁴ + 100n² + 10n + 50 ~ n⁴

i.e., we say that n⁴ + 100n² + 10n + 50 and n⁴
have the same rate of growth
12
Asymptotic Notation

• O notation: asymptotic “less than”:

– f(n) = O(g(n)) implies: f(n) “≤” g(n)

• Ω notation: asymptotic “greater than”:

– f(n) = Ω(g(n)) implies: f(n) “≥” g(n)

• Θ notation: asymptotic “equality”:

– f(n) = Θ(g(n)) implies: f(n) “=” g(n)

13
Back to Our Example
Algorithm 1: c1 + c1 + ... + c1 = c1 × N
Algorithm 2: (N+1) × c2 + N × c1 = (c2 + c1) × N + c2

• Both algorithms are of the same order: O(N)

14
Asymptotic notations
• O-notation: O(g(n)) = { f(n) : there exist positive constants c and n₀
such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n₀ }
15
Big-O Visualization

O(g(n)) is the set of functions with smaller
or same order of growth as g(n)

16
Examples

– n² = O(n²):
  n² ≤ cn² ⇒ c ≥ 1 ⇒ holds for c = 1 and n₀ = 1

– 1000n² + 1000n = O(n²):
  1000n² + 1000n ≤ 1000n² + n² = 1001n² (for n ≥ 1000) ⇒ c = 1001 and n₀ = 1000

– n = O(n²):
  n ≤ cn² ⇒ holds for c = 1 and n₀ = 1

17
No Uniqueness
• There is no unique set of values for n0 and c in proving the
asymptotic bounds

• Prove that 100n + 5 = O(n²)

– 100n + 5 ≤ 100n + n = 101n ≤ 101n² for all n ≥ 5
  ⇒ n₀ = 5 and c = 101 is a solution

– 100n + 5 ≤ 100n + 5n = 105n ≤ 105n² for all n ≥ 1
  ⇒ n₀ = 1 and c = 105 is also a solution

We must find SOME constants c and n₀ that satisfy the asymptotic notation relation
18
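Such constant-hunting arguments are easy to sanity-check numerically. Below is a small Python helper (my own, not part of the slides) that tests a candidate witness pair (c, n₀) for f(n) = O(g(n)) over a finite range; it is a spot-check, not a proof:

```python
# Numerically check that f(n) <= c*g(n) for every n in [n0, n_max].
# Passing this check supports, but does not prove, f(n) = O(g(n)).
def witnesses_big_o(c, n0, f, g, n_max=10_000):
    return all(f(n) <= c * g(n) for n in range(n0, n_max + 1))

f = lambda n: 100 * n + 5   # the function from the slide
g = lambda n: n * n         # the bounding function n^2
```

Both witness pairs from the slide pass (c = 101 with n₀ = 5, and c = 105 with n₀ = 1), while an obviously bad pair such as c = 1, n₀ = 1 fails.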
Asymptotic notations (cont.)
• Ω-notation: Ω(g(n)) is the set of functions
with larger or same order of
growth as g(n)

19
Examples
– 5n2 = (n)
 c, n0 such that: 0  cn  5n2  cn  5n2  c = 1 and n0 = 1

– 100n + 5 ≠ (n2)
 c, n0 such that: 0  cn2  100n + 5
100n + 5  100n + 5n ( n  1) = 105n
cn2  105n  n(cn – 105)  0
Since n is positive  cn – 105  0  n  105/c
 contradiction: n cannot be smaller than a constant
– n = (2n), n3 = (n2), n = (logn)
20
Asymptotic notations (cont.)
• Θ-notation: Θ(g(n)) is the set of functions
with the same order of growth
as g(n)

21
Examples
– n2/2 –n/2 = (n2)

• ½ n2 - ½ n ≤ ½ n2 n ≥ 0  c2= ½

• ½ n2 - ½ n ≥ ½ n2 - ½ n * ½ n ( n ≥ 2 ) = ¼ n2

 c1= ¼

– n ≠ (n2): c1 n2 ≤ n ≤ c2 n2

 only holds for: n ≤ 1/c1

22
Examples
– 6n3 ≠ (n2): c1 n2 ≤ 6n3 ≤ c2 n2

 only holds for: n ≤ c2 /6

– n ≠ (logn): c1 logn ≤ n ≤ c2 logn

 c2 ≥ n/logn,  n≥ n0 – impossible

23
Common orders of magnitude

O(1) < O(lg n) < O(n) < O(n lg n) < O(n²) < O(n³) < O(2ⁿ) < O(n!)
24
Logarithms and properties
• In algorithm analysis we often use the notation “log n”
without specifying the base

Binary logarithm: lg n = log₂ n
Natural logarithm: ln n = logₑ n
lgᵏ n = (lg n)ᵏ
lg lg n = lg(lg n)

log xʸ = y log x
log(xy) = log x + log y
log(x/y) = log x – log y
a^(log_b x) = x^(log_b a)
log_b x = log_a x / log_a b
25
Properties
• Theorem:
  f(n) = Θ(g(n)) ⟺ f(n) = O(g(n)) and f(n) = Ω(g(n))
• Transitivity:
  – f(n) = Θ(g(n)) and g(n) = Θ(h(n)) ⇒ f(n) = Θ(h(n))
  – Same for O and Ω
• Reflexivity:
  – f(n) = Θ(f(n))
  – Same for O and Ω
• Symmetry:
  – f(n) = Θ(g(n)) if and only if g(n) = Θ(f(n))
• Transpose symmetry:
  – f(n) = O(g(n)) if and only if g(n) = Ω(f(n))
26
Common Summations
• Arithmetic series: Σ_{k=1}^{n} k = 1 + 2 + ... + n = n(n + 1)/2

• Geometric series: Σ_{k=0}^{n} xᵏ = 1 + x + x² + ... + xⁿ = (x^{n+1} – 1)/(x – 1)   (x ≠ 1)
  – Special case, |x| < 1: Σ_{k=0}^{∞} xᵏ = 1/(1 – x)

• Harmonic series: Σ_{k=1}^{n} 1/k = 1 + 1/2 + ... + 1/n ≈ ln n

• Other important formulas:
  Σ_{k=1}^{n} lg k ≈ n lg n
  Σ_{k=1}^{n} kᵖ = 1ᵖ + 2ᵖ + ... + nᵖ ≈ n^{p+1}/(p + 1)
27
Recurrences and Running Time
• An equation or inequality that describes a function in
terms of its value on smaller inputs.
T(n) = T(n-1) + n
• Recurrences arise when an algorithm contains recursive
calls to itself

• What is the actual running time of the algorithm?


• Need to solve the recurrence
– Find an explicit formula of the expression
– Bound the recurrence by an expression that involves n

28
Example Recurrences
• T(n) = T(n-1) + n → Θ(n²)
  – Recursive algorithm that loops through the input to
    eliminate one item
• T(n) = T(n/2) + c → Θ(lg n)
  – Recursive algorithm that halves the input in one step
• T(n) = T(n/2) + n → Θ(n)
  – Recursive algorithm that halves the input but must
    examine every item in the input
• T(n) = 2T(n/2) + 1 → Θ(n)
  – Recursive algorithm that splits the input into 2 halves
    and does a constant amount of other work
29
Methods for Solving Recurrences

• Iteration method

• Substitution method

• Recursion tree method

• Master method

30
The Iteration Method
• Convert the recurrence into a summation and try
to bound it using known series
– Iterate the recurrence until the initial condition is
reached.
– Use back-substitution to express the recurrence in
terms of n and the initial (boundary) condition.

31
The Iteration Method
T(n) = c + T(n/2)
     = c + c + T(n/4)         (since T(n/2) = c + T(n/4))
     = c + c + c + T(n/8)     (since T(n/4) = c + T(n/8))
     ...
Assume n = 2ᵏ:
T(n) = c + c + … + c + T(1)   (k times)
     = c·lg n + T(1)
     = Θ(lg n)

32
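The unrolling above can be checked by evaluating the recurrence directly. A small Python sketch (the function name is mine), assuming n is a power of two, confirms the closed form T(n) = c·lg n + T(1):

```python
# Evaluate T(n) = c + T(n/2) with T(1) = t1, for n a power of two.
# The iteration method gives T(n) = c*lg(n) + t1.
def t(n, c=3, t1=1):
    return t1 if n == 1 else c + t(n // 2, c, t1)
```

For n = 8 and c = 3, t1 = 1 this yields 3·lg 8 + 1 = 10, matching the derivation.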
The substitution method

1. Guess a solution

2. Use induction to prove that the


solution works

33
Substitution method
• Guess a solution
– T(n) = O(g(n))

– Induction goal: apply the definition of the asymptotic notation

• T(n) ≤ d g(n), for some d > 0 and n ≥ n0

– Induction hypothesis: T(k) ≤ d g(k) for all k < n (strong induction)

• Prove the induction goal


– Use the induction hypothesis to find some values of the
constants d and n0 for which the induction goal holds

34
Example1
T(n) = c + T(n/2)
• Guess: T(n) = O(lgn)
– Induction goal: T(n) ≤ d lgn, for some d and n ≥ n0
– Induction hypothesis: T(n/2) ≤ d lg(n/2)

• Proof of induction goal:


T(n) = T(n/2) + c ≤ d lg(n/2) + c
     = d lg n – d + c ≤ d lg n
if –d + c ≤ 0, i.e., d ≥ c

35
The recursion-tree method

Convert the recurrence into a tree:


– Each node represents the cost incurred at one level
of the recursion

– Sum up the costs of all levels

36
Example 1
W(n) = 2W(n/2) + n2

• Subproblem size at level i is: n/2ⁱ

• Subproblem size hits 1 when 1 = n/2ⁱ ⇒ i = lg n
• Cost of a subproblem at level i = (n/2ⁱ)²;  number of nodes at level i = 2ⁱ
• Total cost:

W(n) = Σ_{i=0}^{lg n – 1} 2ⁱ·(n/2ⁱ)² + 2^{lg n}·W(1)
     = n²·Σ_{i=0}^{lg n – 1} (1/2)ⁱ + n·W(1)
     ≤ n²·Σ_{i=0}^{∞} (1/2)ⁱ + O(n)
     = n² · 1/(1 – 1/2) + O(n) = 2n² + O(n)

⇒ W(n) = O(n²)
37
Problem 1

W(n) = W(n/3) + W(2n/3) + n

38
Solution Prob1
W(n) = W(n/3) + W(2n/3) + n

• The longest path from the root to a leaf is:
  n → (2/3)n → (2/3)²n → … → 1

• Subproblem size hits 1 when 1 = (2/3)ⁱ·n ⇒ i = log_{3/2} n

• Cost of the problem at each level = n

• Total cost:
  W(n) ≤ n + n + ... = n·log_{3/2} n = n·lg n / lg(3/2) = O(n lg n)

⇒ W(n) = O(n lg n) 39
Master’s method
• “Cookbook” for solving recurrences of the form:

T(n) = a·T(n/b) + f(n)

where a ≥ 1, b > 1, and f(n) > 0

Idea: compare f(n) with n^{log_b a}:

• f(n) is asymptotically smaller or larger than n^{log_b a} by
  a polynomial factor n^ε, or

• f(n) is asymptotically equal to n^{log_b a}
40
Master’s method
• “Cookbook” for solving recurrences of the form:

T(n) = a·T(n/b) + f(n)

where a ≥ 1, b > 1, and f(n) > 0

Case 1: if f(n) = O(n^{log_b a – ε}) for some ε > 0, then: T(n) = Θ(n^{log_b a})

Case 2: if f(n) = Θ(n^{log_b a}), then: T(n) = Θ(n^{log_b a} · lg n)

Case 3: if f(n) = Ω(n^{log_b a + ε}) for some ε > 0, and if

a·f(n/b) ≤ c·f(n) for some c < 1 and all sufficiently large n
(the regularity condition), then:

T(n) = Θ(f(n))
41
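As an illustrative sketch (my own code, not from the slides), the three cases can be mechanized for the common special case f(n) = nᵏ. The classifier below ignores the regularity condition, which automatically holds when f(n) is a plain power of n:

```python
import math

# Classify T(n) = a*T(n/b) + n^k by the master theorem,
# restricted to driving functions of the form f(n) = n^k.
def master(a, b, k):
    """Return the Theta-bound of T(n) = a*T(n/b) + n^k as a string."""
    p = math.log2(a) / math.log2(b)     # n^p = n^log_b(a) is the watershed
    if k < p:
        return f"Theta(n^{p:g})"        # Case 1: recursion-dominated
    if k == p:
        return f"Theta(n^{k:g} * lg n)" # Case 2: balanced
    return f"Theta(n^{k:g})"            # Case 3: f(n)-dominated
```

For instance, a = 2, b = 2, k = 1 (merge sort) lands in Case 2, while a = 4, b = 2, k = 1 lands in Case 1 with bound Θ(n²).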
Example1

T(n) = 2T(n/2) + n

a = 2, b = 2, log₂2 = 1

Compare n^{log₂ 2} = n with f(n) = n

⇒ f(n) = Θ(n) ⇒ Case 2

⇒ T(n) = Θ(n lg n)

42
Example2
T(n) = 2T(n/2) + n²
a = 2, b = 2, log₂2 = 1
Compare n with f(n) = n²
⇒ f(n) = Ω(n^{1+ε}) ⇒ Case 3 ⇒ verify the regularity condition:
a·f(n/b) ≤ c·f(n)
⇒ 2·n²/4 ≤ c·n² ⇒ c = ½ is a solution (c < 1)
⇒ T(n) = Θ(n²)

43
Example3

T(n) = 2T(n/2) + √n

a = 2, b = 2, log₂2 = 1

Compare n with f(n) = n^{1/2}

⇒ f(n) = O(n^{1–ε}) ⇒ Case 1

⇒ T(n) = Θ(n)

44
The Sorting Problem

• Input:
  – A sequence of n numbers a₁, a₂, ..., aₙ
• Output:
  – A permutation (reordering) a₁′, a₂′, ..., aₙ′ of the input
    sequence such that a₁′ ≤ a₂′ ≤ ··· ≤ aₙ′

45
Insertion Sort
• Idea: like sorting a hand of playing cards
– Start with an empty left hand and the cards facing
down on the table.
– Remove one card at a time from the table, and insert
it into the correct position in the left hand
• compare it with each of the cards already in the hand, from
right to left
– The cards held in the left hand are sorted

46
Insertion Sort

To insert 12, we need to make room for it by moving first 36 and then 24.
first 36 and then 24.

47
Insertion Sort

input array

5 2 4 6 1 3

at each iteration, the array is divided in two sub-arrays:

left sub-array right sub-array

sorted unsorted

50
INSERTION-SORT
Alg.: INSERTION-SORT(A)
for j ← 2 to n
    do key ← A[j]
       ▷ Insert A[j] into the sorted sequence A[1 . . j–1]
       i ← j – 1
       while i > 0 and A[i] > key
           do A[i + 1] ← A[i]
              i ← i – 1
       A[i + 1] ← key

• Insertion sort sorts the elements in place
52
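The pseudocode above translates directly to runnable Python (0-indexed, so the outer loop starts at the second element):

```python
# Insertion sort: grow a sorted prefix one element at a time, in place.
def insertion_sort(a):
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        # Shift elements of the sorted prefix a[0..j-1] that exceed key
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return a
```

On the slide's example array [5, 2, 4, 6, 1, 3] this produces [1, 2, 3, 4, 5, 6].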
Analysis of Insertion Sort
INSERTION-SORT(A)                                          cost   times
for j ← 2 to n                                             c1     n
    do key ← A[j]                                          c2     n–1
       ▷ Insert A[j] into the sorted sequence A[1 . . j–1] 0      n–1
       i ← j – 1                                           c4     n–1
       while i > 0 and A[i] > key                          c5     Σ_{j=2}^{n} t_j
           do A[i + 1] ← A[i]                              c6     Σ_{j=2}^{n} (t_j – 1)
              i ← i – 1                                    c7     Σ_{j=2}^{n} (t_j – 1)
       A[i + 1] ← key                                      c8     n–1

t_j = number of times the while-loop test is executed at iteration j

T(n) = c1·n + c2(n – 1) + c4(n – 1) + c5·Σ_{j=2}^{n} t_j + c6·Σ_{j=2}^{n} (t_j – 1) + c7·Σ_{j=2}^{n} (t_j – 1) + c8(n – 1)
53
Best Case Analysis
• The array is already sorted: in the test “while i > 0 and A[i] > key”,
  A[i] ≤ key holds the first time the while-loop test is run (when i = j – 1)

– t_j = 1

• T(n) = c1·n + c2(n – 1) + c4(n – 1) + c5(n – 1) + c8(n – 1)

  = (c1 + c2 + c4 + c5 + c8)·n – (c2 + c4 + c5 + c8)

  = an + b = Θ(n)
54
Worst Case Analysis
• The array is in reverse sorted order: in “while i > 0 and A[i] > key”,
  A[i] > key always holds in the while-loop test
– Have to compare key with all elements to the left of the j-th
  position ⇒ compare with j – 1 elements ⇒ t_j = j

Using Σ_{j=1}^{n} j = n(n + 1)/2, hence Σ_{j=2}^{n} j = n(n + 1)/2 – 1
and Σ_{j=2}^{n} (j – 1) = n(n – 1)/2, we have:

T(n) = c1·n + c2(n – 1) + c4(n – 1) + c5·(n(n + 1)/2 – 1) + c6·n(n – 1)/2 + c7·n(n – 1)/2 + c8(n – 1)

     = an² + bn + c, a quadratic function of n

• T(n) = Θ(n²): order of growth is n²
55
j =2 j =2 j =2 55
Merge Sort Approach
• To sort an array A[p . . r]:
• Divide
– Divide the n-element sequence to be sorted into two
subsequences of n/2 elements each
• Conquer
– Sort the subsequences recursively using merge sort
– When the size of the sequences is 1 there is nothing
more to do
• Combine
– Merge the two sorted subsequences

56
Merge Sort
Example (p = 1, q = 4, r = 8):  5 2 4 7 1 3 2 6

Alg.: MERGE-SORT(A, p, r)
if p < r                          ▷ check for base case
    then q ← ⌊(p + r)/2⌋          ▷ divide
         MERGE-SORT(A, p, q)      ▷ conquer
         MERGE-SORT(A, q + 1, r)  ▷ conquer
         MERGE(A, p, q, r)        ▷ combine

• Initial call: MERGE-SORT(A, 1, n)
57
Example – n Power of 2
Divide:
5 2 4 7 1 3 2 6              (q = 4)
5 2 4 7 | 1 3 2 6
5 2 | 4 7 | 1 3 | 2 6
5 | 2 | 4 | 7 | 1 | 3 | 2 | 6
58
Example – n Power of 2
Conquer and Merge (bottom-up):
5 | 2 | 4 | 7 | 1 | 3 | 2 | 6
2 5 | 4 7 | 1 3 | 2 6
2 4 5 7 | 1 2 3 6
1 2 2 3 4 5 6 7
59
Example – n Not a Power of 2
Divide:
4 7 2 6 1 4 7 3 5 2 6                 (q = 6)
4 7 2 6 1 4 | 7 3 5 2 6               (q = 3, q = 9)
4 7 2 | 6 1 4 | 7 3 5 | 2 6
4 7 | 2 | 6 1 | 4 | 7 3 | 5 | 2 | 6
4 | 7 | 6 | 1 | 7 | 3
60
Example – n Not a Power of 2
Conquer and Merge (bottom-up):
4 | 7 | 6 | 1 | 7 | 3
4 7 | 2 | 1 6 | 4 | 3 7 | 5 | 2 | 6
2 4 7 | 1 4 6 | 3 5 7 | 2 6
1 2 4 4 6 7 | 2 3 5 6 7
1 2 2 3 4 4 5 6 6 7 7
61
Merge - Pseudocode
Alg.: MERGE(A, p, q, r)
Example: A[p . . r] = 2 4 5 7 | 1 2 3 6   (p = 1, q = 4, r = 8)

1. Compute n1 = q – p + 1 and n2 = r – q
2. Copy the first n1 elements into L[1 . . n1 + 1] and
   the next n2 elements into R[1 . . n2 + 1]
3. L[n1 + 1] ← ∞; R[n2 + 1] ← ∞
4. i ← 1; j ← 1
5. for k ← p to r
6.     do if L[i] ≤ R[j]
7.        then A[k] ← L[i]
8.             i ← i + 1
9.        else A[k] ← R[j]
10.            j ← j + 1
62
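A runnable Python version of MERGE and MERGE-SORT (0-indexed; to stay self-contained it merges until one side is exhausted instead of using the ∞ sentinels above):

```python
# Merge the sorted runs a[p..q] and a[q+1..r] back into a, in place.
def merge(a, p, q, r):
    left, right = a[p:q + 1], a[q + 1:r + 1]
    i = j = 0
    for k in range(p, r + 1):
        # Take from left while it has elements and its head is <= right's head
        if j >= len(right) or (i < len(left) and left[i] <= right[j]):
            a[k] = left[i]; i += 1
        else:
            a[k] = right[j]; j += 1

# Recursively sort a[p..r]: divide, conquer each half, combine with merge.
def merge_sort(a, p=0, r=None):
    if r is None:
        r = len(a) - 1
    if p < r:
        q = (p + r) // 2
        merge_sort(a, p, q)
        merge_sort(a, q + 1, r)
        merge(a, p, q, r)
    return a
```

On the slide's example [5, 2, 4, 7, 1, 3, 2, 6] this yields [1, 2, 2, 3, 4, 5, 6, 7].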
MERGE-SORT Running Time
• Divide:
– compute q as the average of p and r: D(n) = (1)
• Conquer:
– recursively solve 2 subproblems, each of size n/2
 2T (n/2)
• Combine:
– MERGE on an n-element subarray takes (n) time
 C(n) = (n)
T(n) = Θ(1)              if n = 1
       2T(n/2) + Θ(n)    if n > 1
63
Solve the Recurrence
T(n) = c              if n = 1
       2T(n/2) + cn   if n > 1

Use Master's Theorem (a = 2, b = 2):

Compare n^{log₂ 2} = n with f(n) = cn

⇒ Case 2: T(n) = Θ(n lg n)

64
Quicksort
• Sort an array A[p..r]
• Divide
  – Partition the array A into two subarrays A[p..q-1] and A[q+1..r],
    such that each element of A[p..q-1] is smaller than or equal to
    A[q] and each element of A[q+1..r] is greater than A[q].
  – Need to find index q to partition the array

65
QUICKSORT

Alg.: QUICKSORT(A, p, r)        ▷ initially p = 1, r = n

if p < r

    then q ← PARTITION(A, p, r)

         QUICKSORT(A, p, q – 1)

         QUICKSORT(A, q + 1, r)
66
Partitioning the Array
Alg.: PARTITION(A, p, r)
1. x ← A[r]
2. i ← p – 1
3. for j ← p to r – 1
4.     do if A[j] ≤ x
5.        then i ← i + 1
6.             exchange A[i] ↔ A[j]
7. exchange A[i + 1] ↔ A[r]
8. return i + 1

67
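The Lomuto-style PARTITION above and the QUICKSORT driver, as runnable Python (0-indexed):

```python
# Partition a[p..r] around the pivot a[r]; return the pivot's final index.
def partition(a, p, r):
    x = a[r]                      # pivot = last element
    i = p - 1                     # boundary of the "<= pivot" region
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]   # place pivot between the two regions
    return i + 1

def quicksort(a, p=0, r=None):
    if r is None:
        r = len(a) - 1
    if p < r:
        q = partition(a, p, r)
        quicksort(a, p, q - 1)
        quicksort(a, q + 1, r)
    return a
```

On the slide's example [2, 8, 7, 1, 3, 5, 6, 4], partition returns index 3 and leaves the array as 2 1 3 4 7 5 6 8, matching the trace that follows.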
Example
initially:        2 8 7 1 3 5 6 4    (pivot x = A[r] = 4)
next iteration:   2 8 7 1 3 5 6 4
next iteration:   2 8 7 1 3 5 6 4
next iteration:   2 8 7 1 3 5 6 4
next iteration:   2 1 7 8 3 5 6 4
next iteration:   2 1 3 8 7 5 6 4
after final swap: 2 1 3 4 7 5 6 8
Analysis
T(n) = Θ(1)                      if n = 1
       T(n₁) + T(n₂) + Θ(n)      if n > 1

• T(n₁) = time for the left subarray (n₁ elements)
• T(n₂) = time for the right subarray (n₂ elements)
• Θ(n) = time for partitioning
69
Worst Case
• n₁ = 0, n₂ = n – 1
• T(n) = T(0) + T(n – 1) + Θ(n)
       = T(n – 1) + cn
• Can be solved by the iteration method
• T(n) = Θ(n²)

70
Best Case
• T(n)= 2T(n/2) + (n)
can be solved by master method
T(n)= Θ(nlgn)

71
Case Between Worst and Best
• In the average-case analysis the array is partitioned around a randomly
  chosen pivot. Suppose every split is 9-to-1:
  T(n) = T(9n/10) + T(n/10) + n
  This recurrence also solves to T(n) = Θ(n lg n).

72
The Heap Data Structure
• Def: A heap is a nearly complete binary tree with
the following two properties:
– Structural property: all levels are full, except
possibly the last one, which is filled from left to right
– Order (heap) property: for any node x
Parent(x) ≥ x

Example heap: root 8, children 7 and 4; node 7 has children 5 and 2.

From the heap property, it follows that:
“The root is the maximum element of the heap!”
73
A heap is a binary tree that is filled in order
Heap Types
• Max-heaps (largest element at root), have the
max-heap property:
– for all nodes i, excluding the root:
A[PARENT(i)] ≥ A[i]

• Min-heaps (smallest element at root), have the


min-heap property:
– for all nodes i, excluding the root:
A[PARENT(i)] ≤ A[i]

74
Heapsort
• Goal:
– Sort an array using heap representations

• Idea:
– Build a max-heap from the array
– Swap the root (the maximum element) with the last
element in the array
– “Discard” this last node by decreasing the heap size
– Call MAX-HEAPIFY on the new root
– Repeat this process until only one node remains

75
Alg: HEAPSORT(A)

1. BUILD-MAX-HEAP(A)
2. for i ← length[A] downto 2
3.     do exchange A[1] ↔ A[i]
4.        heap-size ← heap-size – 1
5.        MAX-HEAPIFY(A, 1)
76
Building a Heap

Alg: BUILD-MAX-HEAP(A)
1. n = length[A]
2. for i ← ⌊n/2⌋ downto 1
3.     do MAX-HEAPIFY(A, i)

77
Maintaining the Heap Property
Alg: MAX-HEAPIFY(A, i)
1. l ← LEFT(i)
2. r ← RIGHT(i)
3. if l ≤ heap-size and A[l] > A[i]
4.    then largest ← l
5.    else largest ← i
6. if r ≤ heap-size and A[r] > A[largest]
7.    then largest ← r
8. if largest ≠ i
9.    then exchange A[i] ↔ A[largest]
10.        MAX-HEAPIFY(A, largest)

78
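MAX-HEAPIFY, BUILD-MAX-HEAP, and HEAPSORT as runnable Python (0-indexed, so LEFT(i) = 2i + 1 and RIGHT(i) = 2i + 2):

```python
# Restore the max-heap property at index i, assuming both subtrees are heaps.
def max_heapify(a, i, heap_size):
    l, r = 2 * i + 1, 2 * i + 2
    largest = l if l < heap_size and a[l] > a[i] else i
    if r < heap_size and a[r] > a[largest]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)

# Build a max-heap bottom-up, starting from the last internal node.
def build_max_heap(a):
    for i in range(len(a) // 2 - 1, -1, -1):
        max_heapify(a, i, len(a))

def heapsort(a):
    build_max_heap(a)
    for end in range(len(a) - 1, 0, -1):
        a[0], a[end] = a[end], a[0]   # move current maximum to the end
        max_heapify(a, 0, end)        # shrink the heap and restore the property
    return a
```

This also answers the practice problem below: heapsort([5, 3, 17, 10, 84, 19, 6, 22, 9]) produces [3, 5, 6, 9, 10, 17, 19, 22, 84].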
Example: A 4 1 3 2 16 9 10 14 8 7

[Figure: BUILD-MAX-HEAP calls MAX-HEAPIFY for i = 5, 4, 3, 2, 1; the array
evolves from 4 1 3 2 16 9 10 14 8 7 into the max-heap 16 14 10 8 7 9 3 2 4 1]
79
Example: heapify
[Figure: first step of HEAPSORT. The root 16 is swapped with the last
element 1, the heap size is decreased by 1, and MAX-HEAPIFY(A, 1)
restores the max-heap, bringing 14 to the root]
80
MAX-HEAPIFY Running Time
• Intuitively: MAX-HEAPIFY traces a path from a node down toward
  a leaf, doing constant work at each of the h levels below it ⇒ O(h)

• Running time of MAX-HEAPIFY is O(lg n)

  – since the height of the heap is ⌊lg n⌋

81
Running Time of BUILD MAX HEAP

Alg: BUILD-MAX-HEAP(A)
1. n = length[A]
2. for i ← ⌊n/2⌋ downto 1       ▷ O(n) iterations
3.     do MAX-HEAPIFY(A, i)     ▷ O(lg n) each

Running time: O(n lg n)

Hence the running time of heapsort is O(n lg n)

82
Problem
• Demonstrate, step by step, the operation of
Heap sort on the array
A=[5, 3, 17, 10, 84, 19, 6, 22, 9]

83
Shell Sort
• Invented by Donald Shell in 1959.
• Shell sort works by comparing elements that are
distant rather than adjacent elements in an
array.
• Shellsort uses a sequence h1, h2, …, ht called
the increment sequence.
• Shellsort makes multiple passes through a list
and sorts a number of equally sized sets using
the insertion sort.

84
Shell Sort
• Shellsort is also known as diminishing
increment sort.
• The distance between comparisons decreases
as the sorting algorithm runs until the last phase
in which adjacent elements are compared.
• Increment (By Donald Shell)= Floor(n/2)

85
Shell Sort(Example)
Sort: 18 32 12 5 38 33 16 2
8 Numbers to be sorted, Shell’s increment will be floor(n/2)

increment 4: 1 2 3 4

18 32 12 5 38 33 16 2

Step 1) Only look at 18 and 38 and sort in order ;


18 and 38 stays at its current position because they are in order.
Step 2) Only look at 32 and 33 and sort in order ;
32 and 33 stays at its current position because they are in order.

86
Shell Sort(Example)
Sort: 18 32 12 5 38 33 16 2

increment 4: 1 2 3 4

18 32 12 5 38 33 16 2

Step 3) Only look at 12 and 16 and sort in order ;


12 and 16 stays at its current position because they are in order.
Step 4) Only look at 5 and 2 and sort in order ;
2 and 5 need to be switched to be in order.
87
Shell Sort(Example)
Sort: 18 32 12 5 38 33 16 2
Resulting numbers after increment 4 pass:
18 32 12 2 38 33 16 5
* floor(4/2) ➔ floor(2) = 2

increment 2: 1 2

18 32 12 2 38 33 16 5
Step 1) Look at 18, 12, 38, 16 and sort them in their appropriate location:
12 32 16 2 18 33 38 5
Step 2) Look at 32, 2, 33, 5 and sort them in their appropriate location:
12 2 16 5 18 32 38 33

88
Shell Sort(Example)
Sort: 18 32 12 5 38 33 16 2

* floor(2/2) ➔ floor(1) = 1
increment 1:
12 2 16 5 18 32 38 33

2 5 12 16 18 32 33 38

The last increment or phase of Shellsort is basically an Insertion


Sort algorithm.

89
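The whole procedure, with Shell's original gap sequence ⌊n/2⌋, ⌊n/4⌋, …, 1, as runnable Python (each pass is a gapped insertion sort, and the final gap-1 pass is an ordinary insertion sort):

```python
# Shell sort with Shell's increments n//2, n//4, ..., 1.
def shell_sort(a):
    gap = len(a) // 2
    while gap > 0:
        # Gapped insertion sort: every gap-th element forms a sorted subsequence
        for j in range(gap, len(a)):
            key = a[j]
            i = j
            while i >= gap and a[i - gap] > key:
                a[i] = a[i - gap]
                i -= gap
            a[i] = key
        gap //= 2
    return a
```

On the worked example [18, 32, 12, 5, 38, 33, 16, 2] this produces [2, 5, 12, 16, 18, 32, 33, 38], and it can be used directly on the practice problem that follows.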
Problem
Sort the following elements using shell sort.
80, 93, 60, 12, 42, 30, 68, 85, 10

90
Counting Sort
• Assumption:
– The elements to be sorted are integers in the range 0
to k
• Idea:
– Determine for each input element x, the number of
elements smaller than x
– Place element x into its correct position in the output
array

91
Counting Sort
Alg.: COUNTING-SORT(A, B, n, k)
1. for i ← 0 to k
2.     do C[i] ← 0
3. for j ← 1 to n
4.     do C[A[j]] ← C[A[j]] + 1
5. ▷ C[i] now contains the number of elements equal to i
6. for i ← 1 to k
7.     do C[i] ← C[i] + C[i – 1]
8. ▷ C[i] now contains the number of elements ≤ i
9. for j ← n downto 1
10.    do B[C[A[j]]] ← A[j]
11.       C[A[j]] ← C[A[j]] – 1
92
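A direct Python rendering of COUNTING-SORT (0-indexed output; B is allocated inside the function rather than passed in):

```python
# Stable sort of a, whose elements are integers in 0..k.
def counting_sort(a, k):
    c = [0] * (k + 1)
    for x in a:                    # c[i] = number of elements equal to i
        c[x] += 1
    for i in range(1, k + 1):      # c[i] = number of elements <= i
        c[i] += c[i - 1]
    b = [0] * len(a)
    for x in reversed(a):          # backwards scan keeps the sort stable
        c[x] -= 1
        b[c[x]] = x
    return b
```

On the slide's example A = [2, 5, 3, 0, 2, 3, 0, 3] with k = 5 this returns [0, 0, 2, 2, 3, 3, 3, 5]. Stability (equal keys keep their input order) is what lets counting sort serve as the per-digit pass inside radix sort.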
Example
A = 2 5 3 0 2 3 0 3   (n = 8, k = 5)

After counting:          C = 2 0 2 3 0 1    (C[i] = # of elements equal to i)
After the prefix sums:   C = 2 2 4 7 7 8    (C[i] = # of elements ≤ i)

Placing elements from A[8] down to A[1]:
A[8] = 3 → B[7] = 3,  C = 2 2 4 6 7 8
A[7] = 0 → B[2] = 0,  C = 1 2 4 6 7 8
A[6] = 3 → B[6] = 3,  C = 1 2 4 5 7 8
A[5] = 2 → B[4] = 2,  C = 1 2 3 5 7 8
93
Example (cont.)
A[4] = 0 → B[1] = 0,  C = 0 2 3 5 7 8
A[3] = 3 → B[5] = 3,  C = 0 2 3 4 7 8
A[2] = 5 → B[8] = 5,  C = 0 2 3 4 7 7
A[1] = 2 → B[3] = 2,  C = 0 2 2 4 7 7

Final: B = 0 0 2 2 3 3 3 5
94
Analysis of Counting Sort
Alg.: COUNTING-SORT(A, B, n, k)
1.–2.   initialize C:              Θ(k)
3.–4.   count occurrences:         Θ(n)
6.–7.   prefix sums:               Θ(k)
9.–11.  place elements into B:     Θ(n)

Overall time: Θ(n + k)
Overall time: (n + k) 95
Radix Sort
• Considers keys as numbers in a base-R number system
  – A d-digit number will occupy a field of d columns
• Sorting looks at one column at a time
– For a d digit number, sort the least significant
digit first
– Continue sorting on the next least significant
digit, until all digits have been sorted
– Requires only d passes through the list

96
RADIX-SORT
Alg.: RADIX-SORT(A, d)
for i ← 1 to d
do use a stable sort to sort array A on digit i
• 1 is the lowest order digit, d is the highest-order digit

97
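A runnable Python sketch of RADIX-SORT for base-10 integers, with the stable counting sort written out per digit; the stability between passes is exactly what makes the method correct:

```python
# Sort n non-negative integers of at most d decimal digits:
# one stable counting-sort pass per digit, least significant first.
def radix_sort(a, d):
    for exp in (10 ** i for i in range(d)):
        c = [0] * 10
        for x in a:
            c[(x // exp) % 10] += 1       # count occurrences of each digit
        for i in range(1, 10):
            c[i] += c[i - 1]              # prefix sums: final positions
        b = [0] * len(a)
        for x in reversed(a):             # backwards scan keeps stability
            digit = (x // exp) % 10
            c[digit] -= 1
            b[c[digit]] = x
        a = b
    return a
```

Each pass costs Θ(n + 10) here, giving Θ(d(n + k)) overall with k = 10, matching the analysis on the next slide.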
Analysis of Radix Sort

• Given n numbers of d digits each, where each
digit may take up to k possible values, RADIX-
SORT correctly sorts the numbers in Θ(d(n + k))

– One pass of sorting per digit takes Θ(n + k), assuming
that we use counting sort

– There are d passes (one per digit)
98
Bucket Sort
• Assumption:
– the input is generated by a random process that distributes
elements uniformly over [0, 1)
• Idea:
– Divide [0, 1) into n equal-sized buckets
– Distribute the n input values into the buckets
– Sort each bucket
– Go through the buckets in order, listing elements in each one

• Input: A[1 . . n], where 0 ≤ A[i] < 1 for all i
• Output: the elements of A in sorted order
• Auxiliary array: B[0 . . n – 1] of linked lists, each list
initially empty
99
BUCKET-SORT
Alg.: BUCKET-SORT(A, n)
for i ← 1 to n
    do insert A[i] into list B[⌊n·A[i]⌋]
for i ← 0 to n – 1
    do sort list B[i] with insertion sort
concatenate lists B[0], B[1], . . . , B[n – 1]
together in order
return the concatenated lists

100
Example - Bucket Sort
Input A (left) and buckets B (right) after distribution:

A[1] = .78        B[0]: /
A[2] = .17        B[1]: .17 → .12
A[3] = .39        B[2]: .26 → .21 → .23
A[4] = .26        B[3]: .39
A[5] = .72        B[4]: /
A[6] = .94        B[5]: /
A[7] = .21        B[6]: .68
A[8] = .12        B[7]: .78 → .72
A[9] = .23        B[8]: /
A[10] = .68       B[9]: .94
101
Example - Bucket Sort
Sorted output: .12 .17 .21 .23 .26 .39 .68 .72 .78 .94

Buckets after sorting each one:
B[0]: /
B[1]: .12 → .17
B[2]: .21 → .23 → .26
B[3]: .39
B[4]: /
B[5]: /
B[6]: .68
B[7]: .72 → .78
B[8]: /
B[9]: .94

Concatenate the lists from B[0] to B[n – 1] together, in order
102
Analysis of Bucket Sort
Alg.: BUCKET-SORT(A, n)

for i ← 1 to n                                    ▷ O(n)
    do insert A[i] into list B[⌊n·A[i]⌋]

for i ← 0 to n – 1                                ▷ Θ(n) expected, for uniform input
    do sort list B[i] with insertion sort

concatenate lists B[0], B[1], . . . , B[n – 1]
together in order                                 ▷ O(n)

return the concatenated lists

Overall expected time: Θ(n)
103
Medians and Order Statistics
Def.: The i-th order statistic of a set of n elements is the i-th
smallest element.

• The minimum of a set of elements:


– The first order statistic i = 1
• The maximum of a set of elements:
– The n-th order statistic i = n
• The median is the “halfway point” of the set
  – i = (n + 1)/2 is unique when n is odd
  – i = ⌊(n + 1)/2⌋ = n/2 (lower median) and i = ⌈(n + 1)/2⌉ = n/2 + 1 (upper
    median) when n is even

104
Finding Minimum or Maximum
Alg.: MINIMUM(A, n)
min ← A[1]
for i ← 2 to n
do if min > A[i]
then min ← A[i]
return min

• How many comparisons are needed?


– n – 1: each element, except the minimum, must be compared to
a smaller element at least once
– The same number of comparisons are needed to find the
maximum

105
Simultaneous Min, Max
• Find min and max independently
– Use n – 1 comparisons for each  total of 2n – 2

• At most 3n/2 comparisons are needed


– Process elements in pairs
– Maintain the minimum and maximum of elements seen so far
– Don’t compare each element to the minimum and maximum
separately
– Compare the elements of a pair to each other
– Compare the larger element to the maximum so far, and
compare the smaller element to the minimum so far
– This leads to only 3 comparisons for every 2 elements
106
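The pairing scheme above in runnable Python, with the comparison count returned so the bound can be checked against the analysis on the next slide (function name is mine):

```python
# Simultaneous min and max with ~3*floor(n/2) comparisons, counted explicitly.
def min_max(a):
    comparisons = 0
    if len(a) % 2:                    # odd n: both start at the first element
        lo = hi = a[0]
        start = 1
    else:                             # even n: one comparison seeds min and max
        comparisons = 1
        lo, hi = (a[0], a[1]) if a[0] < a[1] else (a[1], a[0])
        start = 2
    for i in range(start, len(a), 2): # process the rest in pairs
        x, y = a[i], a[i + 1]
        small, big = (x, y) if x < y else (y, x)   # 1 comparison
        if small < lo: lo = small                  # 1 comparison
        if big > hi: hi = big                      # 1 comparison
        comparisons += 3
    return lo, hi, comparisons
```

The two worked examples below are reproduced exactly: 6 comparisons for the odd case n = 5 and 7 for the even case n = 6.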
Analysis of Simultaneous Min, Max
• Setting up initial values:
– n is odd: set both min and max to the first element

– n is even: compare the first two elements, assign the smallest


one to min and the largest one to max

• Total number of comparisons:


– n is odd: we do 3(n-1)/2 comparisons

– n is even: we do 1 initial comparison + 3(n-2)/2 more


comparisons = 3n/2 - 2 comparisons

107
Example: Simultaneous Min, Max

• n = 5 (odd), array A = {2, 7, 1, 3, 4}

1. Set min = max = 2

2. Compare elements in pairs:

– 1 < 7 ⇒ compare 1 with min and 7 with max
  (3 comparisons) ⇒ min = 1, max = 7

– 3 < 4 ⇒ compare 3 with min and 4 with max
  (3 comparisons) ⇒ min = 1, max = 7

We performed: 3(n-1)/2 = 6 comparisons


108
Example: Simultaneous Min, Max
• n = 6 (even), array A = {2, 5, 3, 7, 1, 4}

1. Compare 2 with 5: 2 < 5 1 comparison

2. Set min = 2, max = 5

3. Compare elements in pairs:

– 3 < 7 ⇒ compare 3 with min and 7 with max
  (3 comparisons) ⇒ min = 2, max = 7

– 1 < 4 ⇒ compare 1 with min and 4 with max
  (3 comparisons) ⇒ min = 1, max = 7

We performed: 3n/2 - 2 = 7 comparisons


109
