CHAPTER - 1
ALGORITHM SPECIFICATION
Introduction
Definition: An algorithm is a finite set of instructions that, if followed, accomplishes a particular task. All algorithms must satisfy the following criteria:
1. Input. There are zero or more quantities that are externally supplied.
2. Output. At least one quantity is produced.
3. Definiteness. Each instruction is clear and unambiguous.
4. Finiteness. If we trace out the instructions of an algorithm then for all cases, the
algorithm terminates after a finite number of steps.
5. Effectiveness. Every instruction must be basic enough to be carried out, in principle, by a person using only pencil and paper. It is not enough that each operation be definite as in (3); it also must be feasible.
Example [Binary search]: Assume that we have n ≥ 1 distinct integers that are already sorted and stored in the array list. That is, list[0] ≤ list[1] ≤ … ≤ list[n-1]. We must figure out if an integer searchnum is in this list. If it is, we should return an index, i, such that list[i] = searchnum. If searchnum is not present, we should return -1. Since the list is sorted, we may use the following method to search for the value.
Let left and right, respectively, denote the left and right ends of the list to be searched. Initially, left = 0 and right = n-1. Let middle = (left + right)/2 be the middle position in the list. If we compare list[middle] with searchnum, we obtain one of three results:
1. searchnum < list[middle]. In this case, if searchnum is present, it must be in the positions between 0 and middle - 1. Therefore, we set right to middle - 1.
2. searchnum = list[middle]. In this case, we return middle.
3. searchnum > list[middle]. In this case, if searchnum is present, it must be in the positions between middle + 1 and n-1. So, we set left to middle + 1.
For example, the following program reads in n numbers and sorts them using selection sort:

#include <stdio.h>
#define MAX_SIZE 101
void sort(int list[], int n)
{ /* selection sort: find the smallest of list[i..n-1] and swap it into list[i] */
  int i, j, min, temp;
  for (i = 0; i < n - 1; i++) {
    min = i;
    for (j = i + 1; j < n; j++)
      if (list[j] < list[min]) min = j;
    temp = list[i]; list[i] = list[min]; list[min] = temp;
  }
}
int main(void)
{ int i, n, list[MAX_SIZE];
  scanf("%d", &n);
  for (i = 0; i < n; i++) scanf("%d", &list[i]);
  sort(list, n);
  printf("\n Sorted array:\n");
  for (i = 0; i < n; i++) printf("%d ", list[i]);
  printf("\n");
  return 0;
}
Basic concepts
If searchnum has not been found and there are still integers to check, we recalculate middle and continue the search. Program 1.5 implements this searching strategy. The algorithm contains two subtasks: (1) determining if there are any integers left to check, and (2) comparing searchnum to list[middle].
We can handle the comparisons through either a function or a macro. In either case, we must specify values to signify less than, equal, or greater than. We will use the strategy followed in C's library functions:
We return a negative number (-1) if the first number is less than the second.
We return a 0 if the two numbers are equal.
We return a positive number (1) if the first number is greater than the second.
Although we present both a function (Program 1.6) and a macro, we will use the macro throughout the text since it works with any data type.
int compare(int x, int y)
{ /* compare x and y: return -1 for less than, 0 for equal, 1 for greater */
  if (x < y) return -1;
  else if (x == y) return 0;
  else return 1;
}

int binsearch(int list[], int searchnum, int left, int right)
{ /* search list[0] <= list[1] <= ... <= list[n-1] for searchnum.
     Return its position if found. Otherwise return -1 */
  int middle;
  while (left <= right) {
    middle = (left + right) / 2;
    switch (compare(list[middle], searchnum)) {
      case -1: left = middle + 1;
               break;
      case 0:  return middle;
      case 1:  right = middle - 1;
    }
  }
  return -1;
}
The search strategy just outlined is called binary search. The previous examples have shown that algorithms are implemented as functions in C. Indeed, functions are the primary vehicle used to divide a large program into manageable pieces. They make the program easier to read and, because the functions can be tested separately, increase the probability that it will run correctly.
Recursive Algorithms
Binary search may also be written recursively:

int binsearch(int list[], int searchnum, int left, int right)
{ /* search list[0] <= list[1] <= ... <= list[n-1] for
     searchnum. Return its position if found. Otherwise
     return -1 */
  int middle;
  if (left <= right) {
    middle = (left + right) / 2;
    switch (compare(list[middle], searchnum)) {
      case -1: return binsearch(list, searchnum, middle + 1, right);
      case 0:  return middle;
      case 1:  return binsearch(list, searchnum, left, middle - 1);
    }
  }
  return -1;
}
The factorial function n! has value 1 when n ≤ 1 and value n*(n-1)! when n > 1. Write both a recursive and an iterative C function to compute n!.
The Fibonacci numbers are defined as: f0 = 0, f1 = 1, and fi = fi-1 + fi-2 for i > 1.
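As a sketch of one possible solution to the factorial exercise (the function names are illustrative, and a long is assumed to be wide enough to hold the result):

long factRecursive(int n)
{ /* recursive: n! = n * (n-1)!, with n! = 1 for n <= 1 */
  if (n <= 1) return 1;
  return n * factRecursive(n - 1);
}

long factIterative(int n)
{ /* iterative: multiply the values 2, 3, ..., n together */
  long result = 1;
  int i;
  for (i = 2; i <= n; i++) result *= i;
  return result;
}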
DATA ABSTRACTION
All programming languages provide at least a minimal set of predefined data types, plus the ability to construct new, or user-defined, types. It is appropriate to ask the question, "What is a data type?"
Definition: A data type is a collection of objects and a set of operations that act on those objects.
Whether your program is dealing with predefined data types or user- defined data types, these
two aspects must be considered: objects and operations.
It has been observed by many software designers that hiding the representation of objects of a
data type from its users is a good design strategy. In that case, the user is constrained to
manipulate the objects solely through the functions that are provided.
Definition: An abstract data type ( ADT ) is a data type that is organized in such a way that
the specification of the objects and the specification of the operations on the objects is
separated from the representation of the objects and the implementation of the operations.
How does the specification of the operations of an ADT differ from the implementation of the operations? The specification consists of the names of every function, the type of its arguments, and the type of its result. There should also be a description of what the function does, but without appealing to internal representation or implementation details. This requirement is quite important, and it implies that an abstract data type is implementation-independent.
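As a small illustration in C (the names here are hypothetical, not from the text), the specification can live in a header file while the representation stays hidden in the implementation file:

/* stack.h -- specification only: function names, argument types, result types */
typedef struct stack *Stack;   /* the representation is hidden from users */
Stack StackCreate(void);
void  StackPush(Stack s, int item);
int   StackPop(Stack s);

Users manipulate a Stack solely through these functions; the definition of struct stack appears only in the implementation file, so code that depends on the representation will not compile outside it.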
ADT NaturalNumber is
Objects: an ordered subrange of the integers starting at zero and ending at the maximum integer (INT_MAX) on the computer
Functions:
for all x, y ∈ NaturalNumber
Boolean Equal(x, y) ::= if (x == y) return TRUE else return FALSE
NaturalNumber Add(x, y) ::= if ((x + y) <= INT_MAX) return x + y
                            else return INT_MAX
end NaturalNumber
EXERCISE
Add the following operations to the Natural Number ADT: Predecessor, IsGreater, Multiply,
Divide.
Create an ADT, Boolean. The minimal operations are And, Or, Not, Xor (Exclusive Or), and Implies.
PERFORMANCE ANALYSIS
There are many criteria upon which we can judge a program, including: Does it meet the original specifications of the task? Does it work correctly? Is it well documented and readable? Although these criteria are vitally important, particularly in the development of large systems, it is difficult to explain how to achieve them. They are associated with the development of a good programming style, and this takes experience and practice. We also can judge a program on more concrete criteria, and so we add two more: Does the program use storage efficiently? Is its running time acceptable for the task?
These criteria focus on performance evaluation, which we can loosely divide into two distinct fields. The first field focuses on obtaining estimates of time and space that are machine-independent. We call this field performance analysis, but its subject matter is the heart of an important branch of computer science known as complexity theory. The second field, which we call performance measurement, obtains machine-dependent running times.
Definition: The space complexity of a program is the amount of memory that it needs to run
to completion. The time complexity of a program is the amount of computer time that it
needs to run to completion.
Space Complexity
1. Fixed space requirements: This component refers to space requirements that do not depend on the number and size of the program's inputs and outputs. The fixed requirements include the instruction space (space needed to store the code), space for simple variables, fixed-size structured variables (such as structs), and constants.
2. Variable space requirements: This component consists of the space needed by structured variables whose size depends on the particular instance, I, of the problem being solved. It also includes the additional space required when a function uses recursion. The variable space requirement of a program P working on an instance I is denoted SP(I). The total space requirement may therefore be written as S(P) = c + SP(I), where c is a constant giving the fixed requirements.
Time Complexity
The time, T(P), taken by a program, P, is the sum of its compile time and its run (or execution) time. The compile time is similar to the fixed space component since it does not depend on the instance characteristics. Consequently, we are really concerned only with the program's execution time, TP. Rather than determine TP exactly, we count the number of program steps that are performed when the program is run with a given instance characteristic, n.
Our motivation to determine step counts is to be able to compare the time complexities of two
programs that compute the same function and also to predict the growth in run time as the
instance characteristics change.
Determining the exact step count (either worst case or average) of a program can prove to be an exceedingly difficult task. Expending immense effort to determine the step count exactly isn't a very worthwhile endeavor, as the notion of a step is itself inexact. (Both the instructions x = y and x = y+z+(x/y)+(x*y*z*-x/z) count as one step.) Because of the inexactness of what a step stands for, the exact step count isn't very useful for comparative purposes.
Definition [Big "oh"]: f(n) = O(g(n)) (read as "f of n is big oh of g of n") iff (if and only if) there exist positive constants c and n0 such that f(n) ≤ c·g(n) for all n, n ≥ n0.
As illustrated by the previous example, the statement f(n) = O(g(n)) only states that g(n) is an upper bound on the value of f(n) for all n, n ≥ n0.
The theta notation is more precise than both the "big oh" and omega notations: f(n) = Θ(g(n)) iff g(n) is both an upper and lower bound on f(n).
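For instance (a standard worked example, not taken from the text above): 3n + 2 = O(n), since 3n + 2 ≤ 4n for all n ≥ 2; and since 3n + 2 ≥ 3n for all n ≥ 1, g(n) = n is also a lower bound, so 3n + 2 = Θ(n).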
CHAPTER 2
ARRAYS AND STRUCTURES
ARRAYS
ADT Array is
Objects: A set of pairs <index, value> where for each value of index there is a value from the set item. Index is a finite ordered set of one or more dimensions, for example, {0, …, n-1} for one dimension, {(0,0), (0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2)} for two dimensions, etc.
Functions:
Array Create(j, list) ::= return an array of j dimensions where list is a j-tuple whose ith element is the size of the ith dimension. Items are undefined.
Item Retrieve(A, i) ::= if (i in index) return the item associated with index value i in array A else return error
Array Store(A, i, x) ::= if (i in index) return an array that is identical to array A except the new pair <i, x> has been inserted else return error
end Array

Retrieve returns the value associated with the index if the index is valid, or an error if the index is invalid. Store accepts an array, an index, and an item, and returns the original array augmented with the new <index, value> pair. The advantage of this ADT definition is that it clearly points out the fact that the array is a more general structure than "a consecutive set of memory locations."
POLYNOMIALS
Arrays are not only data structures in their own right; we can also use them to implement other abstract data types. For instance, let us consider one of the simplest and most commonly found data structures: the ordered, or linear, list. Many operations are performed on lists, including finding the length of a list, reading it, retrieving or replacing the ith element, and inserting or deleting elements.
The largest (or leading) exponent of a polynomial is called its degree. Coefficients
that are zero are not displayed. The term with exponent equal to zero does not show the
variable since x raised to a power of zero is 1.
Polynomial Representation
We are now ready to make some representation decisions. A very reasonable first
decision requires unique exponents arranged in decreasing order. This requirement
considerably simplifies many of the operations.
This algorithm works by comparing terms from the two polynomials until one or both of the polynomials becomes empty. The switch statement performs the comparisons and adds the proper term to the new polynomial, d. If one of the polynomials becomes empty, we copy the remaining terms from the nonempty polynomial into d. With these insights, we now consider the representation question more carefully.
ADT Polynomial is
Objects: p(x) = a1x^e1 + … + anx^en; a set of ordered pairs of <ei, ai>, where ai in Coefficients and ei in Exponents; the ei are integers >= 0
Functions:
for all poly, poly1, poly2 ∈ Polynomial, coef ∈ Coefficients, expon ∈ Exponents
Boolean IsZero(poly) ::= if (poly) return FALSE else return TRUE
Coefficient Coef(poly, expon) ::= if (expon ∈ poly) return its coefficient else return zero
Exponent LeadExp(poly) ::= return the largest exponent in poly
Polynomial Attach(poly, coef, expon) ::= if (expon ∈ poly) return error else return the polynomial poly with the term <coef, expon> inserted
Polynomial Remove(poly, expon) ::= if (expon ∈ poly) return the polynomial poly with the term whose exponent is expon deleted else return error
Polynomial SingleMult(poly, coef, expon) ::= return the polynomial poly · coef · x^expon
Polynomial Add(poly1, poly2) ::= return the polynomial poly1 + poly2
end Polynomial
Polynomial Addition
We would now like to write a C function that adds two polynomials, A and B, represented as above to obtain D = A + B. To produce D(x), padd adds A(x) and B(x) term by term.
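A sketch of padd under the global term-array representation (terms[] and avail hold all polynomials; the details are one plausible reading of the textbook's program, not a verbatim copy):

#include <stdio.h>
#include <stdlib.h>
#define MAX_TERMS 100
#define COMPARE(x, y) (((x) < (y)) ? -1 : ((x) == (y)) ? 0 : 1)

typedef struct { float coef; int expon; } polynomial;
polynomial terms[MAX_TERMS];
int avail = 0;

void attach(float coefficient, int exponent)
{ /* add a new term to the global term array */
  if (avail >= MAX_TERMS) { fprintf(stderr, "Too many terms\n"); exit(1); }
  terms[avail].coef = coefficient;
  terms[avail++].expon = exponent;
}

void padd(int startA, int finishA, int startB, int finishB,
          int *startD, int *finishD)
{ /* add A(x), stored in terms[startA:finishA], and B(x), stored in
     terms[startB:finishB], term by term to produce D(x) */
  float coefficient;
  *startD = avail;
  while (startA <= finishA && startB <= finishB)
    switch (COMPARE(terms[startA].expon, terms[startB].expon)) {
      case -1: /* exponent of A's term is smaller */
        attach(terms[startB].coef, terms[startB].expon); startB++;
        break;
      case 0:  /* equal exponents */
        coefficient = terms[startA].coef + terms[startB].coef;
        if (coefficient) attach(coefficient, terms[startA].expon);
        startA++; startB++;
        break;
      case 1:  /* exponent of A's term is larger */
        attach(terms[startA].coef, terms[startA].expon); startA++;
    }
  for (; startA <= finishA; startA++)   /* copy the rest of A */
    attach(terms[startA].coef, terms[startA].expon);
  for (; startB <= finishB; startB++)   /* copy the rest of B */
    attach(terms[startB].coef, terms[startB].expon);
  *finishD = avail - 1;
}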
Analysis of padd:
Since the number of nonzero terms in A and B are the most important factors in the time complexity, we will carry out the analysis using them. Therefore, let m and n be the number of nonzero terms in A and B, respectively. If m > 0 and n > 0, the while loop is entered. Each iteration of the loop requires O(1) time. At each iteration, we increment the value of startA or startB or both. Since the iteration terminates when either startA or startB exceeds finishA or finishB, respectively, the number of iterations is bounded by m + n - 1. This worst case occurs when the two polynomials have no exponents in common.
The time for the remaining two loops is bounded by O(n + m) because we cannot
iterate the first loop more than m times and the second more than n times. So, the asymptotic
computing time of this algorithm is O(n+m).
EXERCISE
Write a function, pmult, that multiplies two polynomials. Figure out the computing time of your function.
Let A(x) = x^2n + x^(2n-2) + … + x^2 + x^0 and B(x) = x^(2n+1) + x^(2n-1) + … + x^3 + x. For these polynomials, determine the exact number of times each statement of padd is executed.
SPARSE MATRICES:
A general matrix consists of m rows and n columns of numbers. Such a matrix has m·n elements. A matrix with many zero entries is called sparse. For example, of the two 2×3 matrices below, the second is sparse:

4 6 8      0 0 1
5 0 2      0 0 0
REPRESENTATION OF ARRAYS:
If an array is declared A(l1:u1, l2:u2, …, ln:un), then it is easy to see that the number of elements is the product

(u1 - l1 + 1)(u2 - l2 + 1) … (un - ln + 1).
One of the common ways to represent an array is in row major order. For example, consider the declaration
A (4:5, 2:4, 1:2, 3:4)
The two-dimensional array A(1:u1, 1:u2) may be interpreted as u1 rows, row1, row2, …, rowu1, each row consisting of u2 elements. If α is the address of A(1, 1), then the address of A(i, 1) is α + (i - 1)u2, as there are i - 1 rows, each of size u2, preceding the first element in the ith row. Knowing the address of A(i, 1), the address of A(i, j) is then simply α + (i - 1)u2 + (j - 1).
(b) Sequential row major representation of a 3-dimensional array. Each 2-dimensional array is represented as in the figure above. In general, the address of A(i1, i2, …, in) is

α + (i1 - 1) u2 u3 … un
  + (i2 - 1) u3 u4 … un
  + (i3 - 1) u4 u5 … un
  + …
  + (in-1 - 1) un
  + (in - 1)
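The same computation in C, as a small sketch (the helper name and parameters are illustrative, not from the text):

/* compute the row-major offset of element (idx[0], ..., idx[n-1]) in an
   n-dimensional array with lower bounds low[] and upper bounds up[] */
int rowMajorOffset(int n, const int low[], const int up[], const int idx[])
{
  int offset = 0, i;
  for (i = 0; i < n; i++)   /* Horner-style evaluation of the sum above */
    offset = offset * (up[i] - low[i] + 1) + (idx[i] - low[i]);
  return offset;            /* element address = alpha + offset */
}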
UNIT – II
CHAPTER - 3
FUNDAMENTALS:
A stack is an ordered list in which all insertions and deletions are made at one end, called the top. A queue is an ordered list in which all insertions take place at one end, the rear, while all deletions take place at the other end, the front.
The restrictions on a stack imply that if the elements A, B, C, D, E are added to the stack, in that order, then the first element to be removed must be E. Equivalently, we say that the last element to be inserted into the stack will be the first to be removed. For this reason, stacks are sometimes referred to as Last In First Out (LIFO) lists. The restrictions on a queue require that the first element inserted into the queue will be the first one to be removed. Thus A is the first letter to be removed, and queues are known as First In First Out (FIFO) lists.
procedure ADD(item, STACK, n, top)
// insert item into the STACK of maximum size n; top is the number of elements currently in STACK //
  if top ≥ n then call STACK_FULL
  top ← top + 1
  STACK(top) ← item
end ADD

procedure DELETE(item, STACK, top)
// remove the top element of STACK and store it in item unless STACK is empty //
  if top ≤ 0 then call STACK_EMPTY
  item ← STACK(top)
  top ← top - 1
end DELETE

procedure ADDQ(item, Q, n, rear)
// insert item at the rear of queue Q(1:n) //
  if rear = n then call QUEUE_FULL
  rear ← rear + 1
  Q(rear) ← item
end ADDQ

procedure DELETEQ(item, Q, front, rear)
// delete the front element of queue Q and store it in item //
  if front = rear then call QUEUE_EMPTY
  front ← front + 1
  item ← Q(front)
end DELETEQ
Circular Queue

procedure ADDQ(item, Q, n, front, rear)
// insert item into the circular queue stored in Q(0:n-1); rear points to the last item and front is one position counterclockwise from the first item in Q //
  rear ← (rear + 1) mod n
  if front = rear then call QUEUE_FULL
  Q(rear) ← item
end ADDQ

procedure DELETEQ(item, Q, n, front, rear)
// remove the front element of the circular queue //
  if front = rear then call QUEUE_EMPTY
  front ← (front + 1) mod n
  item ← Q(front)
end DELETEQ
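The same logic in C, as a sketch (note that one array slot is sacrificed so that a full queue can be distinguished from an empty one):

#define MAX_QUEUE_SIZE 100
int queue[MAX_QUEUE_SIZE];
int front = 0, rear = 0;      /* front == rear means the queue is empty */

void addq(int item)
{ /* insert item at the rear of the circular queue */
  rear = (rear + 1) % MAX_QUEUE_SIZE;
  if (front == rear) return;  /* queue full: caller should report an error */
  queue[rear] = item;
}

int deleteq(void)
{ /* remove and return the front element; caller must ensure the queue is not empty */
  front = (front + 1) % MAX_QUEUE_SIZE;
  return queue[front];
}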
EVALUATION OF EXPRESSIONS:
(4/4) + 9 - 8 = 1 + 9 - 8
Result = 2

(4/2)**5 * -1 * 2 = (2**5) * -2 = 32 * -2
Result = -64
For example, A*(B + C)*D has the postfix form ABC+*D*. The algorithm that produces it keeps operators on a stack, as the following trace shows.
Infix to Postfix:

Next token   Stack   Output
A            empty   A
*            *       A
(            *(      A
B            *(      AB
+            *(+     AB
C            *(+     ABC
)            *       ABC+
*            *       ABC+*
D            *       ABC+*D
done         empty   ABC+*D*
The rule will be that operators are taken out of the stack as long as their in-stack priority, isp, is greater than or equal to the incoming priority, icp, of the new operator.
procedure POSTFIX(E)
// Convert the infix expression E to postfix. Assume the last character of E is a '#', which will also be the last character of the postfix. Procedure NEXT-TOKEN returns either the next operator, operand, or delimiter, whichever comes next. STACK(1:n) is used as a stack, and the character '#' with ISP('#') = -1 is used at the bottom of the stack. ISP and ICP are functions giving the in-stack and incoming priorities //
  STACK(1) ← '#'; top ← 1
  loop
    x ← NEXT-TOKEN(E)
    case
      : x = '#' : [ while top > 1 do          // empty the stack //
                      print(STACK(top)); top ← top - 1
                    end
                    print('#')
                    return ]
      : x is an operand : print(x)
      : else : [ while ISP(STACK(top)) ≥ ICP(x) do
                   print(STACK(top)); top ← top - 1
                 end
                 call ADD(x, STACK, n, top) ]  // insert x into STACK //
    end
  forever
end POSTFIX
EXERCISES
a) a*b*c
b) –a + b –c +d
c) a* -b + c
d) (a + b) * d+ e / (f + a* d) + c
6. Write a C function that transforms a prefix expression into a postfix one. Carefully state
any assumptions you make regarding the input. How much time and space does your function
take?
7. Write a C function that transforms a postfix expression into a prefix one. How much time
and space does your function take?
8. We must represent two stacks in an array, memory[MEMORY_SIZE]. Write C functions that add and delete an item from stack i, 0 ≤ i < 2. Your functions should be able to add elements to the stacks as long as the total number of elements in both stacks is less than MEMORY_SIZE - 1.
9. Write a C function that implements the stackFull strategy discussed in the text.
10. Using the add and delete functions discussed in the text and stackFull from Exercise 3, produce a sequence of additions/deletions that requires O(MEMORY_SIZE) time for each add. Assume that you have two stacks and that you are starting from a configuration representing a full utilization of memory (MEMORY_SIZE).
TREES
INTRODUCTION
Definition: A tree is a finite set of one or more nodes such that:
i. There is a specially designated node called the root.
ii. The remaining nodes are partitioned into n ≥ 0 disjoint sets T1, T2, …, Tn, where each of these sets is a tree. T1, …, Tn are also called subtrees of the root.
There are many terms which are often used when referring to trees. They are defined one by one in the following paragraphs.
A node stands for the item of information plus the branches to other items. The tree in the following figure has 13 nodes, each item of data being a single letter for convenience. The root is A, and trees are normally drawn with the root at the top. The number of subtrees of a node is called its degree. So, in the tree below, the degree of A is 3, of C is 1, and of F is zero. Nodes that have degree zero are called leaf or terminal nodes. Therefore, {K, L, F, G, M, I, J} is the set of leaf nodes. The nodes with non-zero degree are called nonterminals. The roots of the subtrees of a node, say X, are the children of X; X is the parent of its children. Thus the children of A are B, C, and D. Also, H, I, and J are siblings. This terminology can be extended so that we can ask for the grandparent of a node, which is the parent of the parent of the node. For example, the grandparent of M is D.
The degree of a tree is the maximum degree of the nodes in the tree. For example, the degree of the above tree is 3. The ancestors of a node are all the nodes along the path from the root to that node; the ancestors of M are A, D, and H. The level of a node is defined by initially letting the root be at level 1. If a node is at level n, then its children are at level n+1. The height or depth of a tree is defined to be the maximum level of any node in the tree.
A forest is a set of n ≥ 0 disjoint trees. The notion of a forest is closely related to that of a tree because, if we remove the root of a tree, we get a forest.
BINARY TREE:
A binary tree is a finite set of nodes which is either empty or consists of a root and two disjoint binary trees called the left subtree and the right subtree.
Example:
The maximum number of nodes on level i of a binary tree is 2^(i-1), i ≥ 1, and the maximum number of nodes in a binary tree of depth k is

∑ (from i = 1 to k) 2^(i-1) = 2^k - 1, k ≥ 1.
BINARY TREE TRAVERSAL:
When traversing a binary tree we want to treat each node and its subtrees in the same fashion. If we let L, D, and R stand for moving left, printing the data, and moving right when at a node, then there are six possible combinations of traversal: LDR, LRD, DLR, DRL, RDL, and RLD. If we adopt the convention that we traverse left before right, then only three traversals remain: LDR, LRD, and DLR. To these we assign the names inorder, postorder, and preorder, because there is a natural correspondence between these traversals and producing the infix, postfix, and prefix forms of an expression.
procedure INORDER(T)
// T is a binary tree where each node has three fields LCHILD, DATA, RCHILD //
  if T ≠ 0 then [ call INORDER(LCHILD(T))
                  print(DATA(T))
                  call INORDER(RCHILD(T)) ]
end INORDER
Output: A/B**C*D+E
procedure PREORDER(T)
// T is a binary tree where each node has three fields LCHILD, DATA, RCHILD //
  if T ≠ 0 then [ print(DATA(T))
                  call PREORDER(LCHILD(T))
                  call PREORDER(RCHILD(T)) ]
end PREORDER
Output: +*/A**BCDE

procedure POSTORDER(T)
// T is a binary tree where each node has three fields LCHILD, DATA, RCHILD //
  if T ≠ 0 then [ call POSTORDER(LCHILD(T))
                  call POSTORDER(RCHILD(T))
                  print(DATA(T)) ]
end POSTORDER
Output: ABC**/D*E+
We can develop equivalent iterative functions for recursive traversal. Let us take inorder traversal as an example. To simulate the recursion, we must create our own stack.
A node that has no action indicates that the node is added to the stack, while a node that has a printf action indicates that the node is removed from the stack. Notice that the left nodes are stacked until a null node is reached; the node is then removed from the stack, and the node's right child is stacked. The traversal then continues with the left child. The traversal is complete when the stack is empty.
Analysis of iterInorder: Let n be the number of nodes in the tree. If we consider the action of iterInorder, we note that every node of the tree is placed on and removed from the stack exactly once. So, if the number of nodes in the tree is n, the time complexity is O(n). The space requirement is equal to the depth of the tree, which in the worst case is O(n).
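A sketch of iterInorder itself, assuming a treePointer node type and a fixed stack bound (the push/pop steps are written inline here):

#include <stdio.h>
#define MAX_STACK_SIZE 100

typedef struct node *treePointer;
struct node { int data; treePointer leftChild, rightChild; };

void iterInorder(treePointer node)
{ /* iterative inorder traversal using an explicit stack */
  treePointer stack[MAX_STACK_SIZE];
  int top = -1;
  for (;;) {
    for (; node; node = node->leftChild)
      stack[++top] = node;          /* stack the left nodes */
    if (top == -1) break;           /* stack empty: traversal complete */
    node = stack[top--];            /* remove a node from the stack */
    printf("%d ", node->data);      /* visit it */
    node = node->rightChild;        /* then continue with its right child */
  }
}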
THREADED BINARY TREES
The idea is to replace the null links by pointers, called threads, to other nodes in the tree. If RCHILD(P) is normally null, we replace it by a pointer to the node which would be printed after P when traversing the tree in inorder. A null LCHILD link at node P is replaced by a pointer to the node which immediately precedes node P in inorder.
The tree T has 9 nodes and 10 null links which have been replaced by threads. If we
traverse T in inorder the nodes will be visited in the order HDIBEAFCG.
For example node E has a predecessor thread which points to B and a successor thread which
points to A.
procedure INSUC(X)
// find the inorder successor of X in a threaded binary tree //
  S ← RCHILD(X)           // if RBIT(X) = 0, the thread S is the successor //
  if RBIT(X) = 1 then     // otherwise follow left links from S //
    [ while LBIT(S) = 1 do
        S ← LCHILD(S)
      end ]
  return (S)
end INSUC
procedure TINORDER(T)
// traverse the threaded binary tree T in inorder; HEAD is the head node //
  HEAD ← T
  loop
    T ← INSUC(T)
    if T = HEAD then return
    print(DATA(T))
  forever
end TINORDER
procedure INSERT_RIGHT(S, T)
// insert node T as the right child of S in a threaded binary tree //
  RCHILD(T) ← RCHILD(S); RBIT(T) ← RBIT(S)
  LCHILD(T) ← S; LBIT(T) ← 0            // LCHILD(T) is a thread to S //
  RCHILD(S) ← T; RBIT(S) ← 1
  if RBIT(T) = 1 then [ W ← INSUC(T)    // S had a right child //
                        LCHILD(W) ← T ]
end INSERT_RIGHT
1. Write out the inorder, preorder, postorder, and level-order traversals for the binary trees of Figure 5.10.
2. Do Exercise 1 for the binary tree of Figure 5.11.
3. Do Exercise 1 for the binary tree of Figure 5.15.
EXERCISES
1. Draw the binary tree of Figure 5.15, showing its threaded representation.
2. Write a function, insertLeft, that inserts a new node, child, as the left child of node parent
in a threaded binary tree. The left child pointer of parent becomes the left child pointer of
child.
3. Write a function that traverses a threaded binary tree in postorder. What are the time and
space requirements of your method?
1. Suppose that each node in a binary search tree also has the field leftSize as described in the text. Write a function to insert a pair into such a binary search tree. The complexity of your function should be O(h), where h is the height of the search tree. Show that this is the case.
2. Any algorithm that merges together two sorted lists of size n and m, respectively, must
make at least n + m – 1 comparisons in the worst case. What implications does this result
have on the time complexity of any comparison- based algorithm that combines two
binary search trees that have n and m pairs, respectively?
Definition:
A binary search tree T is a binary tree; either it is empty, or each node in the tree contains an identifier such that all identifiers in the left subtree of T are less (numerically or alphabetically) than the identifier in the root node of T, all identifiers in the right subtree are greater, and the left and right subtrees of T are also binary search trees.
Function BinarySearch(Keys, X, Low, High, Result): Given a vector Keys (an input parameter) whose elements are in ascending order, this procedure searches the vector for a given element whose value is given by the input parameter X and returns its position in the vector to the calling program in the parameter Result. Low and High are input parameters defining the current search interval. Initially, Low denotes the first subscript of the vector Keys, so that Keys(Low) is the smallest value, and High denotes the last subscript, so that Keys(High) is the largest value. Middle denotes the midpoint of the interval.
1. [Is the desired element absent?] if Low > High
     then write('search is unsuccessful'); Result ← 0; return
   end if
2. [Find the midpoint] Middle ← (Low + High) / 2
3. [Compare] if Keys(Middle) = X
     then Result ← Middle; return
     else if X < Keys(Middle)
       then call BinarySearch(Keys, X, Low, Middle - 1, Result)
       else call BinarySearch(Keys, X, Middle + 1, High, Result)
   end if
// Given the input parameters ROOT and item, as described previously, this function performs a search operation on the tree structure. The node structure (NODE) contains a left pointer (LPTR), an item description (INFO), and a right pointer (RPTR). The function is initially invoked with the head node of a tree and recursively searches the tree for a node whose INFO field matches item //
  if item = INFO(ROOT)
  then return (ROOT)                              // node found //
  else if item < INFO(ROOT)
    then return (TREE_SEARCH(LPTR(ROOT), item))   // search the left subtree //
    else return (TREE_SEARCH(RPTR(ROOT), item))   // search the right subtree //
  end if
FORESTS
To transform a forest into a single binary tree, we first obtain the binary tree representation of each of the trees in the forest and then link these binary trees through the right child field of the root nodes. Using this transformation, the forest of Figure 5.35 becomes the binary tree of the figure below.
Fig. B: Binary tree representation of the forest of Fig. A
Definition: If T1, …, Tn is a forest of trees, then the binary tree corresponding to this forest, denoted by B(T1, …, Tn):
1) is empty if n = 0;
2) has root equal to root(T1); has left subtree equal to B(T11, T12, …, T1m), where T11, …, T1m are the subtrees of root(T1); and has right subtree B(T2, …, Tn).
GRAPHS
Introduction
Graphs have been used in a wide variety of applications. Some of these applications
are: analysis of electrical circuits, finding shortest routes, project planning, identification of
chemical compounds, statistical mechanics, genetics, cybernetics, linguistics, social sciences,
and so on. Indeed, it might well be said that of all mathematical structures, graphs are the
most widely used.
Definitions
Figure 6.2: Three sample graphs: G1 (an undirected graph on vertices 0, 1, 2, 3), G2 (a tree on vertices 0 through 6), and G3 (a directed graph on vertices 0, 1, 2).
V(G1) = {0, 1, 2, 3}; E(G1) = {(0,1), (0,2), (0,3), (1,2), (1,3), (2,3)}.
Notice that the edges of a directed graph are drawn with an arrow from the tail to the head. The graph G2 is a tree; the graphs G1 and G3 are not.
Since we define the edges and vertices of a graph as sets, we impose the following restrictions on graphs:
1) A graph may not have an edge from a vertex, v, back to itself. That is, edges of the
form (v,v) and <v, v> are not legal. Such edges are known as self edges or self loops.
If we permit self edges, we obtain a data object referred to as a graph with self edges.
An example is shown in Figure 6.3(a).
Figure 6.3: (a) a graph with a self edge; (b) a multigraph.
2) A graph may not have multiple occurrences of the same edge. If we remove this restriction, we obtain a data object referred to as a multigraph (see Figure 6.3(b)).
The number of distinct unordered pairs (u, v) with u ≠ v in a graph with n vertices is n(n-1)/2. This is the maximum number of edges in any n-vertex, undirected graph. An n-vertex, undirected graph with exactly n(n-1)/2 edges is said to be complete. The graph G1 of Figure 6.2(a) is the complete graph on four vertices, whereas G2 and G3 are not complete graphs. In the case of a directed graph on n vertices, the maximum number of edges is n(n-1).
If (u, v) is an edge in E(G), then we shall say the vertices u and v are adjacent and that the edge (u, v) is incident on vertices u and v. The vertices adjacent to vertex 1 in G2 are 3, 4, and 0. The edges incident on vertex 2 in G2 are (0,2), (2,5), and (2,6). If <u, v> is a directed edge, then vertex u is adjacent to v, and v is adjacent from u. The edge <u, v> is incident to u and v. In G3, the edges incident to vertex 1 are <0,1>, <1,0>, and <1,2>.
A subgraph of G is a graph G' such that V(G') ⊆ V(G) and E(G') ⊆ E(G). Figure 6.4 shows some of the subgraphs of G1 and G3.
Figure 6.4: Some subgraphs of G1 and G3.
A path from vertex u to vertex v in graph G is a sequence of vertices u, i1, i2, …, ik, v such that (u, i1), (i1, i2), …, (ik, v) are edges in E(G). The length of a path is the number of edges on it. A simple path is a path in which all vertices except possibly the first and last are distinct. A path such as (0,1), (1,3), (3,2) is also written as 0, 1, 3, 2. Paths 0,1,3,2 and 0,1,3,1 of G1 are both of length 3. The first is a simple path; the second is not. 0,1,2 is a simple directed path in G3. 0,1,2,1 is not a path in G3, as the edge <2,1> is not in E(G3).
A cycle is a simple path in which the first and last vertices are the same. 0,1,2,0 is a cycle in G1. 0,1,0 is a cycle in G3. For the case of directed graphs we normally add the prefix "directed" to the terms cycle and path.
In an undirected graph G, two vertices u and v are said to be connected iff there is a path from u to v. An undirected graph is said to be connected iff for every pair of distinct vertices u and v in V(G) there is a path from u to v in G. Graphs G1 and G2 are connected, whereas G4 of Figure 6.5 is not. A connected component (or simply a component), H, of an undirected graph is a maximal connected subgraph. By maximal, we mean that G contains no other subgraph that is both connected and properly contains H. G4 has two components, H1 and H2 (see Figure 6.5).
Figure 6.5: Graph G4, with components H1 (vertices 0-3) and H2 (vertices 4-7).
A tree is a connected acyclic (i.e., has no cycles) graph.
A directed graph G is said to be strongly connected iff for every pair of distinct vertices u and v in V(G), there is a directed path from u to v and also from v to u. The graph G3 is not strongly connected, as there is no path from vertex 2 to 1. A strongly connected component is a maximal subgraph that is strongly connected. G3 has two strongly connected components (see Figure 6.6).
The degree of a vertex is the number of edges incident to that vertex. The degree of
vertex 0 in G1 is 3. If G is a directed graph, we define the in-degree of a vertex v to be the
number of edges for which v is the head. The out-degree is defined to be the number of
edges for which v is the tail. Vertex 1 of G3 has in-degree 1, out-degree 2, and degree 3.
If di is the degree of vertex i in a graph G with n vertices and e edges, then the number of edges is

e = ( ∑ (from i = 0 to n-1) di ) / 2.
ADT Graph is
Objects: a nonempty set of vertices and a set of undirected edges, where each edge is
a pair of vertices.
Functions:
Graph Create() ::= return an empty graph
Graph InsertVertex(graph, v) ::= return a graph with v inserted; v has no incident edges
Graph InsertEdge(graph, v1, v2) ::= return a graph with a new edge between v1 and v2
Graph DeleteVertex(graph, v) ::= return a graph in which v and all edges incident to it are removed
Graph DeleteEdge(graph, v1, v2) ::= return a graph in which the edge (v1, v2) is removed; leave the incident vertices in the graph
Boolean IsEmpty(graph) ::= if (graph == empty graph) return TRUE else return FALSE
List Adjacent(graph, v) ::= return a list of all vertices that are adjacent to v
end Graph
The operations in ADT 6.1 are a basic set in that they allow us to create any arbitrary graph and do some elementary tests. In the later sections of this chapter we shall see functions that traverse a graph (depth first or breadth first search) and that determine if a graph has special properties (connected, biconnected, planar).
The three most commonly used representations are adjacency matrices, adjacency lists, and adjacency multilists.
ELEMENTARY GRAPH OPERATIONS
Given the root node of a binary tree, one of the most common things one wishes to do is to traverse the tree and visit the nodes in some order. Analogously, given an undirected graph G = (V, E) and a vertex v in V(G), we wish to visit all vertices reachable from v. We have two ways of doing this: depth first search and breadth first search.
Depth First Search of an undirected graph proceeds as follows. The start vertex v is
visited. Next an unvisited vertex w adjacent to v is selected and a depth first search from w
initiated. When a vertex u is reached such that all its adjacent vertices have been visited, we
back up to the last vertex visited which has an unvisited vertex w adjacent to it and initiate a
depth first search from w. The search terminates when no unvisited vertex can be reached
from any of the visited ones. This procedure is best described recursively as in
procedure DFS(v)
// Given an undirected graph G = (V, E) with n vertices and an array VISITED(n) initially set to zero, this algorithm visits all vertices reachable from v. G and VISITED are global //
  VISITED(v) ← 1
  for each vertex w adjacent to v do
    if VISITED(w) = 0 then call DFS(w)
  end
end DFS
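The same search in C, as a sketch that assumes an adjacency-list representation (the type and array names are illustrative):

#include <stdio.h>
#define MAX_VERTICES 50

typedef struct node *nodePointer;
struct node { int vertex; nodePointer link; };

nodePointer graph[MAX_VERTICES];   /* adjacency lists */
short int visited[MAX_VERTICES];   /* initially all zero */

void dfs(int v)
{ /* depth first search starting at vertex v */
  nodePointer w;
  visited[v] = 1;
  printf("%d ", v);
  for (w = graph[v]; w; w = w->link)
    if (!visited[w->vertex])
      dfs(w->vertex);
}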
Starting at vertex v and marking it as visited, breadth first search differs from depth first search in that all unvisited vertices adjacent to v are visited next. Then unvisited vertices adjacent to these vertices are visited, and so on. For example, a breadth first search beginning at vertex v1 visits v1 first, then v2 and v3; next vertices v4, v5, v6, and v7 will be visited; and finally v8. Algorithm BFS gives the details.
procedure BFS(v)
// A breadth first search of G is carried out beginning at vertex v. All vertices visited are marked as VISITED(i) = 1. The graph G and array VISITED are global, and VISITED is initially zero. Q is a queue //
  VISITED(v) ← 1
  initialize Q to be empty
  loop
    for each vertex w adjacent to v do
      if VISITED(w) = 0 then [ call ADDQ(w, Q); VISITED(w) ← 1 ]
    end
    if Q is empty then return
    call DELETEQ(v, Q)
  forever
end BFS
BFS: v1,v2,v3,v4,v5,v6,v7,v8
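A C sketch of the same procedure, reusing graph[] and visited[] from the dfs sketch above and inlining a simple array queue:

void bfs(int v)
{ /* breadth first search starting at vertex v */
  nodePointer w;
  int queue[MAX_VERTICES], front = 0, rear = 0;
  visited[v] = 1;
  printf("%d ", v);
  queue[rear++] = v;                  /* addq(v) */
  while (front != rear) {
    v = queue[front++];               /* deleteq() */
    for (w = graph[v]; w; w = w->link)
      if (!visited[w->vertex]) {
        visited[w->vertex] = 1;
        printf("%d ", w->vertex);
        queue[rear++] = w->vertex;    /* addq(w->vertex) */
      }
  }
}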
The cost of a spanning tree of a weighted undirected graph is the sum of the costs (weights) of the edges in the spanning tree. A minimum cost spanning tree is a spanning tree of least cost. Three different algorithms can be used to obtain a minimum cost spanning tree of a connected undirected graph. All three use an algorithm design strategy called the greedy method. We shall refer to the three algorithms as Kruskal's, Prim's, and Sollin's algorithms, respectively.
For spanning trees, we use a least cost criterion. Our solution must satisfy the constraints that we use only edges within the graph, that we use exactly n - 1 edges, and that we may not use edges that would produce a cycle.
The path matrix P tells us whether or not there are paths between the nodes. Now we want to find a matrix Q which will tell us the lengths of the shortest paths between the nodes, or, more exactly, a matrix Q = (wij), where wij = length of a shortest path from vi to vj.
Here we define a sequence of matrices Q0, Q1, …, Qm (analogous to the matrices P1, P2, …, Pm above), where Qk(i, j) is the length of a shortest path from vi to vj using only v1, …, vk as intermediate nodes. The initial matrix Q0 is the same as the weight matrix W except that each 0 in W is replaced by INFINITY. The final matrix Qm will be the desired matrix Q.
Algorithm:
A weighted graph G with m nodes is maintained in memory by its weight matrix W. This algorithm finds a matrix Q such that Q(i, j) is the length of a shortest path from node vi to node vj. INFINITY is a very large number and MIN is the minimum value function.
1. [Initialize Q] Repeat for i, j = 1, 2, …, m:
     if W(i, j) = 0 then Q(i, j) ← INFINITY
     else Q(i, j) ← W(i, j)
   end of loop
2. Repeat steps 3 and 4 for k = 1, 2, …, m:
3.   Repeat step 4 for i = 1, 2, …, m:
4.     Repeat for j = 1, 2, …, m:
         Q(i, j) ← MIN(Q(i, j), Q(i, k) + Q(k, j))
       end of loop
     end of step 3
   end of step 2
5. Exit.
Kruskal’s Algorithm
Kruskal's algorithm builds a minimum cost spanning tree T by adding edges to T one at a time. The algorithm selects the edges for inclusion in T in nondecreasing order of their cost. An edge is added to T if it does not form a cycle with the edges that are already in T. Since G is connected and has n > 0 vertices, exactly n - 1 edges will be selected for inclusion in T.
We can sort the edges in E in O(e log e) time. Actually, it is not necessary to sort the
edges in E as long as we are able to find the next least cost edge quickly. Obviously a min
heap is ideally suited for this task since we can determine and delete the next least cost edge
in O(log e) time. Construction of the heap itself requires O(e) time.
To check that the new edge, (v, w), does not form a cycle in T, and to add such an edge to T, we may use the union-find operations discussed in Section 5.9. This means that we view each connected component in T as a set containing the vertices in that component. Initially, T is empty and each vertex of G is in a different set (see Figure 6.22(b)). Before we add an edge (v, w), we use the find operation to determine if v and w are in the same set. If they are, the two vertices are already connected and adding the edge (v, w) would cause a cycle. For example, when we consider the edge (3,2), the sets would be {0}, {1, 2, 3}, {5}, {6}. Since vertices 3 and 2 are already in the same set, the edge (3,2) is rejected. The next edge examined is (1,5). Since vertices 1 and 5 are in different sets, the edge is accepted. This edge connects the two components {1, 2, 3} and {5}; therefore, we perform a union on these sets to obtain the set {1, 2, 3, 5}.
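A minimal sketch of the find and union operations assumed here (the simple versions, without the weighting and collapsing rules of Section 5.9):

int parent[MAX_VERTICES];   /* parent[i] = -1 when i is a root */

int find(int i)
{ /* return the root of the tree containing element i */
  while (parent[i] >= 0)
    i = parent[i];
  return i;
}

void unionSets(int i, int j)
{ /* i and j are roots of different trees; make i the parent of j */
  parent[j] = i;
}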
Since the union-find operations require less time than choosing and deleting an edge (lines 3 and 4), the latter operations determine the total computing time of Kruskal's algorithm. Thus, the total computing time is O(e log e).
Edge    Weight   Result           Figure
--      --       initial          Figure 6.22(b)
(0,5)   10       added to tree    Figure 6.22(c)
(2,3)   12       added            Figure 6.22(d)
(1,6)   14       added            Figure 6.22(e)
(1,2)   16       added            Figure 6.22(f)
(3,6)   18       discarded
(3,4)   22       added            Figure 6.22(g)
(4,6)   24       discarded
(4,5)   25       added            Figure 6.22(h)
(0,1)   28       not considered
T = {};
while (T contains fewer than n-1 edges && E is not empty) {
  choose a least cost edge (v,w) from E;
  delete (v,w) from E;
  if ((v,w) does not create a cycle in T)
    add (v,w) to T;
  else
    discard (v,w);
}
if (T contains fewer than n-1 edges) printf("no spanning tree\n");
Prim's Algorithm
Prim's algorithm, like Kruskal's, constructs the minimum cost spanning tree one edge at a time. However, at each stage of the algorithm, the set of selected edges forms a tree. By contrast, the set of selected edges in Kruskal's algorithm forms a forest at each stage. Prim's algorithm begins with a tree, T, that contains a single vertex. This may be any of the vertices in the original graph. Next, we add a least cost edge (u, v) to T such that T ∪ {(u, v)} is also a tree. We repeat this edge-addition step until T contains n - 1 edges. To make sure that the added edge does not form a cycle, at each step we choose the edge (u, v) such that exactly one of u or v is in T. Program 6.8 contains a formal description of Prim's algorithm. T is the set of tree edges, and TV is the set of tree vertices, that is, vertices that are currently in the tree. Figure 6.24 shows the progress of Prim's algorithm on the graph of Figure 6.22(a).
T = {}; TV = {0};   /* start with vertex 0 and no edges */
while (T contains fewer than n-1 edges) {
  let (u,v) be a least cost edge such that u ∈ TV and v ∉ TV;
  if (there is no such edge) break;
  add v to TV;
  add (u,v) to T;
}
if (T contains fewer than n-1 edges) printf("no spanning tree\n");

In the implementation, for each vertex v not in TV we keep a companion vertex near(v) in TV such that cost(near(v), v) is minimum over all such choices for near(v). (We assume that cost(v, w) = ∞ if (v, w) ∉ E.) At each stage we select v so that cost(near(v), v) is minimum and v ∉ TV. Using this strategy we can implement Prim's algorithm in O(n²), where n is the number of vertices in G. Asymptotically faster implementations are also possible.
We now consider the general case, in which some or all of the edges of the directed graph G may have negative length. To see that function shortestPath (Program 6.9) does not necessarily give the correct results on such graphs, consider the graph of Figure 6.29. Let v = 0 be the source vertex. Since n = 3, the loop of lines 7 to 14 is iterated just once; u = 2 in line 8, and no changes are made to dist. The function terminates with dist[1] = 7 and dist[2] = 5. However, the shortest path from 0 to 2 is 0, 1, 2. This path has length 2, which is less than the computed value of dist[2].
When negative edge lengths are permitted, we require that the graph have no cycles of negative length. This is necessary to ensure that shortest paths consist of a finite number of edges. For example, consider the graph of Figure 6.30. The length of the shortest path from vertex 0 to vertex 2 is -∞, as the length of the path
0, 1, 0, 1, 0, 1, …, 0, 1, 2
can be made arbitrarily small. This is so because of the presence of the cycle 0, 1, 0, which has a length of -1.
Figure: (a) Digraph for the example; (b) its cost adjacency matrix (diagonal entries 0, blank entries ∞).
When there are no cycles of negative length, there is a shortest path between any two
vertices of an n-vertex graph that has at most n-1 edges on it. To see this, observe that a path
that has more than n-1 edges must repeat at least one vertex and hence must contain a cycle.
Elimination of the cycles from the path results in another path with the same source and
destination. This path is cycle-free and has a length that is no more than that of the original
path, as the length of the eliminated cycles was at least zero. We can use this observation on
the maximum number of edges on a cycle-free shortest path to obtain an algorithm to
determine a shortest path from a source vertex to all remaining vertices in the graph.
As in the case of function shortestPath (Program 6.9), we shall compute only the length, dist[u], of the shortest path from the source vertex v to u. An exercise examines the extension needed to construct the shortest paths.
Figure 6.30: A directed graph with negative-length edges; the cycle 0, 1, 0 has length -1.
Let dist^l[u] be the length of a shortest path from the source vertex v to vertex u under the constraint that the shortest path contains at most l edges. Then dist^1[u] = length[v][u], 0 ≤ u < n. As noted earlier, when there are no cycles of negative length, we can limit our search for shortest paths to paths with at most n-1 edges. Hence, dist^(n-1)[u] is the length of an unrestricted shortest path from v to u.
Our goal then is to compute dist^(n-1)[u] for all u. This can be done using the dynamic programming methodology. First, we make the following observations:
1) If the shortest path from v to u with at most k, k > 1, edges has no more than k-1 edges, then dist^k[u] = dist^(k-1)[u].
2) If the shortest path from v to u with at most k, k > 1, edges has exactly k edges, then it is comprised of a shortest path from v to some vertex j followed by the edge <j, u>. The path from v to j has k-1 edges, and its length is dist^(k-1)[j]. All vertices i such that the edge <i, u> is in the graph are candidates for j. Since we are interested in a shortest path, the i that minimizes dist^(k-1)[i] + length[i][u] is the correct value for j.
These observations give us the recurrence
dist^k[u] = min { dist^(k-1)[u], min over i { dist^(k-1)[i] + length[i][u] } }
This recurrence may be used to compute dist^k from dist^(k-1) for k = 2, 3, …, n-1.
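A sketch of this computation in C (the Bellman-Ford iteration; the array names are illustrative, and ∞ is assumed to be represented by a value large enough that the additions below do not overflow):

#define MAX_VERTICES 50

void bellmanFord(int n, int v, int length[][MAX_VERTICES], int dist[])
{ /* single source, all destinations shortest paths; negative edge
     lengths are allowed, but no negative-length cycles */
  int i, k, u;
  for (i = 0; i < n; i++)
    dist[i] = length[v][i];          /* dist^1 */
  for (k = 2; k <= n - 1; k++)       /* compute dist^k in place */
    for (u = 0; u < n; u++)
      if (u != v)
        for (i = 0; i < n; i++)      /* try each edge <i, u> */
          if (dist[i] + length[i][u] < dist[u])
            dist[u] = dist[i] + length[i][u];
}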
In the all-pairs shortest-path problem we must find the shortest paths between all pairs of vertices, vi and vj, i ≠ j. We could solve this problem using shortestPath with each of the vertices in V(G) as the source. Since G has n vertices and shortestPath has a time complexity of O(n²), the total time required would be O(n³). However, we can obtain a conceptually simpler algorithm that works correctly even if some edges in G have negative weights. (We do require that G has no cycles of negative length.) Although this algorithm still has a computing time of O(n³), it has a smaller constant factor. This new algorithm uses the dynamic programming method.
We represent the graph G by its cost adjacency matrix, with cost[i][j] = 0 for i = j. If the edge <i, j>, i ≠ j, is not in G, we set cost[i][j] to some sufficiently large number using the same restrictions discussed in the single source problem. Let A^k[i][j] be the cost of the shortest path from i to j, using only those intermediate vertices with an index ≤ k. The cost of the shortest path from i to j is A^(n-1)[i][j], as no vertex in G has an index greater than n-1. Further, A^(-1)[i][j] = cost[i][j], since the only i to j paths allowed have no intermediate vertices on them.
The basic idea in the all-pairs algorithm is to begin with the matrix A^(-1) and successively generate the matrices A^0, A^1, A^2, …, A^(n-1). If we have already generated A^(k-1), then we may generate A^k by realizing that for any pair of vertices i, j one of the two rules below applies.
1) The shortest path from i to j going through no vertex with index greater than k does not go through the vertex with index k, and so its cost is A^(k-1)[i][j].
2) The shortest such path does go through vertex k. Such a path consists of a path from i to k followed by one from k to j. Neither of these goes through a vertex with index greater than k-1. Hence, their costs are A^(k-1)[i][k] and A^(k-1)[k][j].
These rules yield the recurrence
A^k[i][j] = min { A^(k-1)[i][j], A^(k-1)[i][k] + A^(k-1)[k][j] }, k ≥ 0
and A^(-1)[i][j] = cost[i][j].
void allCosts(int cost[][MAX_VERTICES], int distance[][MAX_VERTICES], int n)
{ /* compute the shortest distance between every pair of vertices;
     cost is the cost adjacency matrix of a graph with n vertices */
  int i, j, k;
  for (i = 0; i < n; i++)
    for (j = 0; j < n; j++) distance[i][j] = cost[i][j];   /* A^(-1) */
  for (k = 0; k < n; k++)     /* allow k as an intermediate vertex */
    for (i = 0; i < n; i++)
      for (j = 0; j < n; j++)
        if (distance[i][k] + distance[k][j] < distance[i][j])
          distance[i][j] = distance[i][k] + distance[k][j];
}

Analysis of allCosts: This algorithm is especially easy to analyze because the looping is independent of the data in the distance matrix. The total time for allCosts is O(n³).
EXERCISES
1. Let T be a tree with root v. The edges of T are undirected, and each edge in T has a nonnegative length. Write a C function to determine the length of the shortest paths from v to the remaining vertices of T. Your function should have complexity O(n), where n is the number of vertices in T. Show that this is the case.
2. Let G be a directed, acyclic graph with n vertices. Assume that the vertices are numbered 0 through n-1 and that all edges are of the form <i, j>, where i < j. Assume that the graph is available as a set of adjacency lists and that each edge has a length (which may be negative) associated with it. Write a C++ function to determine the length of the shortest paths from vertex 0 to the remaining vertices. The complexity of your algorithm should be O(n+e), where e is the number of edges in the graph. Show that this is the case.
9. Using the directed graph of Figure 6.36, explain why shortestPath will not work properly. What is the shortest path between vertices 0 and 6?
Sorting
Motivation
We use the term list to mean a collection of records, each record having one or more fields. The fields used to distinguish among the records are known as keys.
One way to search for a record with the specified key is to examine the list of records in left-to-right or right-to-left order. Such a search is known as a sequential search.
The data type of each record is element, and each record is assumed to have an integer field key. Program 7.1 gives a sequential search function that examines the records in the list a[1:n] in left-to-right order.
int seqSearch(element a[], int k, int n)
{ /* return the least i such that a[i].key == k, or 0 if k is absent */
  int i;
  for (i = 1; i <= n && a[i].key != k; i++) ;
  if (i > n) return 0;
  return i;
}
When all keys are distinct and a[i] is being searched for, i key comparisons are made. So the average number of comparisons for a successful search is

( ∑ (from i = 1 to n) i ) / n = (n+1)/2.
It is possible to do much better than this when looking up phone numbers. The fact that
the entries in the list (i.e., the telephone directory) are in lexicographic order (on the name
key) enables one to look up a number while examining only a very few entries in the list.
Binary search (see Chapter 1) is one of the better-known methods for searching an ordered,
sequential list. A binary search takes only O(logn) time to search a list with n records. This is
considerably better than the O(n) time required by a sequential search.
Sorting has two important uses: (1) as an aid in searching and (2) as a means for matching
entries in lists. Sorting also finds application in the solution of many other more complex
problems from areas such as optimization, graph theory and job scheduling. Consequently,
the problem of sorting has great relevance in the study of computing. Unfortunately, no one
sorting method is the best for all applications. We shall therefore study several methods,
indicating when one is superior to the others.
We characterize sorting methods into two broad categories: (1) internal methods (i.e.,
methods to be used when the list to be sorted is small enough so that the entire sort can be
carried out in main memory) and (2) external methods (i.e. methods to be used on larger
lists). The following internal sorting methods will be developed: insertion sort, quick sort,
merge sort, heap sort, and radix sort.
In the recursive formulation we divide the list to be sorted into two roughly equal parts called the left and the right sublists. These sublists are sorted recursively, and the sorted sublists are merged.
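A sketch of this recursive formulation in C (merge() here is a simple array merge; the textbook's own versions appear later in this section):

#include <string.h>

void merge(int a[], int left, int mid, int right)
{ /* merge the sorted runs a[left:mid] and a[mid+1:right] */
  int temp[right - left + 1];
  int i = left, j = mid + 1, k = 0;
  while (i <= mid && j <= right)
    temp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
  while (i <= mid) temp[k++] = a[i++];
  while (j <= right) temp[k++] = a[j++];
  memcpy(&a[left], temp, k * sizeof(int));
}

void mergeSort(int a[], int left, int right)
{ /* sort a[left:right] by sorting two halves and merging them */
  if (left < right) {
    int mid = (left + right) / 2;   /* divide into two sublists */
    mergeSort(a, left, mid);
    mergeSort(a, mid + 1, right);
    merge(a, left, mid, right);
  }
}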
procedure INSERTION(A, N)
// insert A(K) into the sorted list A(1:K-1), for K = 2, …, N //
  repeat for K = 2, 3, …, N: TEMP ← A(K); PTR ← K - 1
    repeat while PTR > 0 and A(PTR) > TEMP: A(PTR+1) ← A(PTR); PTR ← PTR - 1
    A(PTR + 1) ← TEMP
  return
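The same insertion sort in C, as a brief sketch:

void insertionSort(int a[], int n)
{ /* sort a[0:n-1] in nondecreasing order */
  int i, k, temp;
  for (k = 1; k < n; k++) {
    temp = a[k];
    for (i = k - 1; i >= 0 && a[i] > temp; i--)
      a[i + 1] = a[i];    /* shift larger elements one place right */
    a[i + 1] = temp;      /* insert the saved element */
  }
}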
Quick sort:
Let A be a list of n data items. Sorting A refers to the operation of rearranging the elements of A so that they are in some logical order. Quick sort is an algorithm of the divide-and-conquer type. For example, successive steps in partitioning the list below look like this:

44, 33, 11, 55, 77, 90, 40, 60, 99, 22, 88, 66
22, 33, 11, 55, 77, 90, 40, 60, 99, 44, 88, 66
22, 33, 11, 44, 77, 90, 40, 60, 99, 55, 88, 66
22, 33, 11, 40, 77, 90, 44, 60, 99, 55, 88, 66

and so on. In quick sort the division into two subfiles is made so that the sorted subfiles do not need to be later merged.
procedure QSORT(m, n)
// sort records R_m, …, R_n into nondecreasing order on key K. Key K_m is arbitrarily chosen as the control key. Pointers i and j are used to partition the subfile so that at any time K_l ≤ K for l < i, and K_l ≥ K for l > j. It is assumed that K_m ≤ K_(n+1) //
  if m < n
  then [ i ← m; j ← n + 1; K ← K_m
         loop
           repeat i ← i + 1 until K_i ≥ K
           repeat j ← j - 1 until K_j ≤ K
           if i < j
           then call INTERCHANGE(R_i, R_j)
           else exit
         forever
         call INTERCHANGE(R_m, R_j)
         call QSORT(m, j - 1)
         call QSORT(j + 1, n) ]
end QSORT
Before looking at the merge sort algorithm that sorts records, let us see how one may merge two files (X_1, …, X_m) and (X_(m+1), …, X_n) that are already sorted to get a third file (Z_1, …, Z_n) that is also sorted. Since this merging scheme is very simple, we directly present the algorithm.
procedure MERGE(X, l, m, n, Z)
// (X_l, …, X_m) and (X_(m+1), …, X_n) are two sorted files with keys X_l ≤ … ≤ X_m and X_(m+1) ≤ … ≤ X_n. They are merged to obtain the sorted file (Z_l, …, Z_n) such that Z_l ≤ … ≤ Z_n //
  i ← l; j ← m + 1; k ← l - 1
  while i ≤ m and j ≤ n do
    k ← k + 1
    if X_i ≤ X_j then [ Z_k ← X_i; i ← i + 1 ]
    else [ Z_k ← X_j; j ← j + 1 ]
  end
  copy the remaining file, (X_i, …, X_m) or (X_j, …, X_n), into (Z_(k+1), …, Z_n)
end MERGE
Example 1:
Set 1: 6, 8, 10
Set 2: 3, 7, 20
Merged: 3, 6, 7, 8, 10, 20

Example 2 (merge sort passes):
26, 5, 77, 1
5, 26 | 1, 77
1, 5, 26, 77
Example: The input list (26, 5, 77, 1, 61, 11, 59, 15, 48, 19) is to be sorted using the recursive formulation of merge sort. If the sublist from left to right is currently to be sorted, then its two sublists are indexed from left to ⌊(left + right)/2⌋ and from ⌊(left + right)/2⌋ + 1 to right. The sublist partitioning that takes place is described by the binary tree of Figure 7.5. Note that the sublists being merged are different from those being merged in mergeSort.
Trace of the sublist merges (Figure 7.5):
26 5 77 1 61 | 11 59 15 48 19
5 26 | 11 59
5 26 77 1 61 | 11 15 59 19 48
1 5 26 61 77 | 11 15 19 48 59
1 5 11 15 19 26 48 59 61 77
To eliminate the record copying that takes place when merge (Program 7.7) is used to merge sorted sublists, we associate an integer pointer with each record. For this purpose, we employ an integer array link[1:n] such that link[i] gives the record that follows record i in the sorted sublist. In case link[i] = 0, there is no next record. With the addition of this array of links, record copying is replaced by link changes, and the runtime of our sort function becomes independent of the size s of a record. Also, the additional space required is O(n). By comparison, the iterative merge sort described earlier takes O(s·n·log n) time and O(s·n) additional space. On the down side, the use of an array of links yields a sorted chain of records, and we must have a follow-up process to physically rearrange the records into the sorted order dictated by the final chain. We describe the algorithm for this physical rearrangement in a later section.
We assume that initially link[i] = 0, 1 ≤ i ≤ n. Thus, each record is initially in a chain containing only itself. Let start1 and start2 be pointers to two chains of records. The records on each chain are in nondecreasing order. Let listMerge(a, link, start1, start2) be a function that merges the two chains and returns the first position of the resulting chain, which is linked in nondecreasing order of key values. The recursive version of merge sort is given by function rmergeSort (Program 7.10); to sort a[1:n], it is invoked as rmergeSort(a, link, 1, n). The start of the chain, ordered as described earlier, is returned. Function listMerge is given in Program 7.11.
Analysis of rmergeSort: It is easy to see that recursive merge sort is stable, and its computing time is O(n log n).
EXERCISES
1. Write the status of the list (12, 2, 16, 30, 8, 28, 4, 10, 20, 6, 18) at the end of each phase of mergeSort (Program 7.9).
2. Write an iterative natural merge sort function using arrays as in function mergeSort.
How much time does this function take on an initially sorted list? Note that mergeSort
takes O(n log n) on such an input list. What is the worst-case computing time of the
new function? How much additional space is needed?
HEAPSORT
Merge sort has a computing time of O(n log n), both in the worst case and as average behavior, but it requires additional storage proportional to the number of records to be sorted. Heap sort requires only a fixed amount of additional storage and at the same time has O(n log n) worst-case and average computing time. However, heap sort is slightly slower than merge sort.
The deletion and insertion functions associated with max heaps directly yield an O(n log n) sorting method. The n records are first inserted into an initially empty max heap. Next, the records are extracted from the max heap one at a time. It is possible to create the max heap of n records faster than by inserting the records one by one into an initially empty heap. For this, we use the function adjust (Program 7.12), which starts with a binary tree whose left and right subtrees are max heaps and rearranges records so that the entire binary tree is a max heap. The binary tree is embedded within an array using the standard mapping. If the depth of the tree is d, then the for loop is executed at most d times. Hence the computing time of adjust is O(d).
To sort the list, first we create a max heap by using adjust repeatedly, as in the first for loop of function heapSort (Program 7.13). Next, we swap the first and last records in the heap. Since the first record has the maximum key, the swap moves the record with maximum key into its correct position in the sorted array. We then decrement the heap size and readjust the heap. This swap, decrement-heap-size, readjust-heap process is called a pass. For example, on the first pass, we place the record with the highest key in the nth position; on the second pass, we place the record with the second highest key in position n-1; and on the ith pass, we place the record with the ith highest key in position n-i+1.
Example 7.7: The input list is (26, 5, 77, 1, 61, 11, 59, 15, 48, 19). If we interpret this list as a binary tree, we get the tree of Figure 7.7(a). Figure 7.7(b) depicts the max heap
void adjust(element a[], int root, int n)
{ /* adjust the binary tree a[root:n]; its left and right subtrees already satisfy the heap property */
  int child = 2 * root, rootkey = a[root].key;
  element temp = a[root];
  while (child <= n) {
    if (child < n && a[child].key < a[child+1].key) child++;   /* max child */
    if (rootkey > a[child].key) break;           /* the root belongs here */
    else { a[child/2] = a[child]; child *= 2; }  /* move child up, go down a level */
  }
  a[child/2] = temp;
}
after the first for loop of heapSort. Figure 7.8 shows the array of records following each of the first seven iterations of the second for loop. The portion of the array that still represents a max heap is shown as a binary tree; the sorted part of the array is shown as an array.
Analysis of heapSort: Suppose 2^(k-1) ≤ n < 2^k, so the tree has k levels and the number of nodes on level i is ≤ 2^(i-1). In the first for loop, adjust (Program 7.12) is called once for each node that has a child. Hence, the time required for this loop is the sum, over each level, of the number of nodes on a level multiplied by the maximum distance the node can move. This is no more than

∑ (from i = 1 to k) 2^(i-1) (k - i) = ∑ (from i = 1 to k-1) 2^(k-i-1) i ≤ n ∑ (from i = 1 to k-1) i/2^i < 2n = O(n).
In the next for loop, n-1 applications of adjust are made with maximum tree depth k = ⌈log₂(n+1)⌉, and SWAP is invoked n-1 times. Hence, the computing time for this loop is O(n log n). Consequently, the total computing time is O(n log n).
void heapSort(element a[], int n)
{ /* perform a heap sort on a[1:n] */
  int i; element temp;
  for (i = n/2; i > 0; i--) adjust(a, i, n);   /* convert a[1:n] into a max heap */
  for (i = n-1; i > 0; i--) {                  /* n-1 passes */
    SWAP(a[1], a[i+1], temp);   /* move the maximum to the end of the heap */
    adjust(a, 1, i);            /* readjust the remaining records */
  }
}
Figure: Array interpreted as a binary tree
Figure : Heap sort example
EXERCISES
1. Write the status of the list (12, 2, 16, 30, 8, 28, 4, 10, 20, 6, 18) at the end of the first
for loop as well as at the end of each iteration of the second for loop of heap sort
(Program 7.13).
2. Heap sort is unstable. Give an example of an input list in which the order of records
with equal keys is not preserved.