UNIT-I
Algorithm:
An Algorithm is a finite sequence of instructions, each of which has a clear
meaning and can be
performed with a finite amount of effort in a finite length of time. No matter
what the input values may
be, an algorithm terminates after executing a finite number of instructions. In
addition, every algorithm
must satisfy the following criteria:
 Input: there are zero or more quantities, which are externally supplied;
 Output: at least one quantity is produced
 Definiteness: each instruction must be clear and unambiguous;
 Finiteness: if we trace out the instructions of an algorithm, then for all cases
the algorithm willterminate after a finite number of steps;
 Effectiveness: every instruction must be sufficiently basic that it can in
principle be carried out by a person using only pencil and paper. It is not enough
that each operation be definite, but it must also be feasible.
In formal computer science, one distinguishes between an algorithm, and a
program. A program does notnecessarily satisfy the fourth condition. One
important example of such a program for a computer is itsoperating system,
which never terminates (except for system crashes) but continues in a wait loop
untilmore jobs are entered.
We represent algorithm using a pseudo language that is a combination of the
constructs of a programming language together with informal English
statements
                                 UNIT-I
Pseudo code for expressing algorithms:
Algorithm Specification: Algorithm can be described in three ways.
1. Natural language like English: When this way is choused care should be
taken, we shouldensure that each & every statement is definite.
2. Graphic representation called flowchart: This method will work well when
the algorithmis small simple.
3. Pseudo-code Method: In this method, we should typically describe algorithms
as program,which resembles language like Pascal &Algol.
Pseudo-Code Conventions:
1. Comments begin with // and continue until the end ofline.
2. Blocks are indicated with matching braces{and}.
3. An identifier begins with a letter. The data types of variables are not explicitly
declared.
4. Compound data types can be formed with records. Here is anexample,
Node.Record
{
data type – 1data-1;
.
.
.
data type – n data – n;
node * link;
}
Here link is a pointer to the record type node. Individual data items of a record
can
be accessed with  and period.
5. Assignment of values to variables is done using the assignmentstatement.
<Variable>:= <expression>;
                                UNIT-I
6. There are two Boolean values TRUE andFALSE.
 LogicalOperators AND, OR,NOT
Relational Operators <, <=,>,>=, =,!=
7. The following looping statements areemployed.
For, while and repeat-until
While Loop:
While < condition > do
{
<statement-1>
.
.
.
<statement-n>
}
For Loop:
For variable: = value-1 to value-2 step step do
{
<statement-1>
.
.
.
<statement-n>
}
repeat-until:
repeat
<statement-1>
.
                                UNIT-I
.
.
<statement-n>
until<condition>
8. A conditional statement has the followingforms.
 If <condition> then <statement>
 If <condition> then <statement-1>
Else <statement-1>
Case statement:
Case
{
}
: <condition-1>:<statement-1>
.
.
.
: <condition-n>:<statement-n>
: else: <statement-n+1>
9. Input and output are done using the instructions read &write.
10. There is only one type of procedure:
Algorithm, the heading takes the form,
Algorithm <Name> (<Parameter lists>)
 As an example, the following algorithm fields & returns the maximum of „n‟
given
numbers:
1. Algorithmic(A,n)
2. // A is an array of size n
3. {
                                   UNIT-I
4. Result :=A[1];
5. for I:= 2 to ndo
6. if A[I] > Result then
7. Result:=A[I];
8. return Result;
9. }
In this algorithm (named Max), A & n are procedure parameters. Result & I are
Local
variables.
Algorithm:
1. Algorithm selection sort(a,n)
2. // Sort the array a[1:n] into non-decreasing
order.3.{
4. for I:=1 to ndo
5. {
6. j:=I;
4
7. for k:=i+1 to ndo
8. if (a[k]<a[j])
9. t:=a[I];
10. a[I]:=a[j];
11. a[j]:=t;
12.        }
13. }
                                 UNIT-I
Performance Analysis:
The performance of a program is the amount of computer memory and time
needed to run a program. We use two approaches to determine the performance
of a program. One is analytical, and the other experimental. In performance
analysis we use analytical methods, while in performance measurement we
conduct experiments.
Time Complexity:
The time needed by an algorithm expressed as a function of the size of a
problem is called the time complexity of the algorithm. The time complexity of
a program is the amount of computer time it needs to run to completion.
The limiting behaviour of the complexity as size increases is called the
asymptotic time complexity. It is the asymptotic complexity of an algorithm,
which ultimately determines the size of problems that can be solved by the
algorithm.
The Running time of a program
When solving a problem we are faced with a choice among algorithms. The
basis for this
can be any one of the following:
i. We would like an algorithm that is easy to understand codeand debug.
ii. We would like an algorithm that makes efficient use of the computer‟s
resources, especially, one that runs as fast aspossible.
Measuring the running time of a program
The running time of a program depends on factors such as:
1. The input to the program.
2. The quality of code generated by the compiler used to create the object
program.
3. The nature and speed of the instructions on the machine used to execute the
program,
                                UNIT-I
4. The time complexity of the algorithm underlying the program
The total time will be 2n+3
Space Complexity:
The space complexity of a program is the amount of memory it needs to run to
completion. The space need by a program has the following components:
Instruction space: Instruction space is the space needed to store the compiled
version of the program instructions.
Data space: Data space is the space needed to store all constant and variable
values. Data space has two components:
 Space needed by constants and simple variables in program.
 Space needed by dynamically allocated objects such as arrays and class
instances.
Environment stack space: The environment stack is used to save information
needed to resume execution of partially completed functions.
Instruction Space: The amount of instructions space that is needed depends on
factors such as:
 The compiler used to complete the program into machine code.
 The compiler options in effect at the time of compilation
 The target computer.
The space requirement s(p) of any algorithm p may therefore be written as,
                                    UNIT-I
S(P) = c+ Sp(Instance characteristics)
Where „c‟ is a constant.
Example 2:
Algorithm sum(a,n)
{
s=0.0;
for I=1 to n do
s= s+a[I];
return s;
}
 The problem instances for this algorithm are characterized by n,the number
of
elements to be summed. The space needed d by „n‟ is one word, since it is of
type
integer.
 The space needed by „a‟a is the space needed by variables of tyepe array
offloating
point numbers.
 This is atleast „n‟ words, since „a‟ must be large enough to hold the „n‟
elements tobe
summed.
 So,we obtain Ssum(n)>=(n+s)
[ n for a[],one each for n,I a&s]
Complexity of Algorithms
The complexity of an algorithm M is the function f(n) which gives the running
time
and/or storage space requirement of the algorithm in terms of the size „n‟ of the
input
data. Mostly, the storage space required by an algorithm is simply a multiple of
the data
                                  UNIT-I
size „n‟. Complexity shall refer to the running time ofthealgorithm.
The function f(n), gives the running time of an algorithm, depends not only on
the
size „n‟ of the input data but also on the particular data. The complexity
function f(n) for
certain casesare:
1. Best Case : The minimum possible value of f(n) is called the bestcase.
2. Average Case : The expected value off(n).
3. Worst Case : The maximum value of f(n) for any key possibleinput.
Asymptotic Notations:
The following notations are commonly use notations in performance analysis
and
used to characterize the complexity of an algorithm:
1. Big–OH(O)
2. Big–OMEGA(Ω),
3. Big–THETA (Θ)and
4. Little–OH(o)
Big–OH O (Upper Bound)
f(n) = O(g(n)), (pronounced order of or big oh), says that the growth rate of f(n)
is less
than or equal (<) that of g(n).
                                UNIT-I
Big–OMEGA Ω (Lower Bound)
f(n) = Ω (g(n)) (pronounced omega), says that the growth rate of f(n) is greater
than or
equal to (>) that of g(n)
Big–THETA Θ (Same order)
f(n) = Θ (g(n)) (pronounced theta), says that the growth rate of f(n) equals (=)
the
growth rate of g(n) [if f(n) = O(g(n)) and T(n) = Θ (g(n)]
                                UNIT-I
little-o notation
Definition: A theoretical measure of the execution of an algorithm, usually the
time or memory needed,
given the problem size n, which is usually the number of items. Informally,
saying some equation f(n) =
o(g(n)) means f(n) becomes insignificant relative to g(n) as n approaches
infinity. The notation is read, "f
of n is little oh of g of n".
Formal Definition: f(n) = o(g(n)) means for all c > 0 there exists some k > 0
such that 0 ≤ f(n) < cg(n) for
all n ≥ k. The value of k must not depend on n, but may depend on c.
Different time complexities
Suppose „M‟ is an algorithm, and suppose „n‟ is the size of the input data.
Clearly
the complexity f(n) of M increases as n increases. It is usually the rate of
increase of
f(n) we want to examine. This is usually done by comparing f(n) with some
standard
                                 UNIT-I
functions. The most common computing times are:
O(1), O(log2 n), O(n), O(n. log2 n), O(n2), O(n3), O(2n), n! and nn
Classification of Algorithms
If „n‟ is the number of data items to be processed or degree of polynomial or
the size of
the file to be sorted or searched or the number of nodes in a graph etc.
1 Next instructions of most programs are executed once or at most only a few
times. If all the instructions of a program have this property, we say that its
running time is aconstant.
Logn           When the running time of a program is logarithmic, the program
gets
slightly slower as n grows. This running time commonly occurs in
programs that solve a big problem by transforming it into a smaller
problem, cutting the size by some constant fraction., When n is a million,
log n is a doubled. Whenever n doubles, log n increases by a constant, but
log n does not double until n increases ton2
.
N               When the runningtime of a program is linear, it is generally the
case that asmall amount of processing is done on each input element. This is the
optimal situation for an algorithm that must process n inputs.
n log n     This running time arises for algorithms that solve a problem by
breaking itup into smaller sub-problems, solving then independently, and then
combining the solutions. When n doubles, the running time more than
doubles.
n2         When the running time of an algorithm is quadratic, it is practical for
use
                                 UNIT-I
only on relatively small problems. Quadratic running times typically arise
in algorithms that process all pairs of data items (perhaps in a double nested
loop) whenever n doubles, the running time increases fourfold.
n3         Similarly, an algorithm that process triples of data items (perhaps in a
triple–nested loop) has a cubic running time and is practical for use only on
small problems. Whenever n doubles, the running time increases eightfold.
2n             Few algorithms with exponential running time are likely to be
appropriatefor practical use, such algorithms arise naturally as “brute–force”
solutionsto problems. Whenever n doubles, the runningtimesquares.
Randomized algorithms:
An algorithm that uses random numbers to decide what to do next anywhere in
its logic is calledRandomized Algorithm. For example, in Randomized Quick
Sort, we use random number to pick the nextpivot (or we randomly shuffle the
array). Quicksort is a familiar, commonly used algorithm in whichrandomness
can be useful. Any deterministic version of this algorithm requires O(n2) time
tosort n numbers for some well-defined class of degenerate inputs (such as an
                                 UNIT-I
already sorted array), with thespecific class of inputs that generate this behavior
defined by the protocol for pivot selection. However, ifthe algorithm selects
pivot elements uniformly at random, it has a provably high probability of
finishingin O(n log n) time regardless of the characteristics of the input.
Typically, this randomness is used toreduce time complexity or space
complexity in other standardalgorithm.
UNIT-I
UNIT-I
                               UNIT-I
Sets and Disjoint Set Union:
                                  UNIT-I
Disjoint Set Union: Considering a set S={1,2,3…10} (when n=10), then
elements
can be partitioned into three disjoint sets s1={1,7,8,9},s2={2,5,10} and
s3={3,4,6}. Possible tree representations are:
In this representation each set is represented as a tree. Nodes are linked from the
child to parent rather than usual method of linking from parent to child.
The operations on these sets are:
1. Disjoint setunion
2. Find(i)
3. MinOperation
4. Delete
5. Intersect
1. Disjoint Set union:
If Si and Sj are two disjoint sets, then their union Si U Sj = all the elements x
such that x is in Si or Sj. Thus S1 U S2 ={1,7,8,9,2,5,10}.
2. Find(i):
Given the element I, find the set containing i. Thus, 4 is in set S3, 9 is in S1.
UNION operation:
Union(i,j) requires two tree with roots i and j be joined. S1 U S2 is
obtained by making any one of the sets as sub tree of other.
                                   UNIT-I
Simple Algorithm for Union:
Algorithm Union(i,j)
{
//replace the disjoint sets with roots i and j, I not equal to j by their
union Integer i,j;
P[j] :=i;
}
Example:
Implement following sequence of operations Union(1,3),Union(2,5),Union(1,2)
Solution:
Initially parent array contains zeros.
                                   UNIT-I
Process the following sequence of union
operations Union(1,2),Union(2,3)Union(n-1,n)
Degenerate Tree:
The time taken for n-1 unions is O(n).
Find(i) operation: determines the root of the tree containing
element i. Simple Algorithm for Find:
Algorithm Find(i)
{
j:=i;
while(p[j]>0) do
j:=p[j]; return j;
}
Find Operation: Find(i) implies that it finds the root node of ith node, in other
words it
returns the name of the set i.
Example: Consider the Union(1,3)
Find(1)=0
Find(3)=1, since its parent is 1. (i.e, root is 1)
                                UNIT-I
Weighting Rule for Union(i,j)
                               UNIT-I
Spanning Trees:
A spanning tree is a subset of GraphG, which has all the vertices covered with
minimum possible number
of edges. Hence, a spanning tree does not have cycles and it cannot be
disconnected..
                               UNIT-I
By this definition, we can draw a conclusion that every connected and
undirected Graph G has at
least one spanning tree. A disconnected graph does not have any spanning tree,
as it cannot be
spanned to all its vertices
We found three spanning trees off one complete graph. A complete undirected
graph can have maximum n n-2 number of spanning trees, where n is the
number of nodes. In the above addressed
example, n is 3, hence 3
3−2 = 3spanning trees are possible.
General Properties of Spanning Tree
We now understand that one graph can have more than one spanning tree.
Following are a few
properties of the spanning tree connected to graph G −
 A connected graph G can have more than one spanningtree.
 All possible spanning trees of graph G, have the same number of edges and
vertices.
 The spanning tree does not have any cycle(loops).
 Removing one edge from the spanning tree will make the graph
disconnected, i.e. the spanning tree is minimallyconnected.
                                     UNIT-I
 Adding one edge to the spanning tree will create a circuit or loop, i.e. the
spanning tree is maximally
acyclic.
Mathematical Properties of Spanning Tree
 Spanning tree has n-1 edges, where n is the number of nodes(vertices).
 From a complete graph, by removing maximum e - n + 1 edges, we can
construct a spanningtree.
 A complete graph can have maximum n
n-2 number of spanningtrees.
Thus, we can conclude that spanning trees are a subset of connected Graph G
and disconnected
graphs do not have spanning tree.
Application of Spanning Tree
Spanning tree is basically used to find a minimum path to connect all nodes in a
graph. Common
application of spanning trees are −
 Civil Network Planning
 Computer Network Routing Protocol
ClusterAnalysis
GRAPHS:
A Graph is a non-linear data structure consisting of vertices and edges. The vertices are
sometimes also referred to as nodes and the edges are lines or arcs that connect any two
nodes in the graph. More formally a Graph is composed of a set of vertices( V ) and a set of
edges( E ). The graph is denoted by G(V, E).
Representations of Graph
Here are the two most common ways to represent a graph : For simplicity, we are going to
consider only unweighted graphs in this post.
1. Adjacency Matrix
2. Adjacency List
                                      UNIT-I
Adjacency Matrix Representation
An adjacency matrix is a way of representing a graph as a matrix of boolean (0's and 1's)
Let's assume there are n vertices in the graph So, create a 2D matrix adjMat[n][n] having
dimension n x n.
   If there is an edge from vertex i to j, mark adjMat[i][j] as 1.
   If there is no edge from vertex i to j, mark adjMat[i][j] as 0.
Representation of Undirected Graph as Adjacency Matrix:
The below figure shows an undirected graph. Initially, the entire Matrix is initialized to 0.
If there is an edge from source to destination, we insert 1 to both cases (adjMat[source]
[destination] and adjMat[destination][source]) because we can go either way.
Adjacency Matrix is a square matrix used to represent a finite graph by storing the
relationships between the nodes in their respective cells. For a graph with V vertices, the
adjacency matrix A is an V X V matrix or 2D array.
1. Adjacency Matrix for Undirected and Unweighted graph:
Consider an Undirected and Unweighted graph G with 4 vertices and 3 edges. For the
graph G, the adjacency matrix would look like:
                                      UNIT-I
Here's how to interpret the matrix:
 A[i][j] = 1, there is an edge between vertex i and vertex j.
 A[i][j] = 0, there is NO edge between vertex i and vertex j.
Adjacency Matrix for Directed and Unweighted graph:
Consider an Directed and Unweighted graph G with 4 vertices and 4 edges. For the graph
G, the adjacency matrix would look like:
Adjacency List Representation
An array of Lists is used to store edges between two vertices. The size of array is equal to
the number of vertices (i.e, n). Each index in this array represents a specific vertex in the
graph. The entry at the index i of the array contains a linked list containing the vertices that
are adjacent to vertex i. Let's assume there are n vertices in the graph So, create an array of
list of size n as adjList[n].
   adjList[0] will have all the nodes which are connected (neighbour) to vertex 0.
   adjList[1] will have all the nodes which are connected (neighbour) to vertex 1 and so
    on.
                                       UNIT-I
Representation of Undirected Graph as Adjacency list:
The below undirected graph has 3 vertices. So, an array of list will be created of size 3,
where each indices represent the vertices. Now, vertex 0 has two neighbours (i.e, 1 and 2).
So, insert vertex 1 and 2 at indices 0 of array. Similarly, For vertex 1, it has two neighbour
(i.e, 2 and 0) So, insert vertices 2 and 0 at indices 1 of array. Similarly, for vertex 2, insert
its neighbours in array of list.