13C- DESIGN AND ANALYSIS OF ALGORITHMS
Unit I
Introduction: Algorithm specification - performance analysis: Space Complexity - Time
complexity - Asymptotic Notation - Practical Complexities - Performance Measurement.
Elementary Data Structures: Stacks and Queues, Trees, Dictionaries, Sets and disjoint Set,
Union, Graphs.
Definition:
Algorithm is a step-by-step procedure, which defines a set of instructions to be executed in a
certain order to get the desired output.
Characteristics of an Algorithm
An algorithm should have the following characteristics −
Input − An algorithm should have 0 or more well-defined inputs.
Output − An algorithm should have 1 or more well-defined outputs, and should match the
desired output.
Definiteness: each instruction must be clear and unambiguous;
1
Finiteness: if we trace out the instructions of an algorithm, then for all cases the algorithm
will terminate after a finite number of steps;
Effectiveness: every instruction must be sufficiently basic that it can in principle be
carried out by a person using only pencil and paper. It is not enough that each operation
be definite, but it must also be feasible.
4 Distinct areas of study of algorithms:
1. How to devise algorithms.
Techniques – Divide & Conquer, Branch and Bound , Dynamic Programming
2. How to validate algorithms.
Check for Algorithm that it computes the correct answer for all possible legal inputs.
3. How to analyze algorithms.
Analysis of Algorithms or performance analysis refer to the task of determining how
much computing time & storage an algorithm requires
4. How to test a program : 2 phases
Debugging - the process of executing programs on sample data sets to determine whether
faulty results occur and, if so, to correct them.
Profiling or performance measurement is the process of executing a correct program on
data sets and measuring the time and space it takes to compute the results
Algorithm Specification:
Algorithm can be described in three ways:
1. Natural language like English:
2. Graphic representation called flowchart: This method will work well when the algorithm is
small& simple.
3. Pseudo-code Method: In this method, we should typically describe algorithms as program,
Pseudo-Code Conventions:
1. Comments begin with // and continue until the end of line.
2. Blocks are indicated with matching braces {and}.
3. An identifier begins with a letter. The data types of variables are not explicitly declared.
4. Compound data types can be formed with records. Here is an example,
Node. Record
{
2
data type – 1 data-1;
.
.
. data
type – n data – n;
node * link;
}
Here link is a pointer to the record type node. Individual data items of a record can be accessed
with and period.
5. Assignment of values to variables is done using the assignment statement.
<Variable>:= <expression>;
6. There are two Boolean values TRUE and FALSE
7. The following looping statements are employed. For, while and repeat until While Loop:
8. A conditional statement has the following forms.
If then If then Else
9. Input and output are done using the instructions read & write.
10. There is only one type of procedure:
Algorithm, the heading takes the form, Algorithm Name (Parameter lists)
Example: The following algorithm fields & returns the maximum of ‘n’ given numbers:
1. algorithm Max(A,n)
2. // A is an array of size n
3. {
4. Result := A[1];
5. for I:= 2 to n do
6. if A[I] > Result then
7. Result :=A[I];
8. return Result;
9. }
In this algorithm (named Max), A & n are procedure parameters. Result & I are Local variables.
3
Algorithm –Performance Analysis
Efficiency of an algorithm can be analyzed at two different stages, before implementation
and after implementation. They are the following −
A Priori Analysis − This is a theoretical analysis of an algorithm. Efficiency of an
algorithm is measured by assuming that all other factors, for example, processor speed,
are constant and have no effect on the implementation.
A Posterior Analysis − This is an empirical analysis of an algorithm. The selected
algorithm is implemented using programming language. This is then executed on target
computer machine. In this analysis, actual statistics like running time and space required,
are collected.
Algorithm Complexity
Suppose X is an algorithm and n is the size of input data, the time and space used by the
algorithm X are the two main factors, which decide the efficiency of X.
Time Factor − Time is measured by counting the number of key operations such as
comparisons in the sorting algorithm.
Space Factor − Space is measured by counting the maximum memory space required by
the algorithm.
The complexity of an algorithm f(n) gives the running time and/or the storage space required by
the algorithm in terms of n as the size of input data.
4
1. Space Complexity
The amount of memory space required by the algorithm in its life cycle.
The space required by an algorithm is equal to the sum of the following two components :
A fixed part that is a space required to store certain data and variables, that are
independent of the size of the problem. For example, simple variables and constants used,
program size, etc.
A variable part is a space required by variables, whose size depends on the size of the
problem. For example, dynamic memory allocation, recursion stack space, etc.
Space complexity S(P) of any algorithm P is S(P) = C + SP(I), where C is the fixed part and S(I)
is the variable part of the algorithm, which depends on instance characteristic .
2.Time Complexity
The amount of time required by the algorithm to run to completion.
Time requirements can be defined as a numerical function T(n), where T(n) can be
measured as the number of steps, provided each step consumes constant time.
5
Asymptotic analysis:
Expressing the complexity in term of its relationship to know function.
Asymptotic Notations:
Following are the commonly used asymptotic notations to calculate the running time
complexity of an algorithm.
Ο Notation
Ω Notation
θ Notation
Big ‘oh’: The formal way to express the upper bound of an algorithm's running time.
The function f(n)=O(g(n)) iff there exist positive constants c and no such that f(n)≤c*g(n) for all
n, n ≥ no.
6
Omega: the formal way to express the lower bound of an algorithm's running time. It measures
the best case time complexity or the best amount of time an algorithm can possibly take to
complete.
The function f(n)=Ω(g(n)) iff there exist positive constants c and no such that f(n) ≥ c*g(n) for
all n, n ≥ no.
Theta: Specifies the average complexity.
The function f(n)=ө(g(n)) iff there exist positive constants c1,c2 and no such that c1 g(n) ≤ f(n) ≤
c2 g(n) for all n, n ≥ no.
Big-O Notation (Ο) –specifically describes the worst-case scenario.
Omega Notation (Ω) –specifically describes the best-case scenario.
Theta Notation (θ) – This notation represents the average complexity of an algorithm.
Common Asymptotic Notations
Following is a list of some common asymptotic notations –
constant − Ο(1)
logarithmic − Ο(log n)
linear − Ο(n)
n log n − Ο(n log n)
quadratic − Ο(n2)
cubic − Ο(n3)
The most used notation in the analysis of a code is the Big O Notation which gives an upper
bound of the running time of the code (or the amount of memory used in terms of input
7
Recursion:
Recursion may have the following definitions:
-The nested repetition of identical algorithm is recursion.
-It is a technique of defining an object/process by itself.
-Recursion is a process by which a function calls itself repeatedly until some specified condition
has been satisfied.
Elementary Data Structures:
A data structure is defined as a particular way of storing and organizing data in our devices to use
the data efficiently and effectively. The main idea behind using data structures is to minimize the
time and space complexities. An efficient data structure takes minimum memory space and
requires minimum time to execute the data.
Stack:
A stack is an Abstract Data Type (ADT), that is popularly used in most programming languages.
It is named stack because it has the similar operations as the real-world stacks, for example – a
pack of cards or a pile of plates, etc.
The stack follows the LIFO (Last in - First out) structure where the last element inserted would
be the first element deleted.
Stack Representation:
A Stack ADT allows all data operations at one end only. At any given time, we can only access
the top element of a stack.
The following diagram depicts a stack and its operations −
8
Basic Operations
Stack operations may involve initializing the stack, using it and then de-initializing it.
Create()
Apart from these basic stuffs, a stack is used for the following two primary operations −
push() − Pushing (storing) an element on the stack.
pop() − Removing (accessing) an element from the stack.
To use a stack efficiently, we need to check the status of stack as well. For the same purpose, the
following functionality is added to stacks −
isFull() − check if stack is full.
isEmpty() − check if stack is empty.
Queues:
It is an abstract data structure, somewhat similar to Stacks. Unlike stacks, a queue is open at
both its ends. One end is always used to insert data (enqueue) and the other is used to remove
data (dequeue). Queue follows First-In-First-Out methodology, i.e., the data item stored first
will be accessed first.
Representation of Queues
9
Similar to the stack ADT, a queue ADT can also be implemented using arrays, linked lists, or
pointers. As a small example in this tutorial, we implement queues using a one-dimensional
array.
Basic Operations :
Queue operations may involve initializing or defining the queue, utilizing it, and then
completely erasing it from the memory.
Queue uses two pointers − front and rear. The front pointer accesses the data from the front end
(helping in enqueueing) while the rear pointer accesses data from the rear end (helping in
dequeuing).
Basic operations associated with queues −
Create()
enqueue() − add (store) an item to the queue.
dequeue() − remove (access) an item from the queue
isfull() − Checks if the queue is full.
isempty() − Checks if the queue is empty.
Graphs:
Graph is a non linear data structure, it contains a set of points known as nodes (or
vertices) and set of linkes known as edges (or Arcs) which connets the vertices.
Generally, a graph G is represented as G = ( V , E ), where V is set of vertices and E is set of
edges.
Example
The following is a graph with 5 vertices and 6 edges.
This graph G can be defined as G = ( V , E )
Where V = {A,B,C,D,E} and E = {(A,B),(A,C)(A,D),(B,D),(C,D),(B,E),(E,D)}.
10
Graph Representations
Adjacency Matrix
Adjacency List
Incidence Matrix
Adjacency Matrix
In this representation, graph can be represented using a matrix of size total number of vertices by
total number of vertices. That means if a graph with 4 vertices can be represented using a matrix
of 4X4 class. In this matrix, rows and columns both represents vertices. This matrix is filled with
either 1 or 0. Here, 1 represents there is an edge from row vertex to column vertex and 0
represents there is no edge from row vertex to column vertex.
For example, consider the following undirected graph representation...
Directed graph representation...
Adjacency List
In this representation, every vertex of graph contains list of its adjacent vertices.
11
For example, consider the directed graph representation implemented linked
list
Incidence Matrix
In this representation, the graph is represented using a matrix of size total number of vertices by
a total number of edges. That means graph with 4 vertices and 6 edges is represented using a
matrix of size 4X6.
In this matrix, rows represent vertices and columns represents edges. This matrix is filled with 0
or 1 or -1. Here, 0 represents that the row edge is not connected to column vertex, 1 represents
that the row edge is connected as the outgoing edge to column vertex and -1 represents that the
row edge is connected as the incoming edge to column vertex.
For example, consider the following directed graph representation...
Dictionary: A Dictionary is a collection of pairs of form (k, e), where k is a key and e is the
element associated with the key k (equivalently, e is the element whose key is k). No two pairs in
a dictionary should have the same key.
12
Examples:
1. A word dictionary is a collection of elements; each element comprises a word, which is the
key and the meaning of the word, pronunciation are associated with the word.
2. A telephone directory with a collection of telephone numbers and names.
3. The list of students enrolled for the data structures course, compiler and Operating System
course. The list contains (CourseName, RollID) pairs.
The above last two examples are dictionaries, which permit two or more (key,element) pairs
having the same key.
Operations on Dictionaries:
For example, get(k) returns the element with the key k.
with a specified key into the dictionary.
For example, put(k, e) puts the element e whose key is k into the dictionary.
For example, remove(k) removes the element with key k.
Disjoint Sets:
Two sets are called disjoint sets if they don’t have any element in common, the intersection of
sets is a null set.
A data structure that stores non overlapping or disjoint subset of elements is called disjoint set
data structure. The disjoint set data structure supports following operations:
1. Adding new sets to the disjoint set.
2. Merging disjoint sets to a single disjoint set using Union operation.
3. Finding representative of a disjoint set using Find operation.
4. Check if two sets are disjoint or not.
Disjoint set Union:
If Si and Sj are to disjoint sets, then their union Si U Sj consists of all the elements x such that x
is in Si or Sj.
Example: S1={1,7,8,9} S2={2,5,10}
Then S1 U S2= {1,2,5,7,8,9,10}
Find: Given the element I, find the set containing i.
Example: S1={1,7,8,9} Then, S2={2,5,10} s3={3,4,6}
Find(4)= S3 , Find(5)=S2 , Find(7)=S1
******************************************************************************
13