Unit 1 Ads New
Overview of Data Structures - Arrays, Stacks, Queues, Linked Lists, Linked Stacks and Linked
Queues, Applications
Linked List
Tree
Graph
Stack, Queue etc.
All these data structures allow us to perform different operations on data. We select a data
structure based on which type of operation is required. We will look into these data structures in
more detail in our later lessons.
The data structures can also be classified on the basis of the following characteristics:
Characteristic: Description
Linear: In linear data structures, the data items are arranged in a linear sequence. Example: Array
Homogeneous: In homogeneous data structures, all the elements are of the same type. Example: Array
Non-Homogeneous: In non-homogeneous data structures, the elements may or may not be of the same type. Example: Structures
Static: Static data structures are those whose sizes, structures and associated memory locations are fixed at compile time. Example: Array
Dynamic: Dynamic data structures are those which expand or shrink depending upon the program's needs during execution. Also, their associated memory locations can change. Example: Linked list created using pointers
What is an Algorithm?
An algorithm is a finite set of instructions or logic, written in order, to accomplish a certain
predefined task. An algorithm is not the complete code or program; it is just the core logic (solution)
of a problem, which can be expressed either as an informal high-level description,
as pseudocode, or using a flowchart.
Every algorithm is analyzed in terms of the following two measures:
1. Time Complexity
2. Space Complexity
Space Complexity
It is the amount of memory space required by the algorithm during the course of its execution.
Space complexity must be taken seriously for multi-user systems and in situations where limited
memory is available.
An algorithm generally requires space for the following components:
Instruction Space: It is the space required to store the executable version of the program. This
space is fixed for a given program, but varies from program to program depending upon the number of lines of code.
Data Space: It is the space required to store all the constant and variable values.
Environment Space: It is the space required to store the environment information needed to
resume a suspended function.
Time Complexity
Time complexity is a way to represent the amount of time needed by the program to run to
completion. We will study this in detail in later sections.
Arrays:
Whenever we want to work with a large number of data values, we need that many different
variables. As the number of variables increases, the complexity of the program also increases,
and so does the difficulty of working with a large number of similar data values. To make this
work easier, the C programming language provides a concept called an "Array".
An array is a variable which can store multiple values of the same data type at a time. An array
can be defined as a "collection of similar data items stored in contiguous memory locations
under a single name".
int a, b, c;
Here, the compiler allocates 2 bytes of memory named 'a' (assuming a 2-byte int), another 2 bytes
named 'b' and 2 more bytes named 'c'. These three memory locations may or may not be in
sequence. Each of these individual variables can store only one value at a time.
int a[3];
Here, the compiler allocates a total of 6 bytes of contiguous memory under the single name 'a',
but allows us to store three different integer values (each in 2 bytes of memory) at a time. The
memory is organized as follows...
That means all three memory locations are named 'a'. But "how can we refer to individual
elements?" is the big question. The answer is that the compiler not only allocates memory,
but also assigns a numerical value to each individual element of the array. This numerical value is
called the "index". Index values for the above example are as follows...
The individual elements of an array are identified using the combination of 'name' and 'index' as
follows...
arrayName[indexValue]
For the above example the individual elements can be referred to as follows...
If we want to assign a value to any of these memory locations (array elements), we can do so as
follows...
a[1] = 100;
Sparse Matrix
In computer programming, a matrix can be defined with a 2-dimensional array. An array with
'm' rows and 'n' columns represents an m X n matrix. There may be a situation in which a matrix
contains more zero values than non-zero values. Such a matrix is known as a sparse
matrix.
When a sparse matrix is represented with a 2-dimensional array, we waste a lot of space. For
example, consider a matrix of size 100 X 100 containing only 10 non-zero
elements. In this matrix, only 10 positions are filled with non-zero values and the remaining
positions are filled with zeros. Assuming 2-byte integers, we allocate 100 X 100 X 2 = 20000 bytes
of space to store this matrix, and to access these 10 non-zero elements we have to scan through
all 10000 positions.
A sparse matrix can be represented by using TWO representations, those are as follows...
1. Triplet Representation
2. Linked Representation
Triplet Representation
In this representation, we consider only the non-zero values along with their row and column
index values. The 0th row stores the total number of rows, the total number of columns and the
total number of non-zero values in the matrix.
For example, consider a matrix of size 5 X 6 containing 6 non-zero values. This matrix
can be represented as shown in the image...
In the above example matrix, there are only 6 non-zero elements (9, 8, 4, 2, 5 & 2) and the
matrix size is 5 X 6. We represent this matrix as shown in the above image. Here the first row of
the table on the right side is filled with the values 5, 6 & 6, which indicates that it is a sparse
matrix with 5 rows, 6 columns & 6 non-zero values. The second row is filled with 0, 4 & 9,
which indicates that the value in the matrix at row 0, column 4 is 9. The remaining non-zero
values follow the same pattern.
Linked Representation
In linked representation, we use a linked list data structure to represent a sparse matrix. In this
linked list, we use two different nodes, namely a header node and an element node. A header node
consists of three fields and an element node consists of five fields, as shown in the image...
Consider the above same sparse matrix used in the Triplet representation. This sparse matrix can
be represented using linked representation as shown in the below image...
In the above representation, H0, H1, ..., H5 indicate the header nodes, which are used to represent
indexes. The remaining nodes are used to represent the non-zero elements of the matrix, except the
very first node, which is used to represent abstract information about the sparse matrix (i.e., it is
a matrix of 5 X 6 with 6 non-zero elements).
In this representation, in each row and column, the right field of the last node points to its
respective header node.
Stack
Stack is a linear data structure in which the insertion and deletion operations are performed at
only one end. In a stack, adding and removing of elements are performed at a single position,
which is known as "top". That means, a new element is added at the top of the stack and an
element is removed from the top of the stack. In a stack, the insertion and deletion operations are
performed based on the LIFO (Last In First Out) principle.
In a stack, the insertion operation is performed using a function called "push" and deletion
operation is performed using a function called "pop".
In the figure, PUSH and POP operations are performed at top position in the stack. That means,
both the insertion and deletion operations are performed at one end (i.e., at Top)
Stack is a linear data structure in which the operations are performed based on LIFO
principle.
"A Collection of similar data items in which both insertion and deletion operations are
performed based on LIFO principle".
Example
If we create a stack by inserting 10, 45, 12, 16, 35 and 50, then 10 becomes the bottom-most
element and 50 the top-most element. Top is at 50 as shown in the image below...
The following operations are performed on the stack...
Stack data structure can be implemented in two ways. They are as follows...
1. Using Array
2. Using Linked List
When a stack is implemented using an array, it can organize only a limited number of elements.
When a stack is implemented using a linked list, it can organize an unlimited number of
elements.
A stack data structure can be implemented using a one-dimensional array. But a stack implemented
using an array can store only a fixed number of data values. This implementation is very simple:
just define a one-dimensional array of a specific size and insert or delete values in that array
using the LIFO principle with the help of a variable 'top'. Initially, top is set to -1. Whenever we
want to insert a value into the stack, we increment the top value by one and then insert. Whenever
we want to delete a value from the stack, we delete the top value and then decrement top by
one.
Before implementing actual operations, first follow the below steps to create an empty stack.
Step 1: Include all the header files which are used in the program and define a
constant 'SIZE' with specific value.
Step 2: Declare all the functions used in stack implementation.
Step 3: Create a one dimensional array with fixed size (int stack[SIZE])
Step 4: Define an integer variable 'top' and initialize it with '-1'. (int top = -1)
Step 5: In the main method, display a menu with the list of operations and make suitable
function calls to perform the operation selected by the user on the stack.
In a stack, push() is a function used to insert an element into the stack. In a stack, the new element
is always inserted at the top position. The push function takes one integer value as a parameter and
inserts that value into the stack. We can use the following steps to push an element on to the stack...
In a stack, pop() is a function used to delete an element from the stack. In a stack, the element is
always deleted from the top position. The pop function does not take any value as a parameter. We
can use the following steps to pop an element from the stack...
Queue ADT
Queue is a linear data structure in which the insertion and deletion operations are performed at
two different ends. In a queue data structure, adding and removing of elements are performed at
two different positions: the insertion is performed at one end and the deletion is performed at the
other end. In a queue data structure, the insertion operation is performed at a position which is
known as 'rear' and the deletion operation is performed at a position which is known as 'front'.
In a queue data structure, the insertion and deletion operations are performed based on the FIFO
(First In First Out) principle.
In a queue data structure, the insertion operation is performed using a function called
"enQueue()" and deletion operation is performed using a function called "deQueue()".
Queue data structure is a linear data structure in which the operations are performed
based on FIFO principle.
"Queue data structure is a collection of similar data items in which insertion and deletion
operations are performed based on FIFO principle".
Example
Queue data structure can be implemented in two ways. They are as follows...
1. Using Array
2. Using Linked List
When a queue is implemented using an array, it can organize only a limited number of
elements. When a queue is implemented using a linked list, it can organize an unlimited
number of elements.
When we want to work with an unknown number of data values, we use a linked list data structure
to organize that data. Linked list is a linear data structure that contains a sequence of elements
such that each element links to its next element in the sequence. Each element in a linked list is
called a "Node".
Simply a list is a sequence of data, and linked list is a sequence of data linked with each other.
Single linked list is a sequence of elements in which every element has link to its next
element in the sequence.
In any single linked list, each individual element is called a "Node". Every node contains two
fields, data and next. The data field is used to store the actual value of the node and the next field
is used to store the address of the next node in the sequence.
NOTE:
☀ In a single linked list, the address of the first node is always stored in a reference node known
as "front" (sometimes it is also known as "head").
☀ The next part (reference part) of the last node must always be NULL.
Example
In a single linked list we perform the following operations...
1. Insertion
2. Deletion
3. Display
Before we implement the actual operations, first we need to set up an empty list. Perform the
following steps before implementing the actual operations.
Step 1: Include all the header files which are used in the program.
Step 2: Declare all the user defined functions.
Step 3: Define a Node structure with two members data and next.
Step 4: Define a Node pointer 'head' and set it to NULL.
Step 5: Implement the main method by displaying the operations menu and make suitable
function calls in the main method to perform the user selected operation.
Insertion
In a single linked list, the insertion operation can be performed in three ways. They are as
follows...
We can use the following steps to insert a new node at beginning of the single linked list...
We can use the following steps to insert a new node at end of the single linked list...
Step 1: Create a newNode with given value and newNode → next as NULL.
Step 2: Check whether list is Empty (head == NULL).
Step 3: If it is Empty then, set head = newNode.
Step 4: If it is Not Empty then, define a node pointer temp and initialize with head.
Step 5: Keep moving temp to its next node until it reaches the last node in the list
(until temp → next is equal to NULL).
Step 6: Set temp → next = newNode.
We can use the following steps to insert a new node after a node in the single linked list...
Step 7: Finally, Set 'newNode → next = temp → next' and 'temp → next = newNode'
Deletion
In a single linked list, the deletion operation can be performed in three ways. They are as
follows...
We can use the following steps to delete a node from beginning of the single linked list...
We can use the following steps to delete a node from end of the single linked list...
We can use the following steps to delete a specific node from the single linked list...
We can use the following steps to display the elements of a single linked list...
In a single linked list, every node points to its next node in the sequence and the last node points
to NULL. But in a circular linked list, every node points to its next node in the sequence and the
last node points to the first node in the list.
Circular linked list is a sequence of elements in which every element has link to its next
element in the sequence and the last element has a link to the first element in the sequence.
That means circular linked list is similar to the single linked list except that the last node points to
the first node in the list
Example
In a circular linked list, we perform the following operations...
1. Insertion
2. Deletion
3. Display
Before we implement the actual operations, first we need to set up an empty list. Perform the
following steps before implementing the actual operations.
Step 1: Include all the header files which are used in the program.
Step 2: Declare all the user defined functions.
Step 3: Define a Node structure with two members data and next.
Step 4: Define a Node pointer 'head' and set it to NULL.
Step 5: Implement the main method by displaying the operations menu and make suitable
function calls in the main method to perform the user selected operation.
Insertion
In a circular linked list, the insertion operation can be performed in three ways. They are as
follows...
We can use the following steps to insert a new node at beginning of the circular linked list...
We can use the following steps to insert a new node at end of the circular linked list...
We can use the following steps to insert a new node after a node in the circular linked list...
In a circular linked list, the deletion operation can be performed in three ways. They are as
follows...
We can use the following steps to delete a node from beginning of the circular linked list...
Step 1: Check whether list is Empty (head == NULL)
Step 2: If it is Empty then, display 'List is Empty!!! Deletion is not possible' and
terminate the function.
Step 3: If it is Not Empty then, define two Node pointers 'temp1' and 'temp2' and
initialize both 'temp1' and 'temp2' with head.
Step 4: Check whether list is having only one node (temp1 → next == head)
Step 5: If it is TRUE then set head = NULL and delete temp1 (Setting Empty list
conditions)
Step 6: If it is FALSE move the temp1 until it reaches to the last node. (until temp1 →
next == head )
Step 7: Then set head = temp2 → next, temp1 → next = head and delete temp2.
We can use the following steps to delete a node from end of the circular linked list...
We can use the following steps to delete a specific node from the circular linked list...
We can use the following steps to display the elements of a circular linked list...
We can overcome this kind of problem by using a double linked list. Double linked list can be defined as follows...
Double linked list is a sequence of elements in which every element has links to its previous
element and next element in the sequence.
In double linked list, every node has link to its previous node and next node. So, we can traverse
forward by using next field and can traverse backward by using previous field. Every node in a
double linked list contains three fields and they are shown in the following figure...
Here, 'link1' field is used to store the address of the previous node in the sequence, 'link2' field is
used to store the address of the next node in the sequence and 'data' field is used to store the
actual value of that node.
Example
☀ In a double linked list, the first node must always be pointed to by head.
☀ Always the previous field of the first node must be NULL.
☀ Always the next field of the last node must be NULL.
Operations
1. Insertion
2. Deletion
3. Display
Insertion
In a double linked list, the insertion operation can be performed in three ways as follows...
We can use the following steps to insert a new node at beginning of the double linked list...
Step 1: Create a newNode with given value and newNode → previous as NULL.
Step 2: Check whether list is Empty (head == NULL)
Step 3: If it is Empty then, assign NULL to newNode → next and newNode to head.
Step 4: If it is not Empty then, assign head to newNode → next and newNode to head.
We can use the following steps to insert a new node at end of the double linked list...
Step 1: Create a newNode with given value and newNode → next as NULL.
Step 2: Check whether list is Empty (head == NULL)
Step 3: If it is Empty, then assign NULL to newNode →
previous and newNode to head.
Step 4: If it is not Empty, then, define a node pointer temp and initialize with head.
Step 5: Keep moving the temp to its next node until it reaches to the last node in the list
(until temp → next is equal to NULL).
Step 6: Assign newNode to temp → next and temp to newNode → previous.
Inserting At Specific location in the list (After a Node)
We can use the following steps to insert a new node after a node in the double linked list...
Step 6: Every time, check whether temp1 has reached the last node. If it has, then display
'Given node is not found in the list!!! Insertion not possible!!!' and terminate the
function. Otherwise, move temp1 to the next node.
Step 7: Assign temp1 → next to temp2, newNode to temp1 → next, temp1 to newNode
→ previous, temp2 to newNode → next and newNode to temp2 → previous.
Deletion
In a double linked list, the deletion operation can be performed in three ways as follows...
We can use the following steps to delete a node from beginning of the double linked list...
We can use the following steps to delete a node from end of the double linked list...
Step 6: If it is FALSE, then keep moving temp until it reaches to the last node in the list.
(until temp → next is equal to NULL)
Step 7: Assign NULL to temp → previous → next and delete temp.
We can use the following steps to delete a specific node from the double linked list...
We can use the following steps to display the elements of a double linked list...
The major problem with a stack implemented using an array is that it works only for a fixed
number of data values. That means the amount of data must be specified at the beginning of the
implementation itself. A stack implemented using an array is not suitable when we don't know the
size of the data we are going to use. A stack data structure can instead be implemented using a
linked list. A stack implemented using a linked list can work for an unlimited number of values.
That means it works for a variable size of data, so there is no need to fix the size at the
beginning of the implementation. A stack implemented using a linked list can organize as many
data values as we want.
In the linked list implementation of a stack, every new element is inserted as the 'top' element.
That means every newly inserted element is pointed to by 'top'. Whenever we want to remove an
element from the stack, we simply remove the node pointed to by 'top' and move 'top' to its next
node in the list. The next field of the first inserted element must always be NULL.
Example
In the above example, the last inserted node is 99 and the first inserted node is 25. The order of
elements inserted is 25, 32, 50 and 99.
Operations
To implement stack using linked list, we need to set the following things before implementing
actual operations.
Step 1: Include all the header files which are used in the program. And declare all
the user defined functions.
Step 2: Define a 'Node' structure with two members data and next.
Step 3: Define a Node pointer 'top' and set it to NULL.
Step 4: Implement the main method by displaying a menu with the list of operations and make
suitable function calls in the main method.
We can use the following steps to insert a new node into the stack...
We can use the following steps to delete a node from the stack...
We can use the following steps to display the elements (nodes) of a stack...
The major problem with a queue implemented using an array is that it works only for a fixed
number of data values. That means the amount of data must be specified at the beginning itself. A
queue implemented using an array is not suitable when we don't know the size of the data we are
going to use. A queue data structure can instead be implemented using a linked list. A queue
implemented using a linked list can work for an unlimited number of values. That means it works
for a variable size of data (no need to fix the size at the beginning of the implementation). A
queue implemented using a linked list can organize as many data values as we want.
In linked list implementation of a queue, the last inserted node is always pointed by 'rear' and the
first node is always pointed by 'front'.
Example
In above example, the last inserted node is 50 and it is pointed by 'rear' and the first inserted node
is 10 and it is pointed by 'front'. The order of elements inserted is 10, 15, 22 and 50.
Operations
To implement queue using linked list, we need to set the following things before implementing
actual operations.
Step 1: Include all the header files which are used in the program. And declare all
the user defined functions.
Step 2: Define a 'Node' structure with two members data and next.
Step 3: Define two Node pointers 'front' and 'rear' and set both to NULL.
Step 4: Implement the main method by displaying a menu of the list of operations and make
suitable function calls in the main method to perform the user selected operation.
We can use the following steps to insert a new node into the queue...
Step 1: Create a newNode with given value and set 'newNode → next' to NULL.
Step 2: Check whether queue is Empty (rear == NULL)
Step 3: If it is Empty then, set front = newNode and rear = newNode.
Step 4: If it is Not Empty then, set rear → next = newNode and rear = newNode.
We can use the following steps to delete a node from the queue...
We can use the following steps to display the elements (nodes) of a queue...
Step 1: Check whether queue is Empty (front == NULL).
Step 2: If it is Empty then, display 'Queue is Empty!!!' and terminate the function.
Step 3: If it is Not Empty then, define a Node pointer 'temp' and initialize with front.
Step 4: Display 'temp → data --->' and move temp to the next node. Repeat the same until
'temp' reaches 'rear' (temp → next != NULL).
Step 5: Finally, display 'temp → data ---> NULL'.
APPLICATIONS
a) To store a set of programs which are to be given access to a hard disk according to
their priority.
b) For representing a city region telephone network.
c) To store a set of fixed key words which are referenced very frequently.
d) To represent an image in the form of a bitmap.
e) To implement back functionality in the internet browser.
f) To store dynamically growing data which is accessed very frequently, based upon a
key value.
g) To implement printer spooler so that jobs can be printed in the order of their
arrival.
h) To record the sequence of all the pages browsed in one session.
i) To implement the undo function in a text editor.
j) To store information about the directories and files in a system.
Algorithm Analysis
Efficiency of an algorithm can be analyzed at two different stages, before implementation and
after implementation. They are the following −
A Priori Analysis − This is a theoretical analysis of an algorithm. Efficiency of an
algorithm is measured by assuming that all other factors, for example, processor speed,
are constant and have no effect on the implementation.
A Posteriori Analysis − This is an empirical analysis of an algorithm. The selected
algorithm is implemented using a programming language and then executed on a target
computer. In this analysis, actual statistics like running time and space required
are collected.
We shall learn about a priori algorithm analysis. Algorithm analysis deals with the execution or
running time of various operations involved. The running time of an operation can be defined as
the number of computer instructions executed per operation.
Algorithm Complexity
Suppose X is an algorithm and n is the size of input data, the time and space used by the
algorithm X are the two main factors, which decide the efficiency of X.
Time Factor − Time is measured by counting the number of key operations such as
comparisons in the sorting algorithm.
Space Factor − Space is measured by counting the maximum memory space required by
the algorithm.
The complexity of an algorithm f(n) gives the running time and/or the storage space required by
the algorithm in terms of n as the size of input data.
Space Complexity
Space complexity of an algorithm represents the amount of memory space required by the
algorithm in its life cycle. The space required by an algorithm is equal to the sum of the
following two components −
A fixed part that is a space required to store certain data and variables, that are
independent of the size of the problem. For example, simple variables and constants used,
program size, etc.
A variable part is a space required by variables, whose size depends on the size of the
problem. For example, dynamic memory allocation, recursion stack space, etc.
Space complexity S(P) of any algorithm P is S(P) = C + SP(I), where C is the fixed part and SP(I)
is the variable part of the algorithm, which depends on instance characteristic I. Following is a
simple example that tries to explain the concept −
Algorithm: SUM(A, B)
Step 1 - START
Step 2 - C ← A + B + 10
Step 3 - Stop
Here we have three variables A, B, and C and one constant (the value 10). Hence S(P) = 1 + 3. Now,
the space depends on the data types of the given variables and constants, and the total will be
multiplied accordingly.
Time Complexity
Time complexity of an algorithm represents the amount of time required by the algorithm to run
to completion. Time requirements can be defined as a numerical function T(n), where T(n) can
be measured as the number of steps, provided each step consumes constant time.
For example, addition of two n-bit integers takes n steps. Consequently, the total computational
time is T(n) = c ∗ n, where c is the time taken for the addition of two bits. Here, we observe that
T(n) grows linearly as the input size increases.
Asymptotic analysis is input bound i.e., if there's no input to the algorithm, it is concluded to
work in a constant time. Other than the "input" all other factors are considered constant.
Asymptotic analysis refers to computing the running time of any operation in mathematical units
of computation. For example, the running time of one operation is computed as f(n) and for
another operation it is computed as g(n^2). This means the running time of the first operation will
increase linearly with the increase in n and the running time of the second operation will increase
quadratically when n increases. Similarly, the running times of both operations will be nearly the
same if n is significantly small.
Following are the commonly used asymptotic notations to calculate the running time complexity
of an algorithm.
Ο Notation
Ω Notation
θ Notation
Big Oh Notation, Ο
The notation Ο(n) is the formal way to express the upper bound of an algorithm's running time. It
measures the worst case time complexity or the longest amount of time an algorithm can possibly
take to complete.
Ο(f(n)) = { g(n) : there exists c > 0 and n0 such that g(n) ≤ c.f(n) for all n > n0. }
Omega Notation, Ω
The notation Ω(n) is the formal way to express the lower bound of an algorithm's running time. It
measures the best case time complexity or the best amount of time an algorithm can possibly take
to complete.
For example, for a function f(n)
Ω(f(n)) = { g(n) : there exists c > 0 and n0 such that g(n) ≥ c.f(n) for all n > n0. }
Theta Notation, θ
The notation θ(n) is the formal way to express both the lower bound and the upper bound of an
algorithm's running time. It is represented as follows −
θ(f(n)) = { g(n) if and only if g(n) = Ο(f(n)) and g(n) = Ω(f(n)) for all n > n0. }
Following are some common asymptotic running times:
logarithmic − Ο(log n)
linear − Ο(n)
quadratic − Ο(n^2)
cubic − Ο(n^3)
polynomial − n^Ο(1)
exponential − 2^Ο(n)
Time complexity :
Big O notation
f(n) = O(g(n)) means
There are positive constants c and k such that:
0<= f(n) <= c*g(n) for all n >= k.
For large problem sizes, the dominant term (the one with the highest exponent) almost completely
determines the value of the complexity expression. So abstract complexity is expressed in terms
of the dominant term for large N. Multiplicative constants are also ignored.
N^2 + 3N + 4 is O(N^2)
since for N>4, N^2 + 3N + 4 < 2N^2 (c=2 & k=4)
Examples:
1. Traversing an array.
2. Sequential/Linear search in an array.
3. Best case time complexity of Bubble sort (i.e when the elements of array are in sorted order).
The basic structure is:
for (i = 0; i < N; i++) {
sequence of statements of O(1)
}
The loop executes N times, so the total time is N*O(1) which is O(N).
Nested loops:
1.
for (i = 0; i < N; i++)
{ for (j = 0; j < M; j++)
{
sequence of statements of O(1)
}
}
The outer loop executes N times and inner loop executes M times so the time complexity is
O(N*M)
2.
for (i = 0; i < N; i++)
{ for (j = 0; j < N; j++)
{
sequence of statements of O(1)
}
}
Now the time complexity is O(N^2)
3. Let's consider nested loops where the number of iterations of the inner loop depends on the
value of the outer loop's index.
Examples:
1. MergeSort, QuickSort etc.
Using the "<" sign informally, we can say that the order of growth is
O(1) < O(log n) < O(n) < O(n log n) < O(n^2) < O(n^3) < O(a^n) where a > 1
Example:
for (i = 0; i < n; i++) {
    perform any statement S1;    /* O(1) */
    g(n);                        /* method call of complexity O(N) */
}
has complexity O(N^2). The loop executes N times and each method call g(N) is of complexity O(N).
Other Examples
A recursive solution:
public long Fib1(long n) {
    if ((n == 1) || (n == 2)) return 1;
    return Fib1(n - 1) + Fib1(n - 2);
}
A better solution:
Solve F1, F2, ..., Fn. Solve them in order and save their values!
Function Fib2(n){
Create an array fib[1....n]
fib[1] = 1
fib[2] = 1
for i = 3 to n:
fib[i] = fib[i-1] + fib[i-2]
return fib[n]
}
The time complexity of this algorithm is O(n): the number of steps required is proportional to n.
F(200) is now reasonable, and so are F(2000) and F(20000).
Tower of Hanoi
The Tower of Hanoi is a mathematical puzzle. It consists of three rods, and a number of disks of
different sizes which can slide onto any rod. The puzzle starts with the disks neatly stacked in
order of size on one rod, the smallest at the top, thus making a conical shape.
The objective of the puzzle is to move the entire stack to another rod, obeying the following rules:
• Only one disk may be moved at a time.
• Each move consists of taking the upper disk from one of the rods and sliding it onto another
rod, on top of the other disks that may already be present on that rod.
• No disk may be placed on top of a smaller disk.
We want to write a recursive method, THanoi(n,A,B,C) which moves n disks from peg A to peg C
using peg B for intermediate transfers.
Stopping condition: n = 1
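A sketch of THanoi in Java, under the parameter convention above (n disks from peg A to peg C via peg B); the printing and the move counter are illustrative additions:

```java
public class Hanoi {
    // Moves n disks from peg 'from' to peg 'to', using peg 'via' for
    // intermediate transfers; returns the number of single-disk moves made.
    static long tHanoi(int n, char from, char via, char to) {
        if (n == 1) {                              // stopping condition: n = 1
            System.out.println("move disk 1: " + from + " -> " + to);
            return 1;
        }
        long moves = tHanoi(n - 1, from, to, via); // park n-1 disks on 'via'
        System.out.println("move disk " + n + ": " + from + " -> " + to);
        moves++;                                   // move the largest disk
        moves += tHanoi(n - 1, via, from, to);     // bring n-1 disks onto 'to'
        return moves;
    }

    public static void main(String[] args) {
        System.out.println(tHanoi(3, 'A', 'B', 'C') + " moves"); // 2^3 - 1 = 7
    }
}
```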
The time complexity of above algorithm can be determined using following recurrence relation.
Let T(n) be the number of steps required to solve the puzzle for n disks. It is evident from
the observation above that the solution for n disks is equivalent to solving the puzzle twice for
n-1 disks, plus a single step transferring the largest disk from the starting peg to the final peg,
which takes constant time.
Thus,
T(n) = T(n-1) + T(n-1) + O(1)= 2*T(n-1) + O(1)
The solution to this recurrence relation is T(n) = 2^n − 1, so T(n) is of exponential order. The
proof is out of scope of this course.
This is an example where recursion is much easier to formulate than a loop based solution.
Polynomial vs. Exponential Running Time:
The time complexity (generally referred to as running time) of an algorithm is the amount of
time it takes as a function of the size of the input to the problem. Big O notation is commonly
used to express time complexity, since it suppresses lower-order terms and describes growth
asymptotically. Time complexity is estimated by counting the operations (provided as
instructions in a program) performed by an algorithm, where each operation takes a fixed amount of
time to execute. Time complexities are generally classified as constant, linear, logarithmic, polynomial,
exponential, etc. Among these, polynomial and exponential are the classes most prominently
used to characterize the complexity of an algorithm. Both are always expressed as functions of the
size of the input.
An algorithm is said to be solvable in polynomial time if the number of steps required to complete the
algorithm for a given input is O(n^k) for some non-negative integer k, where n is the size of the
input. Polynomial-time algorithms are said to be "fast." Most familiar mathematical operations such as
addition, subtraction, multiplication, and division, as well as computing square roots, powers, and
logarithms, can be performed in polynomial time. Computing the digits of most interesting mathematical
constants, including pi and e, can also be done in polynomial time.
All basic arithmetic operations (i.e., addition, subtraction, multiplication, division), comparison
operations, and sorting are considered polynomial-time algorithms.
There is also a set of problems that can be solved by exponential-time algorithms, but for which no
polynomial-time algorithm is known. An algorithm is said to be exponential time if T(n) is upper
bounded by 2^poly(n), where poly(n) is some polynomial in n. More formally, an algorithm is exponential
time if T(n) is bounded by O(2^(n^k)) for some constant k.
Algorithms which have exponential time complexity grow much faster than polynomial algorithms.
The difference lies in where the variable appears in the expression for the running time. Expressions with
polynomial time complexity have the variable in the bases of their terms. Example: n^3 + 2n^2 + 1; notice
n is in the base, NOT the exponent. In exponential expressions, the variable is in the exponent.
Example: 2^n. As said before, exponential time grows much faster. If n is
equal to 1000 (a reasonable input for an algorithm), then notice 1000^3 is 1 billion, while 2^1000 is simply
huge! For reference, there are about 2^80 hydrogen atoms in the sun, which is much more than 1 billion.
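The gap is easy to see numerically. A short sketch using java.math.BigInteger (the digit-count comparison is an illustrative choice, since 2^1000 overflows every primitive type):

```java
import java.math.BigInteger;

public class GrowthGap {
    public static void main(String[] args) {
        BigInteger poly = BigInteger.valueOf(1000).pow(3); // n^3 at n = 1000
        BigInteger expo = BigInteger.valueOf(2).pow(1000); // 2^n at n = 1000
        // 1000^3 = 1,000,000,000 has 10 digits; 2^1000 has 302 digits.
        System.out.println("1000^3 has " + poly.toString().length() + " digits");
        System.out.println("2^1000 has " + expo.toString().length() + " digits");
    }
}
```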
Average, Best, and Worst Case Complexities:
Worst Case:
In worst case analysis, we calculate an upper bound on the running time of an algorithm. We must know
the case that causes the maximum number of operations to be executed. For linear search, the worst case
happens when the element to be searched for (x in the above code) is not present in the array: the
search() function then compares it with all the elements of arr[] one by one. Therefore, the worst
case time complexity of linear search is Θ(n).
Average Case:
In average case analysis, we take all possible inputs, calculate the computing time for each, sum the
values, and divide by the total number of inputs. We must know (or predict) the
distribution of cases. For the linear search problem, assume all n+1 cases are uniformly
distributed: x at each of the n positions, plus the case of x not being present in the array. Summing
all the cases and dividing by (n+1) gives the average case time complexity:
(θ(1) + θ(2) + ... + θ(n) + θ(n+1)) / (n+1)
= θ((n+2)/2) = θ(n)
Best Case:
In the best case analysis, we calculate lower bound on running time of an algorithm. We must know the
case that causes minimum number of operations to be executed. In the linear search problem, the best
case occurs when x is present at the first location. The number of operations in the best case is constant
(not dependent on n). So time complexity in the best case would be Θ(1).
Most of the time, we do worst case analysis to analyze algorithms. In worst case analysis, we guarantee
an upper bound on the running time of an algorithm, which is useful information.
Average case analysis is not easy to do in most practical cases and is rarely done: in
average case analysis, we must know (or predict) the mathematical distribution of all possible inputs.
The best case analysis is bogus. Guaranteeing a lower bound on an algorithm doesn't provide any
useful information, since in the worst case the same algorithm may take years to run.
For some algorithms, all the cases are asymptotically same, i.e., there are no worst and best cases.
For example, Merge Sort. Merge Sort does Θ(nLogn) operations in all cases. Most of the other
sorting algorithms have distinct worst and best cases. For example, in the typical implementation of Quick Sort
(where the pivot is chosen as a corner element), the worst case occurs when the input array is already sorted
and the best case occurs when the pivot always divides the array into two halves. For insertion sort, the worst
case occurs when the array is reverse sorted and the best case occurs when the array is already sorted in the
same order as the output.
Example: Factorial
n! = 1•2•3...n and 0! = 1 (the base case)
So the recursive definition is n! = n•(n-1)!
Algorithm F(n)
if n = 0 then return 1 // base case
else return F(n-1)•n // recursive call
The number of multiplications satisfies M(n) = M(n-1) + 1 with M(0) = 0, so M(n) ∈ Θ(n): one
multiplication per call. An exponential count, M(n) ∈ Θ(2^n), appears instead when two recursive
calls are made per invocation, as in the recursive Fibonacci above. Terrible. Suppose three
recursive calls are made; what is the order of growth?
Lesson learned: be careful with recursive algorithms, they can grow exponentially.
Especially if the problem size is measured by the depth of the recursion tree and the operation
count is the total number of nodes.
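The factorial algorithm above in Java, with a counter (an illustrative addition) confirming that the number of multiplications M(n) is linear, since only one recursive call is made per invocation:

```java
public class Factorial {
    static long mults = 0;                 // counts multiplications M(n)

    static long f(long n) {
        if (n == 0) return 1;              // base case, no multiplication
        mults++;
        return f(n - 1) * n;               // one recursive call, one multiplication
    }

    public static void main(String[] args) {
        System.out.println(f(10));         // prints 3628800
        System.out.println(mults);         // prints 10, i.e. M(n) = n
    }
}
```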
Example: Binary Representation
Algorithm BinRec(n)
if n = 1 then return 1
else return BinRec(floor(n/2)) + 1
1. The problem size is n
2. The basic operation is the addition in the recursive call
3. There is no difference between worst and best case
4. Recurrence relation, including the initial condition:
A(n) = A(floor(n/2)) + 1
IC: A(1) = 0
5. Solve the recurrence relation.
The division and floor function in the argument of the recursive call make the analysis difficult.
With the variable substitution n = 2^k, the recurrence becomes A(2^k) = A(2^(k-1)) + 1 with
A(2^0) = 0, whose solution is A(2^k) = k, i.e. A(n) = log2 n ∈ Θ(log n).
The substitution skips a lot of values for n, but the smoothness rule (see appendix B) says that is ok.
Smoothness rule
Let T(n) be eventually non-decreasing and let f(n) be smooth (eventually non-decreasing with f(2n) ∈ Θ(f(n))).
If T(n) ∈ Θ(f(n)) for n a power of b, then T(n) ∈ Θ(f(n)) for all n.
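BinRec in Java: BinRec(n) returns the number of bits in the binary representation of n, and the addition count A(n) = floor(log2 n) matches the solution of the recurrence. A minimal sketch:

```java
public class BinRec {
    // Number of binary digits of n (n >= 1): B(1) = 1, B(n) = B(floor(n/2)) + 1.
    static int binRec(int n) {
        if (n == 1) return 1;
        return binRec(n / 2) + 1;   // Java integer division is floor(n/2) for n > 0
    }

    public static void main(String[] args) {
        System.out.println(binRec(1));   // prints 1
        System.out.println(binRec(8));   // prints 4  (1000 in binary)
        System.out.println(binRec(15));  // prints 4  (1111 in binary)
    }
}
```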
In general, the solution to an inhomogeneous recurrence is equal to the sum of the solution to the
homogeneous problem and a particular solution to the inhomogeneous part. The undetermined coefficients
of the homogeneous solution are used to satisfy the IC: we guess a particular solution I(n) and then
determine the new IC for the homogeneous problem for B(n).
There is also the Master Theorem, which gives the asymptotic limit for many common recurrences.
Iterative algorithm
Algorithm Fib(n)
F[0] ← 0; F[1] ← 1
for i ← 2 to n do
F[i] ← F[i-1] + F[i-2]
return F[n]
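The same algorithm in Java, using the convention F(0) = 0, F(1) = 1 (the table makes the O(n) time and O(n) space explicit):

```java
public class Fib {
    // Iterative Fibonacci: O(n) time, O(n) space for the table.
    static long fib(int n) {
        if (n == 0) return 0;
        long[] f = new long[n + 1];
        f[0] = 0;
        f[1] = 1;
        for (int i = 2; i <= n; i++) {
            f[i] = f[i - 1] + f[i - 2];   // each entry from the two before it
        }
        return f[n];
    }

    public static void main(String[] args) {
        System.out.println(fib(10));   // prints 55
        System.out.println(fib(50));   // prints 12586269025
    }
}
```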