DAA (Unit 3)
Traversing a graph means examining all of its nodes and edges. There are two
standard methods for traversing a graph. Let's discuss each of them in detail.
The algorithm for breadth first search is given below. It starts by examining the
node A and all of its neighbours. In the next step, the neighbours of those
neighbours are explored, and the process continues level by level. The algorithm
explores all the neighbours of every node and ensures that each node is visited
exactly once and no node is visited twice.
Algorithm
o Step 1: SET STATUS = 1 (ready state) for each node in G
o Step 2: Enqueue the starting node A and set its STATUS = 2 (waiting state)
o Step 3: Repeat Steps 4 and 5 until QUEUE is empty
o Step 4: Dequeue a node N. Process it and set its STATUS = 3 (processed state)
o Step 5: Enqueue all the neighbours of N that are in the ready state (whose STATUS = 1) and set their STATUS = 2 (waiting state)
[END OF LOOP]
o Step 6: EXIT
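The steps above translate directly into Python. The adjacency lists below are a hypothetical reconstruction consistent with the QUEUE contents of the worked example that follows, not necessarily the graph in the original figure:

```python
from collections import deque

def bfs(graph, start):
    """Breadth first search: process each reachable node exactly once,
    in order of increasing distance from `start`."""
    status = {v: 1 for v in graph}   # 1 = ready
    queue = deque([start])
    status[start] = 2                # 2 = waiting
    order = []
    while queue:
        n = queue.popleft()
        status[n] = 3                # 3 = processed
        order.append(n)
        for nb in graph[n]:
            if status[nb] == 1:      # enqueue only ready neighbours
                status[nb] = 2
                queue.append(nb)
    return order

# Hypothetical undirected graph, given as adjacency lists
graph = {
    'A': ['B', 'D'],
    'B': ['A', 'C', 'F'],
    'C': ['B', 'E', 'G'],
    'D': ['A', 'F'],
    'E': ['C'],
    'F': ['B', 'D'],
    'G': ['C'],
}
print(bfs(graph, 'A'))   # ['A', 'B', 'D', 'C', 'F', 'E', 'G']
```

The `status` dictionary plays the role of STATUS in the pseudocode; checking it before enqueueing is what guarantees each node enters the queue at most once.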
Example
Consider the graph G shown in the following image and calculate the minimum path P from
node A to node E, given that each edge has a length of 1.
Solution:
The minimum path P can be found by applying the breadth first search algorithm, beginning
at node A and ending at E. The algorithm uses two queues,
namely QUEUE1 and QUEUE2. QUEUE1 holds all the nodes that are to be processed,
while QUEUE2 holds all the nodes that have been processed and deleted from QUEUE1.
Let's start examining the graph from node A.
1. Add node A to QUEUE1.
1. QUEUE1 = {A}
2. QUEUE2 = {}
2. Delete node A from QUEUE1 and insert all its neighbours. Insert node A into
QUEUE2.
1. QUEUE1 = {B, D}
2. QUEUE2 = {A}
3. Delete the node B from QUEUE1 and insert all its neighbours. Insert node B into
QUEUE2.
1. QUEUE1 = {D, C, F}
2. QUEUE2 = {A, B}
4. Delete the node D from QUEUE1 and insert all its neighbours. Since F, its only
unvisited neighbour, has already been inserted into QUEUE1, we will not insert it again.
Insert node D into QUEUE2.
1. QUEUE1 = {C, F}
2. QUEUE2 = { A, B, D}
5. Delete the node C from QUEUE1 and insert all its neighbours. Add node C to
QUEUE2.
1. QUEUE1 = {F, E, G}
2. QUEUE2 = {A, B, D, C}
6. Remove F from QUEUE1 and add all its neighbours. Since all of its neighbours have
already been added, we will not add them again. Add node F to QUEUE2.
1. QUEUE1 = {E, G}
2. QUEUE2 = {A, B, D, C, F}
7. Remove E from QUEUE1. All of E's neighbours have already been added to QUEUE1,
therefore we will not add them again. The target node E has now been processed and
added to QUEUE2, so the search stops.
1. QUEUE1 = {G}
2. QUEUE2 = {A, B, D, C, F, E}
Backtracking from E through the nodes in QUEUE2 that discovered it, the minimum path P
is A → B → C → E, with length 3.
The data structure used in DFS is a stack. The process is similar to the BFS
algorithm. In DFS, the edges that lead to an unvisited node are called discovery edges,
while the edges that lead to an already visited node are called back edges.
Algorithm
o Step 1: SET STATUS = 1 (ready state) for each node in G
o Step 2: Push the starting node A on the stack and set its STATUS = 2 (waiting
state)
o Step 3: Repeat Steps 4 and 5 until STACK is empty
o Step 4: Pop the top node N. Process it and set its STATUS = 3 (processed state)
o Step 5: Push on the stack all the neighbours of N that are in the ready state
(whose STATUS = 1) and set their
STATUS = 2 (waiting state)
[END OF LOOP]
o Step 6: EXIT
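The same procedure can be written iteratively in Python with an explicit stack. The adjacency lists below are a hypothetical reconstruction whose neighbour ordering matches the print order of the worked example that follows:

```python
def dfs(graph, start):
    """Depth first search: pop a node, process it, then push its
    ready neighbours, exactly as in the algorithm above."""
    status = {v: 1 for v in graph}   # 1 = ready
    stack = [start]
    status[start] = 2                # 2 = waiting
    order = []
    while stack:
        n = stack.pop()
        status[n] = 3                # 3 = processed
        order.append(n)
        for nb in graph[n]:
            if status[nb] == 1:      # push only ready neighbours
                status[nb] = 2
                stack.append(nb)
    return order

# Hypothetical graph; neighbour order chosen to match the example
graph = {
    'H': ['A'],
    'A': ['B', 'D', 'H'],
    'B': ['A', 'C'],
    'C': ['B', 'E', 'G'],
    'D': ['A', 'F'],
    'E': ['C'],
    'F': ['D'],
    'G': ['C'],
}
print(dfs(graph, 'H'))   # ['H', 'A', 'D', 'F', 'B', 'C', 'G', 'E']
```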
Example :
Consider the graph G along with its adjacency list, given in the figure below. Calculate
the order to print all the nodes of the graph starting from node H, by using depth first
search (DFS) algorithm.
Solution :
Push H onto the stack
1. STACK : H
Pop the top element of the stack, i.e. H, print it and push all the neighbours of H onto
the stack that are in the ready state.
1. Print H
2. STACK : A
Pop the top element of the stack i.e. A, print it and push all the neighbours of A onto
the stack that are in ready state.
1. Print A
2. Stack : B, D
Pop the top element of the stack i.e. D, print it and push all the neighbours of D onto
the stack that are in ready state.
1. Print D
2. Stack : B, F
Pop the top element of the stack i.e. F, print it and push all the neighbours of F onto the
stack that are in ready state.
1. Print F
2. Stack : B
Pop the top of the stack, i.e. B, and push all its neighbours that are in the ready state.
1. Print B
2. Stack : C
Pop the top of the stack i.e. C and push all the neighbours.
1. Print C
2. Stack : E, G
Pop the top of the stack i.e. G and push all its neighbours.
1. Print G
2. Stack : E
Pop the top of the stack i.e. E and push all its neighbours.
1. Print E
2. Stack :
Hence, the stack now becomes empty and all the nodes of the graph have been
traversed.
1. H → A → D → F → B → C → G → E
TOPOLOGICAL SORT:
Topological sorting of a directed acyclic graph is a linear ordering of its vertices such
that for every edge u-v of the graph, vertex u comes before vertex v in the ordering.
In a DFS, a vertex finishes only after all the vertices reachable from it have finished, so
we push each vertex onto a stack when it finishes. After completing all nodes, popping the
stack yields the vertices in topological order.
Input and Output
Input: (adjacency matrix of the graph)
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 1 0 0
0 1 0 0 0 0
1 1 0 0 0 0
1 0 1 0 0 0
Output:
Nodes in topologically sorted order: 5 4 2 3 1 0
Algorithm
topoSort(u, visited, stack)
Input: The start vertex u, an array to keep track of whether each node has been visited,
and a stack to store the nodes.
Output: The vertices pushed onto the stack in topological sequence.
Begin
   mark u as visited
   for all vertices v adjacent to u, do
      if v is not visited, then
         topoSort(v, visited, stack)
   done
   push u into the stack
End
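A runnable Python version of topoSort, operating on the adjacency matrix from the Input above; note that each vertex is pushed onto the stack only after all of its descendants have been pushed:

```python
def topo_sort(u, graph, visited, stack):
    """DFS from u; push u onto the stack after all of its
    descendants have been pushed."""
    visited[u] = True
    for v in range(len(graph)):
        if graph[u][v] and not visited[v]:
            topo_sort(v, graph, visited, stack)
    stack.append(u)

def topological_order(graph):
    n = len(graph)
    visited = [False] * n
    stack = []
    for u in range(n):              # cover every component
        if not visited[u]:
            topo_sort(u, graph, visited, stack)
    return stack[::-1]              # popping the stack gives the order

# The adjacency matrix from the Input above
graph = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
]
print(topological_order(graph))   # [5, 4, 2, 3, 1, 0]
```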
STRONGLY CONNECTED COMPONENTS:
A strongly connected component of a directed graph is a maximal set of vertices in which
every vertex is reachable from every other vertex. Kosaraju's algorithm finds the strongly
connected components of a graph as follows.
Kosaraju's Algorithm
1. Start from vertex-0, visit all of its child vertices, and mark the visited vertices as
done. If a vertex leads to an already visited vertex, then push this vertex onto the stack.
For example: starting from vertex-0, go to vertex-1, vertex-2, and then to vertex-3.
Vertex-3 leads to the already visited vertex-0, so push the source vertex (i.e. vertex-3)
onto the stack.
Continuing in the same way for the remaining vertices, a final stack is created.
2. Reverse the original graph
3. Start from the top vertex of the stack. Traverse through all of its child vertices. Once
an already visited vertex is reached, one strongly connected component is formed.
For example: pop vertex-0 from the stack. Starting from vertex-0, traverse through its
child vertices (vertex-0, vertex-1, vertex-2, vertex-3 in sequence) and mark them as
visited. The child of vertex-3 is already visited, so these visited vertices form one
strongly connected component.
4. Go to the stack and pop the top vertex if it is already visited. Otherwise, choose the
top vertex from the stack and traverse through its child vertices as presented above.
5. Repeat steps 3 and 4 until the stack is empty.
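The procedure can be sketched in Python. The 6-vertex graph in the usage example is hypothetical (a directed 4-cycle with a tail), not the graph from the figures:

```python
from collections import defaultdict

def kosaraju_scc(n, edges):
    """Kosaraju's algorithm: (1) DFS pushing each vertex when it finishes,
    (2) reverse the graph, (3) DFS in stack order on the reversed graph;
    each DFS tree found in step 3 is one strongly connected component."""
    graph = defaultdict(list)
    rev = defaultdict(list)
    for u, v in edges:
        graph[u].append(v)
        rev[v].append(u)

    visited = [False] * n
    stack = []

    def fill(u):
        visited[u] = True
        for v in graph[u]:
            if not visited[v]:
                fill(v)
        stack.append(u)            # pushed when u is finished

    for u in range(n):
        if not visited[u]:
            fill(u)

    visited = [False] * n
    comps = []

    def collect(u, comp):
        visited[u] = True
        comp.append(u)
        for v in rev[u]:
            if not visited[v]:
                collect(v, comp)

    while stack:
        u = stack.pop()
        if not visited[u]:
            comp = []
            collect(u, comp)       # one component per tree
            comps.append(sorted(comp))
    return comps

# Hypothetical graph: cycle 0->1->2->3->0 plus a tail 3->4->5
print(kosaraju_scc(6, [(0, 1), (1, 2), (2, 3), (3, 0), (3, 4), (4, 5)]))
# [[0, 1, 2, 3], [4], [5]]
```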
MINIMUM SPANNING TREE:
The minimum spanning tree has direct applications in the design of networks. It is used in
algorithms approximating the travelling salesman problem, the multi-terminal minimum cut
problem and minimum-cost weighted perfect matching. Other practical applications are:
1. Cluster Analysis
2. Handwriting recognition
3. Image segmentation
There are two famous algorithms for finding the Minimum Spanning Tree:
Kruskal’s Algorithm
Kruskal’s Algorithm builds the spanning tree by adding edges one by one into a growing
spanning tree. Kruskal's algorithm follows a greedy approach: in each iteration it finds
the edge with the least weight and adds it to the growing spanning tree.
Algorithm Steps:
1. Sort the graph edges with respect to their weights.
2. Start adding edges to the MST from the edge with the smallest weight, skipping any
edge that would create a cycle, until all vertices are connected.
Cycle checking could be done using DFS, starting from one endpoint of the edge and
checking whether the other endpoint is reachable. But running a DFS for every edge makes
the time complexity large, as each DFS takes O(V + E), where V is the number of vertices
and E is the number of edges. So the better solution is "Disjoint Sets":
Disjoint sets are sets whose intersection is the empty set, meaning they don't have any
element in common. An edge is safe to add exactly when its two endpoints lie in different
sets.
Consider running Kruskal's algorithm on a graph whose edge weights are 1, 2, 3, 4, 5 and
higher. First we select the lowest weighted edge, i.e., the edge with weight 1. After
that we select the second lowest weighted edge, i.e., the edge with weight 2. Notice
that these two edges are totally disjoint. The next edge is the third lowest weighted
edge, i.e., the edge with weight 3, which connects the two disjoint pieces of the graph.
Now, we are not allowed to pick the edge with weight 4, as that would create a cycle and
we can’t have any cycles. So we select the fifth lowest weighted edge, i.e., the edge
with weight 5. The other two edges would create cycles, so we ignore them. In the end,
we end up with a minimum spanning tree with total cost 11 (= 1 + 2 + 3 + 5).
Time Complexity:
In Kruskal’s algorithm, the most time-consuming operation is sorting the edges, which
takes O(E log E). The Disjoint-Set operations take O(E log V) in total, so the overall
time complexity of the algorithm is O(E log E), which is the same as O(E log V) since
E ≤ V^2.
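A minimal Python sketch of Kruskal's algorithm with a disjoint-set (union-find) structure. The edge list is a hypothetical graph constructed so that the chosen weights are 1, 2, 3 and 5, as in the walkthrough above:

```python
def kruskal(n, edges):
    """Kruskal's MST: scan edges in increasing weight order and keep an
    edge only if its endpoints are in different disjoint sets."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path compression
            x = parent[x]
        return x

    mst, total = [], 0
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:                        # different sets => no cycle
            parent[ru] = rv                 # union
            mst.append((u, v, w))
            total += w
    return mst, total

# Hypothetical graph: edges given as (weight, u, v)
edges = [(1, 0, 1), (2, 2, 3), (3, 1, 2), (4, 0, 3),
         (5, 3, 4), (6, 0, 4), (7, 1, 4)]
mst, total = kruskal(5, edges)
print(total)   # 11  (edges with weights 1, 2, 3 and 5)
```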
Prim’s Algorithm
Prim’s Algorithm also uses a greedy approach to find the minimum spanning tree. In Prim’s
Algorithm we grow the spanning tree from a starting vertex. Unlike Kruskal's, which adds
an edge in each step, Prim's adds a vertex to the growing spanning tree in each step.
Algorithm Steps:
Maintain two disjoint sets of vertices. One containing vertices that are in the growing spanning
tree and other that are not in the growing spanning tree.
Select the cheapest vertex that is connected to the growing spanning tree and is not in the
growing spanning tree and add it into the growing spanning tree. This can be done using
Priority Queues. Insert the vertices, that are connected to growing spanning tree, into the
Priority Queue.
Check for cycles. To do that, mark the nodes which have been already selected and insert
only those nodes in the Priority Queue that are not marked.
Suppose we start from a vertex whose cheapest incident edge has weight 1; we select that
edge and mark the new vertex. In the next iteration we have three options, edges with
weights 2, 3 and 4, so we select the edge with weight 2 and mark the vertex. Now again we
have three options, edges with weights 3, 4 and 5, but we can’t choose the edge with
weight 3 as it would create a cycle. So we select the edge with weight 4, and we end up
with a minimum spanning tree of total cost 7 (= 1 + 2 + 4).
Time Complexity:
The time complexity of Prim’s Algorithm is O((V + E) log V), because each vertex is
inserted into the priority queue only once and insertion into a priority queue takes
logarithmic time.
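A Python sketch of Prim's algorithm using `heapq` as the priority queue. The graph is hypothetical, chosen so that the run mirrors the walkthrough above (edges with weights 1, 2 and 4 are picked, total cost 7):

```python
import heapq

def prim(graph, start):
    """Prim's MST: repeatedly take the cheapest edge leaving the growing
    tree; already-marked vertices are skipped, which avoids cycles."""
    visited = {start}
    pq = [(w, start, v) for v, w in graph[start]]
    heapq.heapify(pq)
    mst, total = [], 0
    while pq and len(visited) < len(graph):
        w, u, v = heapq.heappop(pq)
        if v in visited:                 # edge would close a cycle
            continue
        visited.add(v)
        mst.append((u, v, w))
        total += w
        for x, wx in graph[v]:
            if x not in visited:
                heapq.heappush(pq, (wx, v, x))
    return mst, total

# Hypothetical graph: vertex -> list of (neighbour, weight)
graph = {
    0: [(1, 1), (2, 2)],
    1: [(0, 1), (2, 3), (3, 4)],
    2: [(0, 2), (1, 3), (3, 5)],
    3: [(1, 4), (2, 5)],
}
mst, total = prim(graph, 0)
print(total)   # 7  (= 1 + 2 + 4)
```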
SINGLE SOURCE SHORTEST PATHS:
We define the shortest-path weight from u to v by δ(u, v) = min{w(p) : p is a path
from u to v} if there is a path from u to v, and δ(u, v) = ∞ otherwise.
A shortest path from vertex s to vertex t is then defined as any path p with weight
w(p) = δ(s, t).
The breadth-first search algorithm is a shortest-path algorithm that works on
unweighted graphs, that is, graphs in which each edge can be considered to have unit
weight.
In a Single Source Shortest Paths Problem, we are given a Graph G = (V, E), we
want to find the shortest path from a given source vertex s ∈ V to every vertex v ∈ V.
Variants:
There are some variants of the shortest path problem: the single-destination variant
(find shortest paths to a given destination from every vertex), the single-pair variant
(find a shortest path between one given pair of vertices), and the all-pairs variant
(find shortest paths between every pair of vertices).
Relaxation
The single-source shortest path algorithms are based on a technique known as relaxation,
a method that repeatedly decreases an upper bound on the actual shortest-path weight of
each vertex until the upper bound equals the shortest-path weight. For each vertex
v ∈ V, we maintain an attribute d [v], which is an upper bound on the weight of the
shortest path from source s to v. We call d [v] the shortest-path estimate.
After initialization, π [v] = NIL for all v ∈ V, d [v] = 0 for v = s, and d [v] = ∞ for v ∈ V
- {s}.
The process of relaxing an edge (u, v) consists of testing whether we can improve the
shortest path to v found so far by going through u and, if so, updating d [v] and
π [v]. A relaxation step may decrease the value of the shortest-path estimate d [v] and
update v's predecessor field π [v].
Fig: Relaxing an edge (u, v) with weight w(u, v) = 2. The shortest-path estimate of
each vertex appears within the vertex.
(a) Because v.d > u.d + w(u, v) prior to relaxation, the value of v.d decreases.
(b) Here, v.d ≤ u.d + w(u, v) before relaxing the edge, and so the relaxation
step leaves v.d unchanged.
RELAX (u, v, w)
1. If d [v] > d [u] + w (u, v)
2. then d [v] ← d [u] + w (u, v)
3. π [v] ← u
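RELAX translates directly into Python. The numbers below follow case (a) of the figure, assuming u.d = 5 and w(u, v) = 2; the starting estimate v.d = 9 is an assumed value for illustration:

```python
def relax(u, v, w, d, pi):
    """Lower d[v] and update v's predecessor if the path through u
    is shorter than the current estimate."""
    if d[v] > d[u] + w[(u, v)]:
        d[v] = d[u] + w[(u, v)]
        pi[v] = u

d = {'u': 5, 'v': 9}                 # shortest-path estimates
pi = {'u': None, 'v': None}          # predecessor fields
relax('u', 'v', {('u', 'v'): 2}, d, pi)
print(d['v'], pi['v'])   # 7 u  -- the estimate decreased
```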
Dijkstra's Algorithm:
Dijkstra's algorithm finds the shortest paths from a given source vertex to the other
vertices of a graph with non-negative edge weights.
Dijkstra's algorithm differs from a minimum spanning tree because the shortest path
between two vertices might not include all the vertices of the graph.
Dijkstra's algorithm works on the basis that any subpath B → D of a shortest path
A → D between vertices A and D is itself a shortest path between vertices B and D.
Dijkstra used this property in the opposite direction: we overestimate the distance of
each vertex from the starting vertex. Then we visit each node and its neighbours to find
the shortest subpath to those neighbours.
The algorithm uses a greedy approach in the sense that we find the next best solution
hoping that the end result is the best solution for the whole problem.
Dijkstra's Algorithm (G, w, s)
1. INITIALIZE - SINGLE - SOURCE (G, s)
2. S←∅
3. Q←V [G]
4. while Q ≠ ∅
5. do u ← EXTRACT - MIN (Q)
6. S ← S ∪ {u}
7. for each vertex v ∈ Adj [u]
8. do RELAX (u, v, w)
Dijkstra's algorithm is used, for example, for routing in telephone networks.
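A Python sketch of the pseudocode above, using `heapq` in place of EXTRACT-MIN; the weighted graph is hypothetical:

```python
import heapq
import math

def dijkstra(graph, s):
    """Dijkstra: repeatedly extract the unsettled vertex with the
    smallest estimate and relax each edge leaving it."""
    d = {v: math.inf for v in graph}   # INITIALIZE-SINGLE-SOURCE
    d[s] = 0
    pq = [(0, s)]
    settled = set()                    # the set S
    while pq:
        _, u = heapq.heappop(pq)       # EXTRACT-MIN(Q)
        if u in settled:
            continue
        settled.add(u)                 # S = S ∪ {u}
        for v, w in graph[u]:          # RELAX(u, v, w)
            if d[u] + w < d[v]:
                d[v] = d[u] + w
                heapq.heappush(pq, (d[v], v))
    return d

# Hypothetical directed graph: vertex -> list of (neighbour, weight)
graph = {
    'A': [('B', 4), ('C', 1)],
    'B': [('D', 1)],
    'C': [('B', 2), ('D', 5)],
    'D': [],
}
print(dijkstra(graph, 'A'))   # {'A': 0, 'B': 3, 'C': 1, 'D': 4}
```

Instead of a DECREASE-KEY operation, this sketch pushes a fresh entry whenever an estimate improves and discards stale entries on extraction, a common simplification with `heapq`.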
Bellman-Ford Algorithm:
The Bellman-Ford algorithm is similar to Dijkstra's algorithm, but it can work with
graphs in which edges have negative weights. It overestimates the distance from the
source to every vertex and then relaxes every edge of the graph; by repeating this
relaxation pass |V| - 1 times, we can guarantee that the result is optimized.
BELLMAN -FORD (G, w, s)
1. INITIALIZE - SINGLE - SOURCE (G, s)
2. for i ← 1 to |V[G]| - 1
3. do for each edge (u, v) ∈ E [G]
4. do RELAX (u, v, w)
5. for each edge (u, v) ∈ E [G]
6. do if d [v] > d [u] + w (u, v)
7. then return FALSE.
8. return TRUE.
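The pseudocode as runnable Python; the graph is hypothetical, with one negative edge but no negative cycle:

```python
import math

def bellman_ford(vertices, edges, s):
    """Relax every edge |V| - 1 times, then make one more pass to
    detect a reachable negative-weight cycle (returning None if found)."""
    d = {v: math.inf for v in vertices}
    d[s] = 0
    for _ in range(len(vertices) - 1):
        for u, v, w in edges:
            if d[u] + w < d[v]:        # RELAX(u, v, w)
                d[v] = d[u] + w
    for u, v, w in edges:
        if d[u] + w < d[v]:            # still improvable: negative cycle
            return None
    return d

# Hypothetical edge list: (u, v, weight)
edges = [('s', 'a', 4), ('s', 'b', 5), ('a', 'b', -3), ('b', 't', 2)]
print(bellman_ford(['s', 'a', 'b', 't'], edges, 's'))
# {'s': 0, 'a': 4, 'b': 1, 't': 3}
```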
Shortest Paths in Directed Acyclic Graphs:
In a DAG, we can first topologically sort the vertices (line 1 of the DAG-SHORTEST-PATHS
procedure) and then relax the edges leaving each vertex in that order (the for loop of
lines 3 - 5). The topological sort can be implemented in Θ(V + E) time. In the for loop
of lines 3 - 5, as in Dijkstra's algorithm, there is one iteration per vertex. For each
vertex, the edges that leave the vertex are each examined exactly once. Unlike Dijkstra's
algorithm, we use only O(1) time per edge. The running time is thus Θ(V + E), which is
linear in the size of an adjacency-list representation of the graph.
Example:
Step 1: To topologically sort the vertices, apply DFS (Depth First Search) and then
arrange the vertices in linear order by decreasing finish time.
Now, take each vertex in topologically sorted order and relax each edge.
1. adj [s] →t, x
2. 0 + 3 < ∞
3. d [t] ← 3
4. 0 + 2 < ∞
5. d [x] ← 2
1. adj [t] → r, x
2. 3 + 1 < ∞
3. d [r] ← 4
4. 3 + 5 > 2, so d [x] is unchanged
1. adj [x] → y
2. 2 - 3 < ∞
3. d [y] ← -1
1. adj [y] → r
2. -1 + 4 < 4
3. 3 <4
4. d [r] ← 3
Thus the Shortest Path is:
1. s to x is 2
2. s to y is -1
3. s to t is 3
4. s to r is 3
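The whole procedure can be sketched in Python, with the vertices already in topological order and the edge weights read off the relaxation steps above:

```python
import math

def dag_shortest_paths(order, edges, s):
    """Shortest paths in a DAG: take vertices in topological order and
    relax each outgoing edge exactly once, for Θ(V + E) total work."""
    adj = {u: [] for u in order}
    for u, v, w in edges:
        adj[u].append((v, w))
    d = {v: math.inf for v in order}
    d[s] = 0
    for u in order:                    # topological order
        for v, w in adj[u]:
            if d[u] + w < d[v]:        # RELAX(u, v, w)
                d[v] = d[u] + w
    return d

# Edge weights inferred from the relaxation steps of the example
edges = [('s', 't', 3), ('s', 'x', 2), ('t', 'r', 1), ('t', 'x', 5),
         ('x', 'y', -3), ('y', 'r', 4)]
print(dag_shortest_paths(['s', 't', 'x', 'y', 'r'], edges, 's'))
# {'s': 0, 't': 3, 'x': 2, 'y': -1, 'r': 3}
```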
Algorithm         Cost
Floyd-Warshall    O(V^3)
Floyd-Warshall Algorithm:
Floyd-Warshall Algorithm is an algorithm for finding the shortest path between all the
pairs of vertices in a weighted graph. This algorithm works for both the directed and
undirected weighted graphs. But, it does not work for the graphs with negative cycles
(where the sum of the edges in a cycle is negative).
A weighted graph is a graph in which each edge has a numerical value associated with it.
Initial graph
Follow the steps below to find the shortest path between all the pairs of vertices.
1. Create a matrix A0 of dimension n*n, where n is the number of vertices. The rows and
the columns are indexed as i and j respectively, where i and j are vertices of the graph.
Each cell A0[i][j] is filled with the distance from the ith vertex to the jth vertex. If
there is no edge from the ith vertex to the jth vertex, the cell is left as infinity.
2. Now, create a matrix A1 using matrix A0. The elements in the first column and the
first row are left as they are. The remaining cells are filled in the following way.
Let k be the intermediate vertex in the shortest path from source to destination. In this
step, k is the first vertex. A1[i][j] is filled with A0[i][k] + A0[k][j] if
A0[i][j] > A0[i][k] + A0[k][j].
That is, if the direct distance from the source to the destination is greater than the
path through the vertex k, then the cell is filled with A0[i][k] + A0[k][j].
For example: for A1[2, 4], the direct distance from vertex 2 to 4 is 4, and the sum of
the distances from vertex 2 to 4 through vertex 1 (i.e. from vertex 2 to 1 and from
vertex 1 to 4) is 7. Since 4 < 7, A1[2, 4] is filled with 4.
3. Similarly, A2 is created using A1. The elements in the second column and the second
row are left as they are. In this step, k is the second vertex (i.e. vertex 2). The
remaining steps are the same as in step 2.
4. Similarly, A3 and A4 are also created, taking k as vertex 3 and vertex 4 respectively.
5. A4 gives the shortest path between each pair of vertices.
Floyd-Warshall Algorithm
n = no of vertices
A = initial distance matrix
for k = 1 to n
   for i = 1 to n
      for j = 1 to n
         A[i][j] = min(A[i][j], A[i][k] + A[k][j])
return A
There are three nested loops, each running n times, with constant work per iteration.
So, the time complexity of the Floyd-Warshall algorithm is O(n^3).
Space Complexity
The space complexity of the Floyd-Warshall algorithm is O(n^2), for the n × n distance
matrix.
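A runnable Python version of the three nested loops; the 4-vertex distance matrix is a hypothetical example, not the graph from the figures:

```python
import math

INF = math.inf

def floyd_warshall(dist):
    """All-pairs shortest paths: for each intermediate vertex k, check
    whether going through k shortens the i -> j distance."""
    n = len(dist)
    a = [row[:] for row in dist]       # A0 is the initial matrix
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if a[i][k] + a[k][j] < a[i][j]:
                    a[i][j] = a[i][k] + a[k][j]
    return a

# Hypothetical weighted digraph given as a distance matrix
dist = [
    [0,   3,   INF, 7],
    [8,   0,   2,   INF],
    [5,   INF, 0,   1],
    [2,   INF, INF, 0],
]
result = floyd_warshall(dist)
print(result[1][3])   # 3  (path 1 -> 2 -> 3)
```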
P-Class (Polynomial)
The class P consists of those problems that are solvable in polynomial time, i.e. these
problems can be solved in time O(n^k) in the worst case, where k is a constant.
These problems are called tractable, while others are called intractable or
superpolynomial.
Formally, an algorithm is a polynomial-time algorithm if there exists a polynomial p(n)
such that the algorithm can solve any instance of size n in time O(p(n)).
Problems requiring Ω(n^50) time to solve are essentially intractable for large n. Most
known polynomial-time algorithms run in time O(n^k) for fairly low values of k.
The advantage of considering the class of polynomial-time algorithms is that all
reasonable deterministic single-processor models of computation can be simulated on each
other with at most a polynomial slowdown.
NP-Class (Nondeterministic Polynomial)
The class NP consists of those problems that are verifiable in polynomial time. NP is the class of
decision problems for which it is easy to check the correctness of a claimed answer, with the aid of
a little extra information. Hence, we aren’t asking for a way to find a solution, but only to verify that
an alleged solution really is correct.
Every problem in this class can be solved in exponential time using exhaustive search.
P versus NP
Every decision problem that is solvable by a deterministic polynomial time algorithm is also solvable
by a polynomial time non-deterministic algorithm.
All problems in P can be solved with polynomial-time algorithms, whereas the problems in
NP − P are believed to be intractable.
It is not known whether P = NP. However, many problems are known in NP with the property that if
they belong to P, then it can be proved that P = NP.
If P ≠ NP, there are problems in NP that are neither in P nor in NP-Complete.
The problem belongs to class P if it’s easy to find a solution for the problem. The problem belongs
to NP, if it’s easy to check a solution that may have been very tedious to find.
If a polynomial time algorithm exists for any of these problems, all problems in NP would be
polynomial time solvable. These problems are called NP-complete. The phenomenon of NP-
completeness is important for both theoretical and practical reasons.
Definition of NP-Completeness
A language B is NP-complete if it satisfies two conditions
B is in NP
Every A in NP is polynomial time reducible to B.
If a language satisfies the second property, but not necessarily the first one, the language B is
known as NP-Hard. Informally, a search problem B is NP-Hard if there exists some NP-
Complete problem A that Turing reduces to B.
A problem in NP-Hard cannot be solved in polynomial time unless P = NP. If a problem is
proved to be NPC, there is no need to waste time trying to find an efficient exact
algorithm for it. Instead, we can focus on designing approximation algorithms.
NP-Complete Problems
Following are some NP-Complete problems, for which no polynomial-time algorithm is known:
1. Determining whether a graph has a Hamiltonian cycle
2. Determining whether a Boolean formula is satisfiable (SAT)
NP-Hard Problems
The following problems are NP-Hard:
1. The circuit-satisfiability problem
2. Set Cover
3. Vertex Cover
4. Travelling Salesman Problem (optimization version)
TSP is NP-Complete
The traveling salesman problem consists of a salesman and a set of cities. The salesman has to
visit each one of the cities starting from a certain one and returning to the same city. The challenge
of the problem is that the traveling salesman wants to minimize the total length of the trip.
Proof
To prove TSP is NP-Complete, first we have to prove that TSP belongs to NP. Given a
proposed tour, we check that the tour contains each vertex exactly once, calculate the
total cost of the edges of the tour, and check whether the cost is at most a given
bound. This can be completed in polynomial time. Thus the decision version of TSP
belongs to NP.
Secondly, we have to prove that TSP is NP-hard. To prove this, one way is to show
that Hamiltonian cycle ≤p TSP (as we know that the Hamiltonian cycle problem is
NP-complete).
Assume G = (V, E) to be an instance of Hamiltonian cycle.
Hence, an instance of TSP is constructed. We create the complete graph G' = (V, E'), where
E' = {(i, j) : i, j ∈ V and i ≠ j}
Thus, the cost function is defined as follows:
t(i, j) = 0 if (i, j) ∈ E, and t(i, j) = 1 otherwise.
Now, suppose that a Hamiltonian cycle h exists in G. It is clear that the cost of each edge
in h is 0 in G' as each edge belongs to E. Therefore, h has a cost of 0 in G'. Thus, if graph G has a
Hamiltonian cycle, then graph G' has a tour of 0 cost.
Conversely, we assume that G' has a tour h' of cost at most 0. The costs of the edges
in E' are 0 and 1 by definition, so each edge of h' must have a cost of 0, since the
cost of h' is 0. We therefore conclude that h' contains only edges in E.
We have thus proven that G has a Hamiltonian cycle, if and only if G' has a tour of cost at most 0.
TSP is NP-complete.