DSA Unit 5
DSA Unit 5
GRAPH:
A Graph is a non-linear data structure that is the same as the mathematical (discrete
mathematics) concept of graphs, consisting of nodes and edges. The nodes are sometimes also
referred to as vertices and the edges are lines or arcs that connect any two nodes in the graph.
Graphs are used to represent the arbitrary relationship among objects. It can be either directed or
undirected.
then,
V(G)={A,B,C,D,E,F}
E(G)={(AB),(AC),(BD),(BC),(DE),(DF),(EF)}
GRAPH TERMINOLOGIES:
● Path: A path can be defined as the sequence of nodes that are followed in order to reach
some terminal node V from the initial node U.
● Closed Path: A path will be called a closed path if the initial node is the same as a
terminal node. A path will be closed path if V0=VN.
● Simple Path: If all the nodes of the graph are distinct with an exception V0=VN, then
such path P is called a closed simple path.
● Cycle: A cycle can be defined as the path which has no repeated edges or vertices except
the first and last vertices.
There are several types of graphs available, but some graphs are more commonly used, they are:
● Directed graph
● Undirected graph
● Weighted graph
● Unweighted graph
In a directed graph, edges form an ordered pair. Edges represent a specific path from some vertex
A to another vertex B. Node A is called initial node while node B is called terminal node.
WEIGHTED AND UNWEIGHTED GRAPH:
A weighted graph is a graph in which each branch is given a numerical weight. A
weighted graph is therefore a special type of labelled graph in which the labels are numbers
(which are usually taken to be positive). Whereas Unweighted graph doesn’t have any numerical
weight to it.
Representations of Graph
Here are the two most common ways to represent a graph :
1. Adjacency Matrix
2. Adjacency List
Adjacency Matrix
An adjacency matrix is a way of representing a graph as a matrix of boolean (0’s and 1’s).
Let’s assume there are n vertices in the graph So, create a 2D matrix adjMat[n][n] having
dimension n x n.
● If there is an edge from vertex i to j, mark adjMat[i][j] as 1.
● If there is no edge from vertex i to j, mark adjMat[i][j] as 0.
Representation of Undirected Graph to Adjacency Matrix:
The below figure shows an undirected graph. Initially, the entire Matrix is initialized to 0. If
there is an edge from source to destination, we insert 1 to both cases (adjMat[destination] and
adjMat[destination]) because we can go either way.
Undirected Graph to Adjacency Matrix
Representation of Directed Graph to Adjacency Matrix:
The below figure shows a directed graph. Initially, the entire Matrix is initialized to 0. If there is
an edge from source to destination, we insert 1 for that particular adjMat[destination].
Adjacency List
An array of Lists is used to store edges between two vertices. The size of array is equal to the
number of vertices (i.e, n). Each index in this array represents a specific vertex in the graph. The
entry at the index i of the array contains a linked list containing the vertices that are adjacent to
vertex i.
Let’s assume there are n vertices in the graph So, create an array of list of size n as adjList[n].
● adjList[0] will have all the nodes which are connected (neighbour) to vertex 0.
● adjList[1] will have all the nodes which are connected (neighbour) to vertex 1 and so
on.
Representation of Undirected Graph to Adjacency list:
The below undirected graph has 3 vertices. So, an array of list will be created of size 3, where
each indices represent the vertices. Now, vertex 0 has two neighbours (i.e, 1 and 2). So, insert
vertex 1 and 2 at indices 0 of array. Similarly, For vertex 1, it has two neighbour (i.e, 2 and 1) So,
insert vertices 2 and 1 at indices 1 of array. Similarly, for vertex 2, insert its neighbours in array
of list.
Topological sorting for Directed Acyclic Graph (DAG) is a linear ordering of vertices such that
for every directed edge u-v, vertex u comes before v in the ordering.
Output: 5 4 2 3 1 0
Explanation: The first vertex in topological sorting is always a vertex with an in-degree of 0 (a
vertex with no incoming edges). A topological sorting of the following graph is “5 4 2 3 1 0”.
There can be more than one topological sorting for a graph. Another topological sorting of the
following graph is “4 5 2 3 1 0”.
Step 3:
● In this step , We choose node 2 because it has minimum number of adjacent nodes
after 0 and 1 .
● We call DFS for node 2 and push all the nodes which comes in traversal from node 2
in reverse order.
● So push 3 then push 2 .
Step 4:
● We now call DFS for node 4
● Because 0 and 1 already present in the stack so we just push node 4 in the stack and
return.
Step 5:
● In this step because all the adjacent nodes of 5 is already in the stack we push node 5
in the stack and return.
Step 6: This is the final step of the Topological sorting in which we pop all the element from the
stack and print it in that order .
Given a Directed Acyclic Graph having V vertices and E edges, your task is to find any
Topological Sorted order of the graph.
Topological Sorted order: It is a linear ordering of vertices such that for every directed edge u
-> v, where vertex u comes before v in the ordering.
Example:
Input: V=6 , E = {{2,3},{3,1},{4,0},{4,1},{5,0},{5,2}}
Output: 5 4 2 3 1 0
Explanation: In the above output, each dependent vertex is printed after the vertices it depends
upon.
Input: V=5 , E={{0,1},{1,2},{3,2},{3,4}}
Output: 0 3 4 1 2
Explanation: In the above output, each dependent vertex is printed after the vertices it depends
upon.
Algorithm:
● Add all nodes with in-degree 0 to a queue.
● While the queue is not empty:
● Remove a node from the queue.
● For each outgoing edge from the removed node, decrement the in-degree
of the destination node by 1.
● If the in-degree of a destination node becomes 0, add it to the queue.
● If the queue is empty and there are still nodes in the graph, the graph contains a cycle
and cannot be topologically sorted.
● The nodes in the queue represent the topological ordering of the graph.
Complexity Analysis:
● Time Complexity: O(V+E).
The outer for loop will be executed V number of times and the inner for loop will be
executed E number of times.
● Auxiliary Space: O(V).
The queue needs to store all the vertices of the graph. So the space required is O(V)
Introduction
In graph theory, a Minimum Spanning Tree (MST) for a weighted, connected, undirected graph
is a spanning tree with a weight less than or equal to the weight of every other spanning tree. The
weight of a spanning tree is the sum of weights given to each edge of the spanning tree.
Objective
To find the Minimum Spanning Tree of a given connected, undirected graph minimizing the total
edge weight.
Prim's Algorithm
Prim's algorithm is a greedy algorithm that finds a minimum spanning tree for a weighted
undirected graph. The algorithm operates by building the spanning tree one vertex at a time,
from an arbitrary starting vertex, at each step adding the cheapest possible connection from the
tree to another vertex.
Algorithm Steps
Initialization:
● Select an arbitrary vertex to start the tree from.
● Set the key value of the chosen vertex to 0 and to infinity for all other vertices.
● Set the parent of all vertices as -1 (or NULL).
Growing the MST:
● While the MST is not complete (not all vertices are included):
● Pick the vertex with the minimum key value that is not already included in
MST. Let's call this vertex u.
● Add u to the MST.
● Update the key value of all adjacent vertices of u. For every adjacent
vertex v, if the weight of the edge u-v is less than the current key value of
v, update the key value of v to the weight of u-v and set u as its parent.
Termination:
● The algorithm terminates when all the vertices are included in the MST.
Properties
● Greedy Method: Prim's algorithm follows the greedy approach, choosing the nearest
vertex not already included in the MST.
● Edge Weights: Can work with both negative and positive edge weights, as long as the
graph does not contain any cycles.
function primMST(graph)
for each vertex v in graph
key[v] = infinity
parent[v] = NULL
pick any initial vertex u
key[u] = 0
// The MST can be formed by the edges (v, parent[v]) for all vertices v
Prim's algorithm efficiently finds the minimum spanning tree of a connected, undirected
graph, enabling the construction of minimal paths or networks. The choice of data
structure for the priority queue can significantly affect the algorithm's performance,
making it adaptable to different graph sizes and densities.
Algorithm Analysis
Time Complexity
● Represents the amount of time an algorithm takes to complete as a function of the length
of the input.
● Usually expressed in Big O notation, e.g., O(n), O(log n).
Space Complexity
● Represents the amount of memory an algorithm uses as a function of the length of the
input.
● Also often expressed in Big O notation.
Best, Worst, and Average Case Analysis
● Best Case: The scenario where the algorithm performs the minimum number of steps.
● Worst Case: The scenario where the algorithm performs the maximum number of steps.
● Average Case: The expected number of steps the algorithm will take over all possible
inputs.
Greedy algorithms
Greedy algorithms are a class of algorithms that make locally optimal choices at each step with
the hope of finding a global optimum solution. In these algorithms, decisions are made based on
the information available at the current moment without considering the consequences of these
decisions in the future. The key idea is to select the best possible choice at each step, leading to a
solution that may not always be the most optimal but is often good enough for many problems.
The general structure of a greedy algorithm can be summarized in the following steps:
1. Identify the problem as an optimization problem where we need to find the best
solution among a set of possible solutions.
2. Determine the set of feasible solutions for the problem.
3. Identify the optimal substructure of the problem, meaning that the optimal solution to
the problem can be constructed from the optimal solutions of its subproblems.
4. Develop a greedy strategy to construct a feasible solution step by step, making the
locally optimal choice at each step.
Prove the correctness of the algorithm by showing that the locally optimal choices at
each step lead to a globally optimal solution.
For example: consider the Fractional Knapsack Problem. The local optimal strategy is to choose
the item that has maximum value vs weight ratio. This strategy also leads to a globally optimal
solution because we are allowed to take fractions of an item.
Fractional Knapsack
Fractional Knapsack Problem
Given the weights and profits of N items, in the form of {profit, weight} put these items in a
knapsack of capacity W to get the maximum total profit in the knapsack. In Fractional
Knapsack, we can break items for maximizing the total value of the knapsack.
Backtracking Algorithms
Backtracking algorithms are like problem-solving strategies that help explore different options
to find the best solution. They work by trying out different paths and if one doesn’t work, they
backtrack and try another until they find the right one. It’s like solving a puzzle by testing
different pieces until they fit together perfectly.
Backtracking is a systematic method of trying out various sequences of decisions until you find
out that works. Let's understand through an example.
We start with a start node. First, we move to node A. Since it is not a feasible solution so we
move to the next node, i.e., B. B is also not a feasible solution, and it is a dead-end so we
backtrack from node B to node A.
Suppose another path exists from node A to node C. So, we move from node A to node C. It is
also a dead-end, so again backtrack from node C to node A. We move from node A to the
starting node.
Now we will check if any other path exists from the starting node. So, we move from start node
to node D. Since it is not a feasible solution, we move from node D to node E. Node E is also not
a feasible solution. It is a dead end so we backtrack from node E to node D.
Suppose another path exists from node D to node F. So, we move from node D to node F. Since it
is not a feasible solution and it's a dead-end, we check for another path from node F.
Suppose there is another path exists from the node F to node G so move from node F to node G.
The node G is a success node.
The terms related to the backtracking are:
○ Live node: The nodes that can be further generated are known as live nodes.
○ E node: The nodes whose children are being generated and become a success node.
○ Success node: The node is said to be a success node if it provides a feasible solution.
○ Dead node: The node which cannot be further generated and also does not provide a
feasible solution is known as a dead node.
Many problems can be solved by backtracking strategy, and that problems satisfy complex set of
constraints, and these constraints are of two types:
○ Implicit constraint: It is a rule in which how each element in a tuple is related.
○ Explicit constraint: The rules that restrict each element to be chosen from the given set.
Applications of Backtracking
○ N-queen problem
○ Sum of subset problem
○ Graph coloring
○ Hamiliton cycle
Steps: Divide the unsorted list into n sublists, each containing one element (a list of one element
is considered sorted).
Repeatedly merge sublists to produce new sorted sublists until there is only one sublist
remaining. This will be the sorted list.
Algorithm MergeSort(arr, left, right)
if left < right
// Find the middle point to divide the array into two halves
middle = (left + right) / 2
Merge Function: