
UNIT-3

GRAPH TRAVERSAL

Parameters: BFS vs DFS

Stands for: BFS stands for Breadth First Search; DFS stands for Depth First Search.

Data Structure: BFS (Breadth First Search) uses a Queue data structure for finding the shortest path; DFS (Depth First Search) uses a Stack data structure.

Definition: BFS is a traversal approach in which we first walk through all nodes on the same level before moving on to the next level. DFS is a traversal approach in which the traversal begins at the root node and proceeds through the nodes as far as possible until we reach a node with no unvisited adjacent nodes.

Conceptual Difference: BFS builds the tree level by level; DFS builds the tree sub-tree by sub-tree.

Approach used: BFS works on the concept of FIFO (First In First Out); DFS works on the concept of LIFO (Last In First Out).

Suitable for: BFS is more suitable for searching vertices closer to the given source; DFS is more suitable when there are solutions away from the source.

Applications: BFS is used in various applications such as bipartite graphs, shortest paths, etc. If the weight of every edge is the same, BFS gives the shortest path from the source to every other vertex. DFS is used in various applications such as acyclic graphs and finding strongly connected components. There are many applications where both BFS and DFS can be used, like Topological Sorting, Cycle Detection, etc.

BFS ALGO:

Algorithm:

1. Initialization:

o Create a queue and add the start_node to it.

o Create a visited set (or array) and add start_node to it.

2. Traversal:

o While the queue is not empty:

▪ Dequeue a current_node from the front of the queue.

▪ Process the current_node (e.g., print it, check if it's the target).
▪ For each neighbor of current_node:

▪ If neighbor has not been visited:

▪ Add neighbor to the visited set.

▪ Enqueue neighbor to the queue.

Pseudocode:

BFS(graph, start_node):
    queue = Queue()
    queue.enqueue(start_node)
    visited = Set()
    visited.add(start_node)

    while not queue.is_empty():
        current_node = queue.dequeue()
        print(current_node)   // Or perform other processing

        for neighbor in graph.get_neighbors(current_node):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.enqueue(neighbor)
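A minimal runnable Python version of the same idea, using collections.deque as the queue; the adjacency-list format (a dict mapping each node to a list of neighbours) is an assumption for illustration:

from collections import deque

def bfs(graph, start):
    """Breadth-first traversal; returns nodes in the order visited.
    `graph` is assumed to be a dict: node -> list of neighbours."""
    visited = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()          # dequeue from the front (FIFO)
        order.append(node)              # "process" the node
        for neighbor in graph.get(node, []):
            if neighbor not in visited: # enqueue each unseen neighbour once
                visited.add(neighbor)
                queue.append(neighbor)
    return order

# Example: bfs({'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}, 'A')
# -> ['A', 'B', 'C', 'D']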

DFS Algo

Algorithm (Iterative with Stack):

1. Initialization:

o Create a stack and push the start_node onto it.

o Create a visited set (or array) and add start_node to it.

2. Traversal:

o While the stack is not empty:

▪ Pop a current_node from the top of the stack.

▪ Process the current_node (e.g., print it, check if it's the target).

▪ For each neighbor of current_node (it's often good to iterate in reverse order if you want
consistent output, or just in any order):

▪ If neighbor has not been visited:

▪ Add neighbor to the visited set.

▪ Push neighbor onto the stack.

Pseudocode (Iterative):
DFS_Iterative(graph, start_node):
    stack = Stack()
    stack.push(start_node)
    visited = Set()
    visited.add(start_node)

    while not stack.is_empty():
        current_node = stack.pop()
        print(current_node)   // Or perform other processing

        // Explore neighbors. Order might vary depending on implementation.
        // Pushing in reverse order of the desired visit sequence gives consistent output.
        for neighbor in reversed(graph.get_neighbors(current_node)):
            if neighbor not in visited:
                visited.add(neighbor)
                stack.push(neighbor)
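A runnable Python sketch of the iterative DFS above, again assuming the dict-of-lists adjacency format used in the BFS example:

def dfs_iterative(graph, start):
    """Depth-first traversal using an explicit stack; returns visit order.
    `graph` is assumed to be a dict: node -> list of neighbours."""
    visited = {start}
    stack = [start]
    order = []
    while stack:
        node = stack.pop()              # pop from the top (LIFO)
        order.append(node)
        # Push in reverse so neighbours are visited in their listed order.
        for neighbor in reversed(graph.get(node, [])):
            if neighbor not in visited:
                visited.add(neighbor)
                stack.append(neighbor)
    return order

# Example: dfs_iterative({'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}, 'A')
# -> ['A', 'B', 'D', 'C']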

Dijkstra's Algorithm: Explained

Dijkstra's algorithm, conceived by Edsger W. Dijkstra in 1956 and published in 1959, is a single-source shortest path
algorithm. This means it finds the shortest paths from a single starting node (the "source") to all other reachable
nodes in a graph.

Key Idea: The algorithm works by iteratively building a set of nodes for which it has determined the shortest path
from the source. It starts at the source node and then "explores" its neighbors, continually selecting the unvisited
node with the smallest known distance from the source.

Analogy: Imagine you're trying to find the quickest route from your house to all your friends' houses. You start at
your house (source). You know the time it takes to get to your immediate neighbors. You pick the closest neighbor,
mark their house as "visited" (meaning you've found the quickest way there), and then consider how that neighbor's
house connects you to their unvisited neighbors, potentially finding a quicker route to those unvisited neighbors than
you previously knew. You repeat this process, always picking the unvisited house that's currently "closest" to your
starting point, until all houses are visited or unreachable.

Prerequisites/Limitations:

• Non-negative Edge Weights: Dijkstra's algorithm requires all edge weights to be non-negative (i.e., 0 or
positive). If your graph has negative edge weights, Dijkstra's will not work correctly. For graphs with negative
edge weights, you'd typically use algorithms like Bellman-Ford.

• Directed or Undirected Graphs: It works for both. For undirected graphs, you can treat each undirected edge
as two directed edges (one in each direction) with the same weight.

Algorithm (with Pseudocode)

We need three main data structures:

1. distances dictionary/map/array: Stores the shortest known distance from the source node to every other
node. Initially, all distances are ∞ (infinity) except for the source node, which is 0.
2. priority_queue (min-heap): Stores (distance, node) pairs. It's crucial for efficiently selecting the unvisited
node with the smallest current distance.

3. visited set/array: Keeps track of nodes for which the shortest path from the source has already been
finalized.

Dijkstra's Algorithm Steps:

1. Initialization:

o Create a distances dictionary/map. For every node v in the graph, set distances[v] = infinity.

o Set distances[source_node] = 0.

o Create a priority_queue (min-heap). Add (0, source_node) to it.

o Create an empty visited set. (Alternatively, you can use a boolean array if node IDs are integers).

2. Main Loop:

o While the priority_queue is not empty:

▪ Extract the node u with the smallest current_distance from the priority_queue.

▪ Check for visited: If u is already in visited, continue to the next iteration (this is important
because a node might be added to the priority queue multiple times with different distances,
but we only care about the first time it's extracted, which is guaranteed to be its shortest
path).

▪ Add u to visited.

▪ Relaxation: For each neighbor_v of u:

▪ Calculate the new_distance = current_distance + weight(u, neighbor_v).

▪ If new_distance < distances[neighbor_v]:

▪ Update distances[neighbor_v] = new_distance.

▪ Add (new_distance, neighbor_v) to the priority_queue.

3. Result:

o Once the priority_queue is empty, the distances dictionary will contain the shortest path distances
from the source_node to all other reachable nodes.

Pseudocode:

Dijkstra(Graph, source_node):
    // Initialize distances: all nodes start at infinite distance, except the source
    distances = {}
    for each node v in Graph.nodes:
        distances[v] = infinity
    distances[source_node] = 0

    // Priority queue: stores (distance, node) pairs, ordered by distance.
    // A min-heap implementation is typical here.
    priority_queue = MinPriorityQueue()
    priority_queue.add((0, source_node))

    // Visited set: tracks nodes whose shortest path is finalized
    visited = Set()

    while not priority_queue.is_empty():
        // Get the node with the smallest distance from the priority queue
        current_distance, current_node = priority_queue.extract_min()

        // If we've already finalized the shortest path to this node, skip it
        if current_node in visited:
            continue

        // Mark the current node as visited (its shortest path is now known)
        visited.add(current_node)

        // Explore neighbors of the current node
        for each neighbor, weight in Graph.get_neighbors(current_node):
            // Calculate the new distance to neighbor through current_node
            new_distance = current_distance + weight

            // If this new path is shorter than the previously known shortest path to neighbor
            if new_distance < distances[neighbor]:
                distances[neighbor] = new_distance
                priority_queue.add((new_distance, neighbor))   // Add or update in priority queue

    return distances
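A compact runnable Python version of the pseudocode, using heapq as the min-priority queue; the graph format (a dict mapping each node to a list of (neighbour, weight) pairs) is an assumption for illustration:

import heapq

def dijkstra(graph, source):
    """Single-source shortest paths with non-negative weights.
    `graph` is assumed to be a dict: node -> list of (neighbour, weight) pairs."""
    distances = {node: float('inf') for node in graph}
    distances[source] = 0
    pq = [(0, source)]                      # (distance, node) pairs
    visited = set()
    while pq:
        dist_u, u = heapq.heappop(pq)       # node with smallest known distance
        if u in visited:
            continue                        # stale entry; u was already finalized
        visited.add(u)
        for v, w in graph[u]:
            new_dist = dist_u + w
            if new_dist < distances[v]:     # relaxation step
                distances[v] = new_dist
                heapq.heappush(pq, (new_dist, v))
    return distances

# The worked example from the text:
graph = {'A': [('B', 5), ('C', 3)], 'B': [('D', 2)],
         'C': [('D', 4), ('E', 8)], 'D': [('E', 1)], 'E': []}
# dijkstra(graph, 'A') -> {'A': 0, 'B': 5, 'C': 3, 'D': 7, 'E': 8}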

Example Walkthrough

Let's consider a simple directed graph:


Nodes: A, B, C, D, E

Edges:

A --5--> B

A --3--> C

B --2--> D

C --4--> D

C --8--> E

D --1--> E

Source node: A

Initialization:

• distances = {A: 0, B: ∞, C: ∞, D: ∞, E: ∞}

• priority_queue = [(0, A)]

• visited = {}

Iteration 1:

1. Extract (0, A) from priority_queue.

2. current_node = A, current_distance = 0.

3. Add A to visited. visited = {A}.

4. Neighbors of A:

o B (weight 5): new_distance = 0 + 5 = 5. Since 5 < ∞, update distances[B] = 5. Add (5, B) to priority_queue.

o C (weight 3): new_distance = 0 + 3 = 3. Since 3 < ∞, update distances[C] = 3. Add (3, C) to priority_queue.

• distances = {A: 0, B: 5, C: 3, D: ∞, E: ∞}

• priority_queue = [(3, C), (5, B)] (order might vary based on PQ implementation, but (3, C) will be extracted
next)

Iteration 2:

1. Extract (3, C) from priority_queue.

2. current_node = C, current_distance = 3.

3. Add C to visited. visited = {A, C}.

4. Neighbors of C:

o D (weight 4): new_distance = 3 + 4 = 7. Since 7 < ∞, update distances[D] = 7. Add (7, D) to priority_queue.

o E (weight 8): new_distance = 3 + 8 = 11. Since 11 < ∞, update distances[E] = 11. Add (11, E) to priority_queue.

• distances = {A: 0, B: 5, C: 3, D: 7, E: 11}

• priority_queue = [(5, B), (7, D), (11, E)]


Iteration 3:

1. Extract (5, B) from priority_queue.

2. current_node = B, current_distance = 5.

3. Add B to visited. visited = {A, C, B}.

4. Neighbors of B:

o D (weight 2): new_distance = 5 + 2 = 7. Current distances[D] is 7. 7 < 7 is false. No update.

• distances = {A: 0, B: 5, C: 3, D: 7, E: 11}

• priority_queue = [(7, D), (11, E)]

Iteration 4:

1. Extract (7, D) from priority_queue.

2. current_node = D, current_distance = 7.

3. Add D to visited. visited = {A, C, B, D}.

4. Neighbors of D:

o E (weight 1): new_distance = 7 + 1 = 8. Current distances[E] is 11. Since 8 < 11, update distances[E] =
8. Add (8, E) to priority_queue.

• distances = {A: 0, B: 5, C: 3, D: 7, E: 8}

• priority_queue = [(8, E), (11, E)] (Note: (11, E) is still in PQ but will be ignored when extracted because E is
visited with a shorter path)

Iteration 5:

1. Extract (8, E) from priority_queue.

2. current_node = E, current_distance = 8.

3. Add E to visited. visited = {A, C, B, D, E}.

4. Neighbors of E: None.

• distances = {A: 0, B: 5, C: 3, D: 7, E: 8}

• priority_queue = [(11, E)]

Iteration 6:

1. Extract (11, E) from priority_queue.

2. current_node = E.

3. E is already in visited. Skip.

• priority_queue is now empty.

Final distances:

• A: 0

• B: 5 (A -> B)

• C: 3 (A -> C)

• D: 7 (A -> C -> D)
• E: 8 (A -> C -> D -> E)

Time Complexity

The time complexity of Dijkstra's algorithm depends on the implementation of the priority queue:

• Using a Min-Priority Queue (Binary Heap): This is the most common and efficient implementation.

o Each edge relaxation (potential update and addition to the PQ) takes O(log V) time (where V is the number of vertices).

o There are E edges (so E relaxations) and V extractions from the PQ (each taking O(log V)).

o Total Time Complexity: O((V + E) log V)

o In a sparse graph (E ≈ V), this is roughly O(V log V).

o In a dense graph (E ≈ V²), this is roughly O(V² log V).

• Using a Fibonacci Heap (more complex; a theoretical improvement for dense graphs):

o Total Time Complexity: O(E + V log V)

• Using a simple array/list (no priority queue): This is highly inefficient.

o Finding the minimum takes O(V) in each iteration.

o Total Time Complexity: O(V²)

Applications of Dijkstra's Algorithm

Dijkstra's algorithm has numerous real-world applications:

• GPS Navigation Systems: Finding the shortest route between two locations on a map (where intersections
are nodes and roads are edges with travel times/distances as weights).

• Network Routing Protocols: Used by routers to find the best path for data packets to travel across a network
(e.g., OSPF).

• Finding shortest flight paths: In airline route planning.

• Resource allocation: Optimizing resource distribution in various systems.

• Solving mazes: Finding the shortest path through a maze.

• Minimizing costs: In various operational research problems.

Dijkstra's algorithm is a cornerstone of graph theory and an essential tool for anyone working with interconnected
data.

Floyd-Warshall Algorithm: Explained

The Floyd-Warshall algorithm is a powerful and elegant dynamic programming algorithm used to find the shortest
path between all pairs of vertices in a weighted graph. It can handle graphs with both positive and negative edge
weights, but it cannot handle graphs with negative cycles (as a negative cycle implies that the shortest path is
undefined, as you can keep traversing the cycle to get infinitely smaller path lengths).

Core Idea (Dynamic Programming Intuition):

Imagine you want to find the shortest path from vertex i to vertex j. The path can either be a direct edge from i to j,
or it can go through some intermediate vertices.

Floyd-Warshall iteratively improves its estimates of shortest paths by considering more and more intermediate
vertices. It uses a triple nested loop structure:
• The outermost loop (k) iterates through every possible vertex in the graph. In each iteration, k is considered
as a potential intermediate vertex on the path between any two other vertices i and j.

• The two inner loops (i and j) iterate through all possible pairs of source and destination vertices.

For each pair (i, j), the algorithm asks: "Is the current shortest path from i to j shorter, or is it shorter to go from i to k
and then from k to j?"

This can be expressed by the recurrence relation:

D[k][i][j]=min(D[k−1][i][j],D[k−1][i][k]+D[k−1][k][j])

Where D[k][i][j] represents the shortest path from i to j using only intermediate vertices from the set {0, 1, ..., k}.
However, in practice, we usually optimize the space and use a 2D matrix dist[i][j] that gets updated in place.

Algorithm Steps (Using a 2D Distance Matrix)

1. Initialization:

o Create a square matrix dist of size V×V (where V is the number of vertices).

o Populate dist with the initial edge weights:

▪ If there's a direct edge from vertex i to vertex j, dist[i][j] = weight(i, j).

▪ If i is the same as j, dist[i][j] = 0 (distance from a node to itself is zero).

▪ If there's no direct edge from i to j, dist[i][j] = infinity (or a very large number, representing
unreachable).

2. Main Iteration (Dynamic Programming):

o Iterate k from 0 to V-1 (inclusive):

▪ This k represents the "intermediate" vertex being considered.

▪ Iterate i from 0 to V-1 (inclusive):

▪ This i represents the "source" vertex.

▪ Iterate j from 0 to V-1 (inclusive):

▪ This j represents the "destination" vertex.

▪ Relaxation Step: Check if the path from i to j can be made shorter by going
through k.

▪ dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])

▪ Important Note on Infinity: When performing dist[i][k] + dist[k][j], you must be careful if dist[i][k] or dist[k][j] is infinity; infinity + x should still be infinity. In implementations, use a very large number for infinity and add checks to prevent overflow or incorrect results when adding to infinity, for example: if dist[i][k] != infinity and dist[k][j] != infinity: ...

3. Negative Cycle Detection (Optional but Recommended):

o After the main loops complete, iterate through the diagonal of the dist matrix:
▪ If dist[i][i] is ever less than 0 for any i, it indicates the presence of a negative cycle reachable
from i. In such a scenario, shortest paths are not well-defined for nodes involved in or
reachable from that cycle.

4. Result:

o The dist matrix now contains the shortest path distances between all pairs of vertices. dist[i][j] holds
the shortest distance from vertex i to vertex j.

Pseudocode

FloydWarshall(graph_adj_matrix):
    V = number of vertices in graph

    // Initialize dist matrix with direct edge weights.
    // dist[i][j] will store the shortest distance from i to j.
    dist = create V x V matrix
    for i from 0 to V-1:
        for j from 0 to V-1:
            if i == j:
                dist[i][j] = 0
            else if graph_adj_matrix[i][j] exists (direct edge):
                dist[i][j] = graph_adj_matrix[i][j]   // Use the weight of the direct edge
            else:
                dist[i][j] = INFINITY                 // A very large number

    // Main loops: k as intermediate vertex
    for k from 0 to V-1:
        for i from 0 to V-1:
            for j from 0 to V-1:
                // If dist[i][k] or dist[k][j] is INFINITY, adding them might cause overflow
                // or lead to incorrect results. Ensure both parts of the path are reachable.
                if dist[i][k] != INFINITY and dist[k][j] != INFINITY:
                    dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])

    // Optional: detect negative cycles.
    // If any dist[i][i] is negative, a negative cycle exists.
    for i from 0 to V-1:
        if dist[i][i] < 0:
            print "Graph contains a negative cycle accessible from vertex", i
            // You might choose to raise an error, return a special value,
            // or indicate that paths involving this cycle are undefined.

    return dist
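A runnable Python sketch of the same procedure; the input format (a V x V matrix with None for "no direct edge") is an assumption for illustration. Since Python's float('inf') adds safely, the explicit infinity check from the pseudocode is not needed here:

INF = float('inf')

def floyd_warshall(adj_matrix):
    """All-pairs shortest paths. `adj_matrix[i][j]` is the weight of edge i -> j,
    or None if there is no direct edge (assumed format)."""
    V = len(adj_matrix)
    # Initialization: 0 on the diagonal, edge weight or infinity elsewhere.
    dist = [[0 if i == j else (adj_matrix[i][j] if adj_matrix[i][j] is not None else INF)
             for j in range(V)] for i in range(V)]

    for k in range(V):                      # k = intermediate vertex
        for i in range(V):
            for j in range(V):
                if dist[i][k] + dist[k][j] < dist[i][j]:   # relaxation via k
                    dist[i][j] = dist[i][k] + dist[k][j]

    # Optional negative-cycle check.
    for i in range(V):
        if dist[i][i] < 0:
            raise ValueError(f"negative cycle reachable from vertex {i}")
    return dist

# The worked example from the text (vertices 0..3):
N = None
m = [[N, 3, N, 8], [N, N, 4, N], [N, -5, N, N], [7, N, 2, N]]
# floyd_warshall(m)[3][1] -> -3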

Example Walkthrough

Consider the following graph:

Nodes: 0, 1, 2, 3

Edges:

0 --3--> 1

0 --8--> 3

1 --4--> 2

3 --2--> 2

3 --7--> 0

2 --(-5)--> 1 (Negative weight!)

Initial dist matrix:

(Using a large number like 999 for INFINITY)

0 1 2 3

0 [0, 3, 999, 8]

1 [999, 0, 4, 999]

2 [999, -5, 0, 999]

3 [7, 999, 2, 0]

Iteration k = 0 (considering vertex 0 as intermediate):

• dist[i][j] = min(dist[i][j], dist[i][0] + dist[0][j])

• Example: dist[3][1]

o Current dist[3][1] is 999.

o Path 3 -> 0 -> 1: dist[3][0] + dist[0][1] = 7 + 3 = 10.

o dist[3][1] becomes min(999, 10) = 10.

• ... (apply this to all i, j pairs)

New dist matrix after k=0:

0 1 2 3
0 [0, 3, 999, 8]

1 [999, 0, 4, 999]

2 [999, -5, 0, 999]

3 [7, 10, 2, 0] // Note 3->1 is now 10 (3->0->1)

Iteration k = 1 (considering vertex 1 as intermediate):

• dist[i][j] = min(dist[i][j], dist[i][1] + dist[1][j])

• Example: dist[0][2]

o Current dist[0][2] is 999.

o Path 0 -> 1 -> 2: dist[0][1] + dist[1][2] = 3 + 4 = 7.

o dist[0][2] becomes min(999, 7) = 7.

• Example: dist[2][0]

o Current dist[2][0] is 999.

o Path 2 -> 1 -> 0: dist[2][1] + dist[1][0] is -5 + 999 = 994. Still 999. No update.

• ...

New dist matrix after k=1:

0 1 2 3

0 [0, 3, 7, 8] // 0->2 now 7 (0->1->2)

1 [999, 0, 4, 999]

2 [999, -5, 0, 999]

3 [7, 10, 2, 0]

Iteration k = 2 (considering vertex 2 as intermediate):

• dist[i][j] = min(dist[i][j], dist[i][2] + dist[2][j])

• Example: dist[0][1]

o Current dist[0][1] is 3.

o Path 0 -> 2 -> 1: dist[0][2] + dist[2][1] = 7 + (-5) = 2.

o dist[0][1] becomes min(3, 2) = 2. (Path 0 -> 1 was 3, now 0 -> 2 -> 1 is 2)

New dist matrix after k=2:

0 1 2 3

0 [0, 2, 7, 8] // 0->1 now 2 (0->2->1)

1 [999, 0, 4, 999]

2 [999, -5, 0, 999]

3 [7, -3, 2, 0] // 3->1 now -3 (3->2->1)

Iteration k = 3 (considering vertex 3 as intermediate):

• dist[i][j] = min(dist[i][j], dist[i][3] + dist[3][j])


• Example: dist[0][0]

o Current dist[0][0] is 0.

o Path 0 -> 3 -> 0: dist[0][3] + dist[3][0] = 8 + 7 = 15.

o dist[0][0] remains min(0, 15) = 0.

• Example: dist[1][0]

o Current dist[1][0] is 999.

o Path 1 -> 3 -> 0: dist[1][3] + dist[3][0] = 999 + 7 = 1006. (Still 999).

• Example: dist[1][1]

o Current dist[1][1] is 0.

o Path 1 -> 3 -> 1: dist[1][3] + dist[3][1] = 999 + (-3) = 996. (Still 0).

New dist matrix after k=3 (Final dist):

      0    1   2    3

0 [   0,   2,  7,   8]

1 [ 999,   0,  4, 999] // 1 can only reach 2 (and itself), so 1->0 and 1->3 remain unreachable

2 [ 999,  -5,  0, 999] // 2 can only reach 1 (and itself), so 2->0 and 2->3 remain unreachable

3 [   7,  -3,  2,   0]

Final Check for Negative Cycles: Look at the diagonal elements: dist[0][0]=0, dist[1][1]=0, dist[2][2]=0, dist[3][3]=0.
None are negative, so no negative cycles were detected.

The final dist matrix gives the shortest path between every pair of vertices. For instance, the shortest path from
vertex 3 to vertex 1 is -3.

Time and Space Complexity

• Time Complexity: O(V³), where V is the number of vertices. This is because there are three nested loops, each iterating V times.

• Space Complexity: O(V²) to store the dist matrix.

Advantages and Disadvantages

Advantages:

• Simplicity: Conceptually straightforward and relatively easy to implement.

• Handles Negative Weights: Works correctly with negative edge weights, as long as there are no negative
cycles.

• All-Pairs: Directly solves the problem of finding shortest paths between all pairs of vertices.

• Compact Code: The implementation is typically very concise.

Disadvantages:

• Inefficient for Sparse Graphs: For sparse graphs (E ≪ V²), O(V³) can be less efficient than running Johnson's algorithm (O(VE + V² log V)).

• Cannot Handle Negative Cycles (gracefully): While it can detect them, it doesn't provide meaningful shortest paths if negative cycles exist. The distances involving such cycles would tend towards negative infinity.

• Higher Space Complexity for Very Large Graphs: O(V²) might be too much for extremely large graphs where V is in the millions.

Floyd-Warshall is particularly well-suited for dense graphs, or situations where V is small enough (e.g., V ≤ 500) that V³ is acceptable and you need all-pairs shortest paths.

Transitive Closure of a Graph

The transitive closure of a directed graph G=(V,E) is another directed graph G∗=(V,E∗), where E∗ contains an edge
(u,v) if and only if there is a path from vertex u to vertex v in the original graph G.

In simpler terms: If you can reach v from u by following one or more edges in the original graph, then there's a direct
edge from u to v in the transitive closure.

Analogy: Imagine a social network where an edge (A,B) means "A follows B." The transitive closure would tell you
who can eventually see whose posts. If A follows B, and B follows C, then A can "reach" C's posts indirectly. In the
transitive closure, there would be a direct edge (A, C).

Key Idea: The transitive closure essentially shows all reachable pairs in a graph.

How to Compute Transitive Closure

The most common and straightforward way to compute the transitive closure is by adapting the Floyd-Warshall
algorithm.

Using Modified Floyd-Warshall:

The standard Floyd-Warshall finds shortest paths. We can modify it to find reachability:

1. Initialization:

o Create a boolean matrix reach[i][j] of size V×V.

o Initialize reach[i][j] to true if there is a direct edge from i to j in the original graph, or if i == j (a node
can always reach itself).

o Initialize reach[i][j] to false otherwise.

2. Main Iteration (Dynamic Programming):

o Iterate k from 0 to V-1 (inclusive):

▪ This k represents the "intermediate" vertex.

▪ Iterate i from 0 to V-1 (inclusive):

▪ This i represents the "source" vertex.

▪ Iterate j from 0 to V-1 (inclusive):

▪ This j represents the "destination" vertex.

▪ Update Rule: reach[i][j] = reach[i][j] OR (reach[i][k] AND reach[k][j])

▪ Explanation:

▪ reach[i][j] (left side of OR): This represents the possibility that i can
already reach j without passing through k.
▪ (reach[i][k] AND reach[k][j]): This represents the possibility that i can
reach k, AND k can reach j. If both are true, then i can reach j by
going through k.

▪ By taking the OR, we are saying i can reach j if it either already could
or if it can now by passing through k.

3. Result:

o After the loops complete, reach[i][j] will be true if there is a path from i to j in the original graph, and
false otherwise. This reach matrix represents the adjacency matrix of the transitive closure graph G∗.

Pseudocode (Floyd-Warshall based)

TransitiveClosure(graph_adj_matrix):
    V = number of vertices in graph

    // Initialize reachability matrix.
    // reach[i][j] will be true if there is a path from i to j.
    reach = create V x V boolean matrix
    for i from 0 to V-1:
        for j from 0 to V-1:
            if i == j:
                reach[i][j] = true    // A node can always reach itself
            else if graph_adj_matrix[i][j] exists (direct edge):
                reach[i][j] = true
            else:
                reach[i][j] = false

    // Main loops: k as intermediate vertex
    for k from 0 to V-1:
        for i from 0 to V-1:
            for j from 0 to V-1:
                // If i can reach k AND k can reach j, then i can reach j
                reach[i][j] = reach[i][j] OR (reach[i][k] AND reach[k][j])

    return reach
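A runnable Python version of the modified Floyd-Warshall above; the adjacency-matrix input (1/0 or True/False entries) is an assumed format:

def transitive_closure(adj_matrix):
    """Boolean reachability matrix. `adj_matrix[i][j]` is truthy if there is a
    direct edge i -> j (assumed format)."""
    V = len(adj_matrix)
    # A node reaches itself; otherwise start from the direct edges.
    reach = [[i == j or bool(adj_matrix[i][j]) for j in range(V)] for i in range(V)]

    for k in range(V):                  # k = intermediate vertex
        for i in range(V):
            for j in range(V):
                reach[i][j] = reach[i][j] or (reach[i][k] and reach[k][j])
    return reach

# The worked example from the text: edges 0->1, 1->2, 2->0, 0->3
m = [[0, 1, 0, 1], [0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
# transitive_closure(m)[0][2] -> True ; transitive_closure(m)[3][0] -> False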

Example Walkthrough
Consider a simple directed graph:

Edges:

0 -> 1

1 -> 2

2 -> 0

0 -> 3

Initial reach matrix:

0 1 2 3

0 [T, T, F, T] (0->0 is T, 0->1 is T, 0->2 is F, 0->3 is T)

1 [F, T, T, F]

2 [T, F, T, F]

3 [F, F, F, T]

Iteration k = 0 (considering vertex 0 as intermediate):

• reach[i][j] = reach[i][j] OR (reach[i][0] AND reach[0][j])

Let's check reach[1][3] (initially F):

• reach[1][0] is F.

• reach[1][3] remains F because reach[1][0] AND reach[0][3] is F AND T which is F.

Let's check reach[2][1] (initially F):

• reach[2][0] is T. reach[0][1] is T.

• reach[2][1] becomes F OR (T AND T) which is T. (Path: 2 -> 0 -> 1)

New reach after k=0:

0 1 2 3

0 [T, T, F, T]

1 [F, T, T, F]

2 [T, T, T, T] (2->1 now T, 2->3 now T (2->0->3))

3 [F, F, F, T]

Iteration k = 1 (considering vertex 1 as intermediate):

• reach[i][j] = reach[i][j] OR (reach[i][1] AND reach[1][j])

Let's check reach[0][2] (initially F):

• reach[0][1] is T. reach[1][2] is T.

• reach[0][2] becomes F OR (T AND T) which is T. (Path: 0 -> 1 -> 2)

New reach after k=1:

0 1 2 3

0 [T, T, T, T] // 0->2 now T


1 [F, T, T, F]

2 [T, T, T, T]

3 [F, F, F, T]

Iteration k = 2 (considering vertex 2 as intermediate):

• reach[i][j] = reach[i][j] OR (reach[i][2] AND reach[2][j])

Let's check reach[1][0] (initially F):

• reach[1][2] is T. reach[2][0] is T.

• reach[1][0] becomes F OR (T AND T) which is T. (Path: 1 -> 2 -> 0)

New reach after k=2:

0 1 2 3

0 [T, T, T, T]

1 [T, T, T, T] // 1->0 now T, also 1->3 now T (1->2->0->3)

2 [T, T, T, T]

3 [F, F, F, T]

Iteration k = 3 (considering vertex 3 as intermediate):

• reach[i][j] = reach[i][j] OR (reach[i][3] AND reach[3][j])

No changes occur in this iteration: vertex 3 has no outgoing edges to other vertices, so routing through it cannot make any new vertex reachable that wasn't already reachable.

Final reach matrix (Transitive Closure):

0 1 2 3

0 [T, T, T, T]

1 [T, T, T, T]

2 [T, T, T, T]

3 [F, F, F, T]

This matrix tells us, for example:

• reach[0][2] is T (There's a path 0 -> 1 -> 2).

• reach[3][0] is F (There's no path from 3 to 0).

Time and Space Complexity

• Time Complexity: O(V³), where V is the number of vertices. This is due to the three nested loops.

• Space Complexity: O(V²) to store the reach matrix.

Applications of Transitive Closure

• Reachability: Directly tells you which nodes are reachable from other nodes.
• Strongly Connected Components (SCCs): Can be used as a step in algorithms to find SCCs.

• Database Queries: In relational databases, to find relationships that are not directly stored but can be
inferred (e.g., finding all employees supervised directly or indirectly by a manager).

• Task Scheduling/Dependencies: Determining if one task depends on another, directly or indirectly.

• Type Hierarchy Analysis: In object-oriented programming, if class A is a subclass of B, and B is a subclass of C, then A is also a subclass of C. Transitive closure can find all such relationships.

Minimum Spanning Tree (MST)

First, what is an MST?

Given a connected, undirected, and weighted graph, a spanning tree is a subgraph that includes all the vertices of
the original graph and forms a tree (i.e., it's connected and has no cycles).

A Minimum Spanning Tree (MST) is a spanning tree whose sum of edge weights is as small as possible. In other
words, it's the "cheapest" way to connect all the vertices in the graph without forming any cycles.

Key Properties of an MST:

• It connects all vertices using the minimum possible total edge weight.

• It contains exactly V−1 edges, where V is the number of vertices.

• It has no cycles.

Real-world applications:

• Network design: Laying out telecommunications, electrical, or transportation networks with minimal
cabling/cost.

• Cluster analysis: Grouping similar data points.

• Image segmentation: Identifying regions in an image.

• Circuit board design: Minimizing wire length.

Prim's Algorithm

Prim's algorithm builds an MST by starting from an arbitrary vertex and iteratively adding the cheapest edge that
connects a vertex in the growing MST to a vertex outside the MST. It's a "greedy" algorithm because it always makes
the locally optimal choice.

Analogy: Imagine you're building a network. You pick one building (vertex) to start. Then, you look at all the available
connections from your current network to buildings not yet connected. You pick the cheapest connection, add that
building to your network, and repeat.

Key Data Structure: Priority Queue (Min-Heap)

Algorithm Steps:

1. Initialization:

o Choose an arbitrary start_vertex from the graph.

o Maintain a min_cost array/dictionary where min_cost[v] stores the minimum weight of an edge
connecting v to a vertex already in the MST. Initialize min_cost[start_vertex] = 0 and min_cost[v] =
infinity for all other vertices.

o Maintain a parent array/dictionary to reconstruct the MST.


o Create a priority_queue (min-heap) and add (0, start_vertex) to it. The elements in the priority queue
are (weight, vertex).

o Create an empty mst_edges list to store the edges of the MST.

o Create a visited set to keep track of vertices already included in the MST.

2. Main Loop:

o While the priority_queue is not empty:

▪ Extract the (weight, u) pair with the minimum weight from the priority_queue.

▪ If u is already in visited, continue (this handles cases where a vertex is added to PQ multiple
times with different costs, we only process the cheapest path).

▪ Add u to visited.

▪ If u is not the start_vertex (i.e., it has a parent), add the edge (parent[u], u) to mst_edges.

▪ Explore Neighbors: For each neighbor_v of u and the edge_weight between u and
neighbor_v:

▪ If neighbor_v is not in visited AND edge_weight < min_cost[neighbor_v]:

▪ Update min_cost[neighbor_v] = edge_weight.

▪ Set parent[neighbor_v] = u.

▪ Add (edge_weight, neighbor_v) to the priority_queue.

3. Result:

o Once the priority_queue is empty (or V-1 edges are found), mst_edges will contain the edges of the
Minimum Spanning Tree.

Pseudocode:

Prim(Graph, start_vertex):
    min_cost = {}   // min_cost[v] = min weight to connect v to the MST
    parent = {}     // parent[v] = vertex that connects v to the MST
    for each vertex v in Graph.vertices:
        min_cost[v] = infinity
        parent[v] = null
    min_cost[start_vertex] = 0

    priority_queue = MinPriorityQueue()
    priority_queue.add((0, start_vertex))   // (cost, vertex)

    visited = Set()
    mst_edges = []
    total_mst_weight = 0

    while not priority_queue.is_empty() and len(visited) < Graph.num_vertices:
        current_cost, u = priority_queue.extract_min()

        if u in visited:
            continue
        visited.add(u)
        total_mst_weight += current_cost

        if parent[u] is not null:   // If it's not the start_vertex
            mst_edges.add((parent[u], u, current_cost))

        for each neighbor_v, edge_weight in Graph.get_neighbors(u):
            if neighbor_v not in visited and edge_weight < min_cost[neighbor_v]:
                min_cost[neighbor_v] = edge_weight
                parent[neighbor_v] = u
                priority_queue.add((edge_weight, neighbor_v))

    return mst_edges, total_mst_weight
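A runnable Python sketch of Prim's algorithm with heapq; the adjacency-list format (dict of node -> list of (neighbour, weight)) and an undirected graph are assumptions for illustration. This is a lazy-deletion variant: instead of maintaining the min_cost array, it simply skips stale heap entries, which gives the same result:

import heapq

def prim(graph, start):
    """Returns (mst_edges, total_weight) for a connected undirected graph.
    `graph` is assumed to be a dict: node -> list of (neighbour, weight)."""
    visited = set()
    mst_edges, total_weight = [], 0
    pq = [(0, start, None)]                 # (edge weight, vertex, parent)
    while pq and len(visited) < len(graph):
        cost, u, parent = heapq.heappop(pq)
        if u in visited:
            continue                        # a cheaper connection was already used
        visited.add(u)
        total_weight += cost
        if parent is not None:
            mst_edges.append((parent, u, cost))
        for v, w in graph[u]:
            if v not in visited:
                heapq.heappush(pq, (w, v, u))
    return mst_edges, total_weight

# Example (undirected edges listed in both directions):
g = {'A': [('B', 1), ('C', 4)], 'B': [('A', 1), ('C', 2)], 'C': [('A', 4), ('B', 2)]}
# prim(g, 'A') -> ([('A', 'B', 1), ('B', 'C', 2)], 3)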

Time Complexity:

• Using a Binary Heap (most common): O(E log V) or O(E log E); since E ≤ V², log E is at most 2 log V, so O(E log V) is the tighter bound.

• Using a Fibonacci Heap (more complex, a theoretical improvement): O(E + V log V). This is asymptotically faster for very dense graphs.

Kruskal's Algorithm

Kruskal's algorithm also builds an MST greedily, but it does so by considering edges in increasing order of their
weights. It adds an edge to the MST if it connects two previously disconnected components.

Analogy: Imagine you have a bunch of islands (vertices) and you want to connect them all with bridges (edges) as
cheaply as possible. You list all possible bridges with their costs. You then start building the cheapest bridge first,
then the next cheapest, and so on, as long as building that bridge doesn't create a loop (i.e., it doesn't connect two
islands that are already connected).

Key Data Structure: Disjoint Set Union (DSU) / Union-Find

Algorithm Steps:

1. Initialization:
o Create a list of all edges in the graph, along with their weights.

o Sort all edges in non-decreasing order of their weights.

o Initialize a Disjoint Set Union (DSU) data structure, where each vertex is initially in its own separate
set (representing individual components).

o Create an empty mst_edges list to store the edges of the MST.

o num_edges_in_mst = 0.

2. Main Loop:

o Iterate through the sorted edges, one by one:

▪ Let the current edge be (u, v) with weight.

▪ Use the DSU find operation to determine the representative (root) of the sets containing u
and v.

▪ If find(u) is not equal to find(v) (meaning u and v are in different components and adding this
edge won't form a cycle):

▪ Add the edge (u, v) to mst_edges.

▪ Use the DSU union operation to merge the sets containing u and v.

▪ Increment num_edges_in_mst.

▪ If num_edges_in_mst reaches V-1 (where V is the number of vertices), you've found the MST,
and you can stop.

3. Result:

o mst_edges will contain the edges of the Minimum Spanning Tree.

Pseudocode:

Kruskal(Graph):
    edges = list of all edges in Graph, with their weights
    sort edges by weight in non-decreasing order

    dsu = DisjointSetUnion(Graph.num_vertices)   // Initialize each vertex in its own set

    mst_edges = []
    total_mst_weight = 0
    num_edges_in_mst = 0

    for each edge (u, v, weight) in sorted_edges:
        if num_edges_in_mst == Graph.num_vertices - 1:
            break   // MST found (V-1 edges)

        // Check if u and v are already in the same component
        if dsu.find(u) != dsu.find(v):
            dsu.union(u, v)
            mst_edges.add((u, v, weight))
            total_mst_weight += weight
            num_edges_in_mst += 1

    return mst_edges, total_mst_weight
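A runnable Python sketch, including a small union-find with path compression and union by size; the edge-list input format (tuples of integer vertex ids) is an assumption for illustration:

def kruskal(num_vertices, edges):
    """Returns (mst_edges, total_weight). `edges` is assumed to be a list of
    (u, v, weight) tuples with integer vertex ids 0..num_vertices-1."""
    parent = list(range(num_vertices))
    size = [1] * num_vertices

    def find(x):                      # find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):                  # union by size
        ra, rb = find(a), find(b)
        if size[ra] < size[rb]:
            ra, rb = rb, ra
        parent[rb] = ra
        size[ra] += size[rb]

    mst_edges, total = [], 0
    for u, v, w in sorted(edges, key=lambda e: e[2]):   # cheapest edges first
        if find(u) != find(v):        # adding this edge does not create a cycle
            union(u, v)
            mst_edges.append((u, v, w))
            total += w
            if len(mst_edges) == num_vertices - 1:
                break                 # V-1 edges: the MST is complete
    return mst_edges, total

# Example: kruskal(3, [(0, 1, 1), (1, 2, 2), (0, 2, 4)]) -> ([(0, 1, 1), (1, 2, 2)], 3)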

Time Complexity:

• Sorting edges: O(E log E)

• DSU operations:

o V make_set operations (implicitly in initialization).

o Up to 2E find operations and V−1 union operations.

o With path compression and union by rank/size, these operations are nearly constant time (amortized O(α(V)), where α is the inverse Ackermann function, which grows extremely slowly and is practically less than 5 for any realistic V).

• Total Time Complexity: dominated by sorting the edges: O(E log E), which is also O(E log V) since E ≤ V² implies log E ≤ 2 log V.

Prim's vs. Kruskal's: When to Use Which?

Feature: Prim's Algorithm vs Kruskal's Algorithm

Approach: Prim's grows a single component by adding the closest vertex; Kruskal's adds edges to connect separate components.

Data Structure: Prim's uses a min-priority queue (binary heap); Kruskal's uses Disjoint Set Union (DSU) / Union-Find.

Best for: Prim's suits dense graphs (E ≈ V²); Kruskal's suits sparse graphs (E ≈ V).

Edge Cases: Prim's starts from a specific vertex; Kruskal's treats edges globally.

Implementation: Prim's is more complex due to priority-queue management; Kruskal's is simpler thanks to the sorting + DSU logic.

Time Complexity: Prim's is O(E log V) with a binary heap; Kruskal's is O(E log E), dominated by sorting.

Typical Use: Prim's when you need to build the MST from a specific point; Kruskal's when edges can be easily sorted (e.g., given as a list).

In summary:

• Prim's is generally better for dense graphs: with an adjacency-matrix implementation it runs in O(V²), which beats Kruskal's O(E log E) when E is close to V².

• Kruskal's is generally better for sparse graphs, because its cost is dominated by sorting, and O(E log E) stays small when E is small relative to V².
Both algorithms are greedy and produce a correct Minimum Spanning Tree. Their choice often comes down to the
graph's characteristics (dense vs. sparse) and the preference for implementation complexity.

Topological Sorting

A topological sort (or topological ordering) of a directed graph is a linear ordering of its vertices such that for every
directed edge (u,v) from vertex u to vertex v, u comes before v in the ordering.

Key Requirements:

• Directed Graph: Topological sorting only applies to directed graphs (where edges have a direction, like u→v).

• Acyclic Graph (DAG): Crucially, the graph must not contain any directed cycles. If there's a cycle (e.g.,
A→B→C→A), it's impossible to create a linear ordering where every u comes before its v, because in a cycle,
A must come before B, B before C, and C before A – a contradiction.

Intuition/Analogy:

Think of it as ordering tasks with dependencies:

• Task A must be completed before Task B.

• Task B must be completed before Task C.

A topological sort would give you a valid sequence of tasks to perform, such as A, B, C. You can't start B before A, or C
before B.

Examples:

• Course Prerequisites: If "Calculus I" is a prerequisite for "Calculus II," then "Calculus I" must appear before
"Calculus II" in a topological sort of your course schedule.

• Build Systems: Compiling code where some files depend on others.

• Task Scheduling: Any set of tasks where some must be completed before others can begin.

• Dependency Resolution: In software packages, where installing one package requires others to be installed
first.

Important Notes:

• A DAG can have multiple valid topological orderings.

• If a graph is not a DAG (i.e., it contains a cycle), a topological sort is impossible.

Algorithms for Topological Sorting

There are two main algorithms for topological sorting, both with O(V+E) time complexity (where V is the number of
vertices and E is the number of edges):

1. Kahn's Algorithm (using BFS)

2. Depth-First Search (DFS) based Algorithm

1. Kahn's Algorithm (BFS-based)

This algorithm relies on the concept of in-degree. The in-degree of a vertex is the number of incoming edges to that
vertex.
Intuition: A vertex with an in-degree of 0 has no prerequisites; it can be the first task in the sequence. Once we
"complete" that task, we can remove it from consideration and decrement the in-degrees of its neighbors. Any
neighbor whose in-degree then becomes 0 can be the next task.

Algorithm Steps:

1. Calculate In-Degrees: For every vertex in the graph, compute its in-degree.

2. Initialize Queue: Create a queue and add all vertices with an in-degree of 0 to it. These are your starting
points.

3. Process Queue:

o While the queue is not empty:

▪ Dequeue a vertex u from the front of the queue.

▪ Add u to your topological sort result list.

▪ For each neighbor v of u (i.e., for every edge u -> v):

▪ Decrement the in-degree of v.

▪ If the in-degree of v becomes 0, enqueue v.

4. Check for Cycles: After the loop, if the number of vertices in your topological sort result list is less than the
total number of vertices in the graph, it means there was a cycle in the graph (and thus, a full topological sort
is not possible).

Pseudocode (Kahn's):

Kahn_TopologicalSort(Graph):
    in_degree = dictionary or array to store in-degrees of all vertices
    result = empty list to store the topological order
    queue = Queue()

    // 1. Calculate in-degrees for all vertices
    for each vertex v in Graph.vertices:
        in_degree[v] = 0
    for each vertex u in Graph.vertices:
        for each neighbor v of u:          // for each edge u -> v
            in_degree[v] = in_degree[v] + 1

    // 2. Initialize the queue with all vertices having in-degree 0
    for each vertex v in Graph.vertices:
        if in_degree[v] == 0:
            queue.enqueue(v)

    // 3. Process the queue
    count_visited_nodes = 0
    while not queue.is_empty():
        u = queue.dequeue()
        result.add(u)
        count_visited_nodes = count_visited_nodes + 1

        for each neighbor v of u:          // for each edge u -> v
            in_degree[v] = in_degree[v] - 1
            if in_degree[v] == 0:
                queue.enqueue(v)

    // 4. Check for cycles
    if count_visited_nodes != Graph.num_vertices:
        print "Graph contains a cycle! Topological sort not possible."
        return []   // Or raise an error
    else:
        return result
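A runnable Python version of Kahn's algorithm, assuming the dict-of-lists adjacency format used earlier:

from collections import deque

def kahn_topological_sort(graph):
    """Returns a topological order, or raises ValueError if the graph has a cycle.
    `graph` is assumed to be a dict: node -> list of successors."""
    in_degree = {v: 0 for v in graph}
    for u in graph:
        for v in graph[u]:
            in_degree[v] += 1

    queue = deque(v for v in graph if in_degree[v] == 0)   # vertices with no prerequisites
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in graph[u]:
            in_degree[v] -= 1
            if in_degree[v] == 0:          # all prerequisites of v are done
                queue.append(v)

    if len(order) != len(graph):
        raise ValueError("Graph contains a cycle; topological sort not possible")
    return order

# Example: kahn_topological_sort({'A': ['B'], 'B': ['C'], 'C': []}) -> ['A', 'B', 'C']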

2. DFS-based Algorithm

This method uses a recursive Depth-First Search traversal. The key idea is that in a DFS, you visit a node's children
(dependencies) before you finish processing the node itself. For a topological sort, we want the opposite: a node
should appear after all its dependencies. Therefore, we add a node to the sorted list after visiting all of its adjacent
nodes.

Intuition: Think about finishing a task. You can only truly "finish" a task once all its sub-tasks are done. If you're
traversing a dependency graph, when you've explored an entire branch of dependencies and are about to
"backtrack" from a node, it means all its prerequisites have been handled. At this point, you can safely add the node
to the front of your topological order (or to a stack and then reverse the stack).

Algorithm Steps:

1. Initialize:

o Create a visited set to keep track of visited vertices (to avoid redundant work and infinite loops in
case of cycles, though for a true DAG, cycles aren't an issue for correctness, just efficiency).

o Create a recursion_stack (or path_stack) to detect cycles during DFS (optional for DAGs, but good
practice).

o Create an empty topological_order list (this will store the result in reverse, so you'll need to reverse it
at the end, or use a stack and pop).

2. Iterate and DFS:


o For each vertex v in the graph (in any arbitrary order):

▪ If v has not been visited:

▪ Call a recursive helper function DFS_Visit(v, visited, recursion_stack, graph,


topological_order).

3. DFS_Visit(u, visited, recursion_stack, graph, topological_order) Function:

o Mark u as visited.

o Add u to recursion_stack (for cycle detection).

o For each neighbor v of u:

▪ If v is not visited:

▪ Recursively call DFS_Visit(v, visited, recursion_stack, graph, topological_order).

▪ Else if v is in recursion_stack:

▪ This indicates a cycle! A topological sort is not possible. (This check is crucial if you're
not absolutely sure the input graph is a DAG).

o Remove u from recursion_stack (finished exploring all paths from u).

o Prepend u to topological_order (or push onto a stack). This is the key step: u is added after all its
dependencies have been processed.

4. Result:

o The topological_order list (after potentially reversing it if you appended to it during DFS_Visit) will
contain the topological sort.

Pseudocode (DFS-based):

DFS_TopologicalSort(Graph):
    visited = Set()
    recursion_stack = Set()      // For cycle detection
    topological_order = List()   // Will be built in reverse order
    has_cycle = false

    for each vertex v in Graph.vertices:
        if v not in visited:
            // Call helper function. Pass has_cycle by reference or return a status.
            // For simplicity here, we assume it sets a shared flag or raises an exception.
            DFS_Visit(v, visited, recursion_stack, Graph, topological_order, has_cycle)

    if has_cycle:
        print "Graph contains a cycle! Topological sort not possible."
        return []

    // Reverse the list to get the correct topological order
    reverse(topological_order)
    return topological_order

DFS_Visit(u, visited, recursion_stack, Graph, topological_order, has_cycle):
    visited.add(u)
    recursion_stack.add(u)

    for each neighbor v of u:               // for each edge u -> v
        if v not in visited:
            DFS_Visit(v, visited, recursion_stack, Graph, topological_order, has_cycle)
            if has_cycle:                   // Propagate cycle detection back up
                return
        else if v in recursion_stack:       // Back-edge to an ancestor on the current DFS path
            has_cycle = true                // Set flag to indicate a cycle
            return

    recursion_stack.remove(u)               // Backtrack from u
    topological_order.add(u)                // Append u after all DFS calls for its children return

Both algorithms are efficient and widely used. Kahn's algorithm is often preferred when you need to detect cycles
explicitly and/or when you want a non-recursive solution. The DFS-based approach is often more concise to
implement if recursion depth isn't an issue.
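A runnable Python sketch of the DFS-based approach, with cycle detection via the recursion stack; the dict-of-lists graph format is assumed:

def dfs_topological_sort(graph):
    """Returns a topological order of a DAG, or raises ValueError on a cycle.
    `graph` is assumed to be a dict: node -> list of successors."""
    visited, on_stack = set(), set()
    order = []                                  # built in finishing order, reversed at the end

    def visit(u):
        visited.add(u)
        on_stack.add(u)
        for v in graph[u]:
            if v not in visited:
                visit(v)
            elif v in on_stack:                 # back-edge => cycle
                raise ValueError("Graph contains a cycle")
        on_stack.remove(u)
        order.append(u)                         # u finishes after all its successors

    for v in graph:
        if v not in visited:
            visit(v)
    return list(reversed(order))

# Example: dfs_topological_sort({'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []})
# -> one valid order, e.g. ['A', 'C', 'B', 'D']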

Network Flow Fundamentals

A flow network (or transportation network) is a directed graph G=(V,E) with the following characteristics:

1. Source (s): A distinguished vertex with only outgoing edges (no incoming edges). This is where the flow
originates.

2. Sink (t): A distinguished vertex with only incoming edges (no outgoing edges). This is where the flow
terminates.

3. Capacities (c(u, v)): Each edge (u,v)∈E has a non-negative capacity, c(u,v)≥0. This represents the maximum
amount of flow that can pass through that edge.

4. Flow (f(u, v)): For each edge (u,v), there is a flow f(u,v) such that:

o Capacity Constraint: 0≤f(u,v)≤c(u,v) (The flow on an edge cannot exceed its capacity).

o Skew Symmetry: f(u,v)=−f(v,u) (Flow from u to v is the negative of flow from v to u; often means you
don't allow flow in both directions, or if you do, net flow is conserved).
o Flow Conservation: For every vertex u ∈ V \ {s, t} (i.e., every vertex except the source and sink), the total incoming flow must equal the total outgoing flow: Σ_{(v,u)∈E} f(v,u) = Σ_{(u,w)∈E} f(u,w) (what flows into an intermediate node must flow out).

The value of a flow is the total flow leaving the source (which, by flow conservation, must also equal the total flow
entering the sink). The goal of many network flow problems is to find the maximum flow.

The Max-Flow Min-Cut Theorem

This is one of the most fundamental theorems in network flow, connecting the maximum flow in a network to the
minimum capacity of a cut.

• S-T Cut: An s-t cut (S,T) of a flow network G=(V,E) is a partition of the vertices V into two sets S and T=V∖S,
such that the source s∈S and the sink t∈T.

• Capacity of a Cut: The capacity of an s-t cut (S,T), denoted c(S,T), is the sum of the capacities of all edges that go from a vertex in S to a vertex in T: c(S,T) = Σ_{u∈S, v∈T, (u,v)∈E} c(u,v)

Max-Flow Min-Cut Theorem: The maximum value of an s-t flow is equal to the minimum capacity of an s-t cut in the
graph.

This theorem is incredibly powerful because it provides a duality: solving a maximization problem (max flow) is
equivalent to solving a minimization problem (min cut).

Key Algorithms for Maximum Flow

The most famous and foundational algorithm for finding the maximum flow is:

1. Ford-Fulkerson Method

The Ford-Fulkerson method is a framework that repeatedly finds an augmenting path from the source to the sink in
the residual graph and increases the flow along this path until no more augmenting paths can be found.

• Residual Graph (Gf): For a given flow f in G, the residual graph represents the "remaining capacity" in the
network.

o For an edge (u,v) with capacity c(u,v) and current flow f(u,v):

▪ Forward edge: If f(u,v) < c(u,v), there is a forward edge (u,v) in Gf with residual capacity cf(u,v) = c(u,v) − f(u,v). This allows us to push more flow from u to v.

▪ Backward edge: If f(u,v) > 0, there is a backward edge (v,u) in Gf with residual capacity cf(v,u) = f(u,v). This allows us to "undo" flow from u to v (effectively pushing flow from v to u), which can be necessary to find better paths.

• Augmenting Path: A path from s to t in the residual graph. The bottleneck capacity of this path is the
minimum residual capacity of any edge on the path. This is the maximum amount of flow we can "push"
along this path.

Ford-Fulkerson Algorithm Steps:

1. Initialize Flow: Set the flow f(u,v)=0 for all edges (u,v)∈E.

2. While loop: While there exists an augmenting path P from s to t in the current residual graph Gf:

o Find the bottleneck capacity (or residual capacity) δ of path P. This is the minimum cf(u,v) for all
edges (u,v) in P.

o Augment Flow: For each edge (u,v) in P:


▪ If (u,v) is a forward edge: f(u,v) = f(u,v) + δ.

▪ If (u,v) is a backward edge (so (v,u) is the original edge): f(v,u) = f(v,u) − δ, i.e., we cancel δ units of the flow previously sent on (v,u).

o Update the residual graph Gf based on the new flow.

3. Result: The total flow accumulated is the maximum flow.

Pseudocode (Ford-Fulkerson with DFS/BFS for augmenting path):

FordFulkerson(Graph G, Source s, Sink t):
    // Initialize flow to 0 on all edges
    for each edge (u, v) in G.E:
        flow[u][v] = 0
        flow[v][u] = 0   // For residual graph back edges

    max_flow = 0

    while true:
        // Find an augmenting path in the residual graph using DFS or BFS.
        // Returns (path, bottleneck_capacity), or (null, 0) if no path exists.
        path, bottleneck = FindAugmentingPath(G, flow, s, t)

        if path is null:   // No augmenting path found
            break

        // Augment flow along the path
        max_flow += bottleneck
        for i from 0 to len(path) - 2:
            u = path[i]
            v = path[i+1]
            if G.has_edge(u, v):           // (u,v) is an original (forward) edge
                flow[u][v] += bottleneck
                flow[v][u] -= bottleneck   // For skew symmetry and the residual back-edge
            else:                          // backward edge in the residual graph: (v,u) is the original edge
                flow[v][u] -= bottleneck
                flow[u][v] += bottleneck   // For skew symmetry and the residual forward-edge

    return max_flow

// FindAugmentingPath can be implemented with BFS (Edmonds-Karp) or DFS.
// BFS is generally preferred for guaranteed termination and polynomial time.
// BFS_FindAugmentingPath(G, current_flow, s, t):
//     ... (standard BFS to find a path, storing parent pointers to reconstruct it)
//     ... (calculate the bottleneck capacity along the found path)
//     return (path, bottleneck)

Complexity of Ford-Fulkerson: The complexity depends heavily on how FindAugmentingPath is implemented. In its
most general form (using arbitrary DFS), Ford-Fulkerson is not polynomial for graphs with irrational capacities.
However, for integer capacities, it is O(E⋅Fmax), where Fmax is the maximum flow value. This can be very slow if Fmax
is large.

2. Edmonds-Karp Algorithm

Edmonds-Karp is a specific implementation of the Ford-Fulkerson method where the augmenting path is always
found using Breadth-First Search (BFS). By using BFS, it guarantees that the shortest augmenting path (in terms of
number of edges) is found in each iteration. This seemingly small change leads to a much better worst-case time
complexity.

Edmonds-Karp Algorithm Steps:

Identical to Ford-Fulkerson, but explicitly states that FindAugmentingPath must use BFS.

Complexity of Edmonds-Karp: O(V E²). This is a polynomial time complexity and makes it efficient for many practical problems.
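A compact runnable Python sketch of Edmonds-Karp (Ford-Fulkerson with BFS augmenting paths); the input format (a dict of dicts of edge capacities) is an assumption for illustration:

from collections import deque

def edmonds_karp(capacity, s, t):
    """Max flow from s to t. `capacity` is assumed to be a dict of dicts:
    capacity[u][v] = capacity of edge u -> v (absent means no edge)."""
    # Build a mutable residual-capacity table that includes reverse edges.
    nodes = set(capacity) | {v for u in capacity for v in capacity[u]}
    residual = {u: {v: 0 for v in nodes} for u in nodes}
    for u in capacity:
        for v, c in capacity[u].items():
            residual[u][v] += c

    max_flow = 0
    while True:
        # BFS for the shortest augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in nodes:
                if v not in parent and residual[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:              # no augmenting path left: done
            return max_flow
        # Find the bottleneck along the path, then augment.
        bottleneck, v = float('inf'), t
        while parent[v] is not None:
            u = parent[v]
            bottleneck = min(bottleneck, residual[u][v])
            v = u
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck   # use forward capacity
            residual[v][u] += bottleneck   # add backward ("undo") capacity
            v = u
        max_flow += bottleneck

# Example: edmonds_karp({'s': {'a': 3, 'b': 2}, 'a': {'t': 2}, 'b': {'t': 3}, 't': {}}, 's', 't') -> 4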

3. Dinic's Algorithm (Dinic / Dinic's Algorithm)

Dinic's algorithm is a more advanced and generally much faster algorithm for max flow, especially for dense graphs. It
significantly improves upon Edmonds-Karp by finding multiple augmenting paths simultaneously in layers of the
residual graph.

Key Concepts:

• Level Graph (or Layered Network): A directed acyclic graph (DAG) constructed from the residual graph using
BFS. It only contains edges (u,v) where the shortest path distance from s to v is exactly one greater than the
shortest path distance from s to u.

• Blocking Flow: A flow in the level graph such that every s-t path in the level graph contains at least one
saturated edge (an edge whose residual capacity becomes 0). Dinic's finds a blocking flow in each phase.

Dinic's Algorithm Steps:

1. Initialize Flow: Set flow f(u,v)=0 for all edges.

2. While loop: While a path exists from s to t in the residual graph:


o Construct Level Graph: Perform a BFS from s in the residual graph to determine the "level" (shortest
path distance from s) of each reachable vertex. If t is not reachable, break.

o Find Blocking Flow: Perform multiple DFS traversals from s in the level graph to find augmenting
paths. In each DFS, push as much flow as possible along the path (saturating edges) and backtrack.
Crucially, a DFS path never uses an edge that goes "down" a level (from level i to level i−1). Also, once
an edge is saturated in a DFS, it's removed (or its capacity becomes 0), so subsequent DFS traversals
won't use it in that phase. This step is optimized by using "dead ends" and avoiding re-exploration of
paths that cannot lead to t.

o Add the flow found in this phase to the total maximum flow.

3. Result: The total accumulated flow is the maximum flow.

Complexity of Dinic's:

• General graphs: O(V²E)

• Unit-capacity networks (where all capacities are 1): O(E · min(V^(2/3), E^(1/2))) (very efficient)

• Bipartite matching (a special case of max flow): O(E√V) (extremely efficient)

Applications of Network Flow Algorithms

Network flow algorithms are incredibly versatile and can model a wide range of problems:

• Bipartite Matching: Finding the maximum number of pairs in a bipartite graph.

• Assignment Problems: Assigning tasks to workers to minimize cost or maximize output.

• Project Scheduling: Dependencies between tasks.

• Image Segmentation: Graph cuts are often used in computer vision.

• Airline Scheduling: Assigning crews to flights.

• Minimum Cost Flow: Extending max flow to consider edge costs for flow (e.g., sending required flow at
minimum cost).

• Circulation Problems: Ensuring flow conservation in a network with lower bounds on capacities.

Network flow is a rich area, and understanding Max-Flow Min-Cut and algorithms like Edmonds-Karp and Dinic's is
fundamental for solving many complex optimization problems.
3. (a) Compare and contrast dynamic programming and greedy programming strategies. (5 marks)

Feature: Dynamic Programming vs Greedy Programming

Approach: Dynamic programming solves subproblems and stores their solutions to avoid re-computation, building the solution bottom-up or top-down (with memoization); greedy programming makes a locally optimal choice at each step, with the hope that these choices lead to a globally optimal solution.

Problem Type: Dynamic programming is often used for optimization problems that exhibit overlapping subproblems and optimal substructure; greedy programming is used for optimization problems that exhibit the greedy-choice property and optimal substructure.

Subproblems: Dynamic programming solves all necessary subproblems, even if they are not directly on the path to the optimal solution; greedy programming does not necessarily solve all subproblems, only exploring the path that seems best at the moment.

Guaranteed Optimality: Dynamic programming guarantees an optimal solution if the problem satisfies the two properties; greedy programming does not always guarantee an optimal solution and works only for a specific class of problems.

Complexity: Dynamic programming can be more complex and sometimes has higher time/space complexity due to storing subproblem solutions; greedy programming is generally simpler and faster, when applicable.

Examples: Dynamic programming: Fibonacci sequence, shortest paths (Bellman-Ford, Floyd-Warshall), 0/1 Knapsack, Matrix Chain Multiplication. Greedy: Activity Selection, Huffman Coding, Kruskal's/Prim's MST, Fractional Knapsack.

Contrast:

• Dynamic programming explores multiple possibilities by solving overlapping subproblems and combining
their results to find the overall optimum. It's a more exhaustive approach.

• Greedy programming makes a single, irreversible choice at each step, hoping it will lead to the best overall
outcome. It's a more myopic approach.

3. (b) Compare and contrast Branch and Bound method and backtracking method. (5 marks)

Feature: Branch and Bound vs Backtracking

Goal: Branch and Bound is typically used for optimization problems (finding the best solution among many); Backtracking is typically used for finding all solutions, or a feasible solution, to a decision problem.

Pruning: Branch and Bound uses bounding functions to prune branches that cannot lead to a better solution than the current best-known solution; Backtracking prunes branches when it determines that the current partial solution cannot lead to a valid/complete solution.

Search Order: Branch and Bound often uses a breadth-first (BFS) or least-cost search strategy, but can also use DFS, and prioritizes exploring promising nodes; Backtracking typically uses a depth-first search (DFS) approach.

State Space: Branch and Bound explores the state-space tree, often maintaining a lower/upper bound on the optimal solution; Backtracking explores the state-space tree, trying to build a solution incrementally.

Solution: Branch and Bound aims to find the global optimum; Backtracking aims to find any (or all) valid solutions.

Examples: Branch and Bound: Traveling Salesperson Problem (TSP), 0/1 Knapsack (finding the optimum), Job Assignment. Backtracking: N-Queens, Sudoku Solver, Subset Sum (finding a subset), Hamiltonian Cycle.

Contrast:

• Objective: Branch and Bound is inherently focused on optimization by finding the best solution, while
Backtracking is more general, often used for finding any (or all) valid solutions.

• Pruning Mechanism: Branch and Bound uses calculated bounds to discard suboptimal paths. Backtracking
prunes when a path is determined to be invalid or incomplete.

• Search Strategy: While both explore a state-space tree, Branch and Bound often uses a more informed
search (like least-cost) to speed up finding the optimum, whereas Backtracking is typically a pure DFS.

3. (c) What do you understand by Heuristic characteristics? Also write their application domains. (5 marks)

Heuristic Characteristics:

A heuristic is a technique designed for solving a problem more quickly when classic methods are too slow, or for
finding an approximate solution when finding an exact solution is too computationally expensive or impossible.
Heuristics are not guaranteed to be optimal or even correct, but they are often good enough and practical.

Key characteristics of heuristics:

• Not guaranteed optimality: A heuristic does not guarantee finding the best (optimal) solution.

• Not guaranteed correctness: It might not always find a correct solution, especially in complex or adversarial
scenarios.

• Speed/Efficiency: The primary goal is to find a "good enough" solution quickly, especially for computationally
hard problems.

• Exploration: They guide the search process by making "educated guesses" or rules of thumb.

• Problem-specific: Heuristics are often designed for a particular problem domain and may not be easily
transferable.

• Approximation: They often provide approximate solutions to exact problems.

• Trade-off: There's a trade-off between the quality of the solution and the time/resources spent finding it.

Application Domains:

Heuristics are widely applied in various fields, especially in Artificial Intelligence and Operations Research:

1. Artificial Intelligence (AI):

o Search Algorithms: Used extensively in informed search algorithms like A* search, Best-First Search,
etc., to guide the search towards the goal in state-space graphs (e.g., in pathfinding, game playing
like chess).

o Expert Systems: Rules of thumb used by domain experts are often implemented as heuristics.

o Machine Learning: Heuristic functions can be used in feature selection, regularization, or even in
defining the loss functions.

2. Optimization Problems:

o Combinatorial Optimization: For NP-hard problems like the Traveling Salesperson Problem (TSP),
Vehicle Routing Problem (VRP), Knapsack Problem (for larger instances), heuristics (e.g., nearest
neighbor, genetic algorithms, simulated annealing) are used to find good approximate solutions.
o Scheduling: Allocating resources (e.g., jobs to machines, tasks to processors) efficiently.

3. Operations Research:

o Logistics and Supply Chain Management: Optimizing routes, inventory management, facility
location.

o Resource Allocation: Distributing limited resources effectively.

4. Computer Science:

o Compilers: Heuristics are used in compiler optimization (e.g., register allocation, instruction
scheduling).

o Operating Systems: Scheduling processes, memory management.

o Network Routing: Determining the best paths for data packets in a network.

o Cryptography: Design of cryptographic algorithms often involves heuristic choices.
