[COSE213 Data Structures]
Lecture 17: Minimum Spanning Trees
Yuseok Jeon
ys_jeon@korea.ac.kr
Data Structures Contents
Lecture 1: Introduction
Lecture 2: C++
Lecture 3: Array and Linked Lists
Lecture 4: Analysis of Algorithms
Lecture 5: Stacks
Lecture 6: Queues
Lecture 7: Vector, List, and Sequence
Lecture 8: Trees
Lecture 9: Priority Queues / Heaps
Lecture 10: Maps – Hash Table
Lecture 11: Maps - Skip Lists
Lecture 12: Binary Search Trees and AVL
Lecture 13: (2,4) Trees
Lecture 14: Red-Black Trees
Lecture 15: Graph Basics, DFS, BFS
Lecture 16: Graph – Shortest Path
Lecture 17: Minimum Spanning Trees
Lecture 18: Sorting - Merge
Lecture 19: Sorting - Quick
Minimum Spanning Trees
Spanning subgraph
 Subgraph of a graph G containing all the vertices of G
Spanning tree
 Spanning subgraph that is itself a (free) tree
Minimum spanning tree (MST)
 Spanning tree of a weighted graph with minimum total edge weight
• Applications
 Communications networks
 Transportation networks
[Figure: weighted graph on the airports ORD, PIT, DEN, STL, DCA, DFW, ATL]
Cycle Property
Cycle Property:
 Let T be a minimum spanning tree of a weighted graph G
 Let e be an edge of G that is not in T and let C be the cycle formed by e with T
 For every edge f of C, weight(f) ≤ weight(e)
Proof:
 By contradiction
 If weight(f) > weight(e), we can get a spanning tree of smaller weight by replacing f with e
[Figure: replacing f with e yields a better spanning tree]
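The cycle property can be checked directly on a small example. The sketch below uses a made-up four-vertex graph and an MST found by inspection (the vertex names and weights are illustrative assumptions, not the slide's figure): for each non-tree edge e, every tree edge on the cycle that e closes must weigh no more than e.

```python
# Made-up four-vertex graph and an MST found by inspection
# (these names and weights are illustrative, not the slides' figure).
edges = [("a", "b", 2), ("b", "c", 3), ("a", "c", 8),
         ("c", "d", 4), ("b", "d", 9)]
mst = {("a", "b", 2), ("b", "c", 3), ("c", "d", 4)}

def tree_path(tree, src, dst):
    """Return the edges on the unique src-dst path in a tree (DFS)."""
    adj = {}
    for u, v, w in tree:
        adj.setdefault(u, []).append((v, (u, v, w)))
        adj.setdefault(v, []).append((u, (u, v, w)))

    def dfs(node, visited):
        if node == dst:
            return []
        visited.add(node)
        for nxt, e in adj.get(node, []):
            if nxt not in visited:
                rest = dfs(nxt, visited)
                if rest is not None:
                    return [e] + rest
        return None

    return dfs(src, set())

# Each non-tree edge e closes a cycle with the MST; the cycle property
# says every tree edge f on that cycle has weight(f) <= weight(e).
for u, v, w in edges:
    if (u, v, w) not in mst:
        assert all(wf <= w for (_, _, wf) in tree_path(mst, u, v))
```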
Partition Property
Partition Property:
 Consider a partition of the vertices of G into subsets U and V
 Let e be an edge of minimum weight across the partition
 There is a minimum spanning tree of G containing edge e
Proof:
 Let T be an MST of G
 If T does not contain e, consider the cycle C formed by e with T and let f be an edge of C across the partition
 By the cycle property, weight(f) ≤ weight(e)
 Thus, weight(f) = weight(e)
 We obtain another MST by replacing f with e
[Figure: replacing f with e yields another MST]
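The partition property can likewise be verified by brute force on a tiny made-up graph: forcing the minimum-weight crossing edge into the tree does not increase the best achievable total weight. The graph, vertex names, and helper functions below are illustrative assumptions, not the slide's figure.

```python
from itertools import combinations

# Hypothetical 4-vertex graph (not the slide's figure).
verts = ["a", "b", "c", "d"]
edges = [("a", "b", 2), ("b", "c", 3), ("a", "c", 8),
         ("c", "d", 4), ("b", "d", 9)]

def is_spanning_tree(tree):
    """n - 1 edges form a spanning tree iff they connect all vertices."""
    if len(tree) != len(verts) - 1:
        return False
    reached, frontier = set(), {verts[0]}
    while frontier:
        x = frontier.pop()
        reached.add(x)
        for u, v, _ in tree:
            if u == x and v not in reached:
                frontier.add(v)
            if v == x and u not in reached:
                frontier.add(u)
    return reached == set(verts)

def mst_weight(required=None):
    """Brute-force the cheapest spanning tree, optionally forcing an edge."""
    best = float("inf")
    for cand in combinations(edges, len(verts) - 1):
        if required is not None and required not in cand:
            continue
        if is_spanning_tree(set(cand)):
            best = min(best, sum(w for _, _, w in cand))
    return best

# Partition U = {a, b}, V = {c, d}; crossing edges weigh 3, 8, 9.
# The minimum crossing edge ("b","c",3) belongs to some MST:
assert mst_weight(required=("b", "c", 3)) == mst_weight()
```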
Kruskal’s Algorithm
Kruskal’s Algorithm: Example
[Figure: Kruskal’s algorithm on a graph with vertices A–H and edge weights 1–11; edges are examined in increasing weight order]
Example (contd.)
[Figure: four steps later; edges that would close a cycle are skipped until the tree has n − 1 edges]
Kruskal’s Algorithm
Maintain a partition of the vertices into clusters
 Initially, single-vertex clusters
 Keep an MST for each cluster
 Merge “closest” clusters and their MSTs
A priority queue stores the edges outside clusters
 Key: weight
 Element: edge
At the end of the algorithm
 One cluster and one MST (if connected)

Algorithm KruskalMST(G)
  for each vertex v in G do
    Create a cluster consisting of v
  let Q be a priority queue
  Insert all edges into Q
  T ← ∅  {T is the union of the MSTs of the clusters}
  while T has fewer than n − 1 edges do
    e ← Q.removeMin().getValue()
    [u, v] ← G.endVertices(e)
    A ← getCluster(u)
    B ← getCluster(v)
    if A ≠ B then
      Add edge e to T
      mergeClusters(A, B)
  return T
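The pseudocode maps fairly directly to Python. In the sketch below, clusters are plain sets shared by their members and heapq plays the role of the priority queue Q; the function name and the (weight, u, v) edge format are assumptions for illustration.

```python
import heapq

def kruskal_mst(vertices, edges):
    """Sketch of KruskalMST. Clusters are Python sets shared by their
    members; heapq serves as the priority queue of edges keyed by weight."""
    cluster = {v: {v} for v in vertices}    # single-vertex clusters
    pq = list(edges)
    heapq.heapify(pq)                       # insert all edges into Q
    mst = []                                # T <- empty
    while pq and len(mst) < len(vertices) - 1:
        w, u, v = heapq.heappop(pq)         # minimum-weight edge
        a, b = cluster[u], cluster[v]
        if a is not b:                      # endpoints in distinct clusters
            mst.append((w, u, v))
            a.update(b)                     # merge cluster B into A...
            for x in b:
                cluster[x] = a              # ...and redirect B's members
    return mst
```

For example, on a triangle with weights 1, 2, 3 the two lightest edges are accepted and the heaviest, which would close a cycle, is never reached.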
Data Structure for Kruskal’s Algorithm
• The algorithm maintains a forest of trees
• A priority queue extracts the edges by increasing weight
• An edge is accepted if it connects distinct trees
• We need a data structure that maintains a partition, i.e., a collection of disjoint sets
To do this, we need a data structure for a set
These are covered in Ch. 11.4 (Page 533)
Set
• A set is a collection of unique elements; no duplicate elements are allowed
• We represent a set by the sorted sequence of its elements
• The basic set operations:
union
intersection
subtraction
• We consider
Sequence-based implementation
Example: Storing a Set in a Sorted List
• We can implement a set with a list
• Elements are sorted according to some canonical ordering
• The space used is O(n)
[Figure: a list whose nodes store the set elements in sorted order]
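With the sorted-sequence representation, union, intersection, and subtraction all reduce to one generic merge of two sorted lists, along the lines of the sketch below. The flag-based factoring is one possible design, not necessarily the textbook's; each operation runs in O(|A| + |B|) time.

```python
def merge(a, b, keep_a_only, keep_b_only, keep_both):
    """Generic merge of two sorted lists; the flags choose which
    elements survive, yielding union / intersection / subtraction."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            if keep_a_only:
                out.append(a[i])
            i += 1
        elif a[i] > b[j]:
            if keep_b_only:
                out.append(b[j])
            j += 1
        else:                       # element in both lists
            if keep_both:
                out.append(a[i])
            i += 1
            j += 1
    if keep_a_only:
        out.extend(a[i:])           # leftovers of A
    if keep_b_only:
        out.extend(b[j:])           # leftovers of B
    return out

def union(a, b):        return merge(a, b, True, True, True)
def intersection(a, b): return merge(a, b, False, False, True)
def subtraction(a, b):  return merge(a, b, True, False, False)
```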
Partitions with Union-Find Operations
• Partition: A collection of disjoint sets
• Partition ADT needs to support the following functions:
 makeSet(x): Create a singleton set containing the element x and return the position storing x in this set
  Singleton: a set that contains exactly one element
 union(A,B): Return the set A ∪ B, destroying the old A and B
 find(p): Return the set containing the element at position p
List-based Partition (1)
• Each set is stored in a sequence (e.g., a list)
• Partition: A collection of sequences
• Each element has a reference back to its set
 Operation find(u) takes O(1) time and returns the set of which u is a member
 Operation union(A,B): we move the elements of the smaller set to the sequence of the larger set and update their references
 Time for operation union(A,B) is O(min(|A|, |B|))
 Worst case: O(n) for one union operation
List-based Partition (2)
• What about “amortized analysis”? (Page 539)
• Clearly, n makeSet and n find operations take O(n) time
• Union operations
 Each time we move a position from one set to another, the size of the set containing it at least doubles
 Thus, each position is moved from one set to another at most log n times
 Assuming the partition is initially empty, there are O(n) different elements referenced in the given series of operations, so the total time for all the union operations is O(n log n)
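A minimal Python sketch of this list-based partition might look as follows: find is a dictionary lookup, and union always moves the smaller sequence, which is exactly the step the doubling argument charges for. The class and method names are made up for illustration.

```python
class ListPartition:
    """Sketch of a list-based partition (hypothetical class name).
    Each element keeps a back-reference to its sequence, so find is
    O(1); union moves the smaller sequence into the larger one."""

    def __init__(self):
        self.set_of = {}              # element -> the list holding it

    def make_set(self, x):
        """Create a singleton set containing x."""
        self.set_of[x] = [x]
        return self.set_of[x]

    def find(self, x):
        """Return the set containing x in O(1) time."""
        return self.set_of[x]

    def union(self, a, b):
        """Merge sets a and b, moving the smaller into the larger."""
        if a is b:
            return a
        if len(a) < len(b):
            a, b = b, a               # always move the smaller set
        for x in b:
            a.append(x)
            self.set_of[x] = a        # update the back-reference
        b.clear()                     # the old smaller set is destroyed
        return a
```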
Partition-Based Implementation
• Partition-based version of Kruskal’s Algorithm
 Cluster merges as unions
 Cluster locations as finds
• Running time O((n + m) log n)
 PQ operations: O(m log n)
  PQ initialization: O(m log m)
  Each while-loop iteration: O(log m) = O(log n)
 UF operations: O(n log n)

Algorithm KruskalMST(G)
  Initialize a partition P
  for each vertex v in G do
    P.makeSet(v)
  let Q be a priority queue
  Insert all edges into Q
  T ← ∅  {T is the union of the MSTs of the clusters}
  while T has fewer than n − 1 edges do
    e ← Q.removeMin().getValue()
    [u, v] ← G.endVertices(e)
    A ← P.find(u)
    B ← P.find(v)
    if A ≠ B then
      Add edge e to T
      P.union(A, B)
  return T
Prim-Jarnik’s Algorithm
Example
[Figure: Prim-Jarnik’s algorithm run on a weighted graph; the source starts with label 0, all other vertices with ∞, and labels shrink as the cloud grows]
Example (contd.)
[Figure: the final steps of the run; each remaining vertex is added with the smallest-weight edge connecting it to the cloud]
Prim-Jarnik’s Algorithm
• Similar to Dijkstra’s algorithm
• We pick an arbitrary vertex s and grow the MST as a cloud of vertices, starting from s
• We store with each vertex v a label d(v) representing the smallest weight of an edge connecting v to a vertex in the cloud
 cf. Dijkstra’s: we store with each vertex v a label d(v) representing the distance of v from s in the subgraph consisting of the cloud and its adjacent vertices
• At each step:
 We add to the cloud the vertex u outside the cloud with the smallest distance label, d(u)
 We update the labels of the vertices adjacent to u
Prim-Jarnik’s Algorithm (cont.)
• A heap-based adaptable priority queue with location-aware entries stores the vertices outside the cloud
 Key: distance
 Value: vertex
 Recall that method replaceKey(l,k) changes the key of entry l
• We store three labels with each vertex:
 Distance
 Parent edge in MST
 Entry in priority queue

Algorithm PrimJarnikMST(G)
  Q ← new heap-based priority queue
  s ← a vertex of G
  for all v ∈ G.vertices()
    if v = s
      v.setDistance(0)
    else
      v.setDistance(∞)
    v.setParent(∅)
    l ← Q.insert(v.getDistance(), v)
    v.setLocator(l)
  while ¬Q.empty()
    l ← Q.removeMin()
    u ← l.getValue()
    for all e ∈ u.incidentEdges()
      z ← e.opposite(u)
      r ← e.weight()
      if r < z.getDistance()
        z.setDistance(r)
        z.setParent(e)
        Q.replaceKey(z.getEntry(), r)
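A Python version of the algorithm could look like the sketch below. Python’s heapq offers no replaceKey, so instead of an adaptable priority queue this sketch pushes a fresh entry on each improvement and skips stale ones (lazy deletion); the adjacency-map input format and function name are assumptions.

```python
import heapq

def prim_jarnik_mst(adj, s):
    """Prim-Jarnik sketch. `adj` maps each vertex to a list of
    (neighbor, weight) pairs. heapq has no replaceKey, so duplicates
    are pushed and vertices already in the cloud are skipped."""
    dist = {v: float("inf") for v in adj}
    parent = {v: None for v in adj}
    dist[s] = 0
    pq = [(0, s)]
    in_cloud = set()
    while pq:
        d, u = heapq.heappop(pq)
        if u in in_cloud:
            continue                    # stale duplicate entry
        in_cloud.add(u)
        for z, w in adj[u]:
            if z not in in_cloud and w < dist[z]:
                dist[z] = w             # smallest edge into the cloud
                parent[z] = u
                heapq.heappush(pq, (w, z))
    return parent, dist
```

Note the relaxation uses the edge weight alone, not a path length: that single line is what separates this from Dijkstra’s algorithm.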
Prim-Jarnik’s vs. Dijkstra’s algorithm

Algorithm PrimJarnikMST(G)
  Q ← new heap-based priority queue
  s ← a vertex of G
  for all v ∈ G.vertices()
    if v = s
      v.setDistance(0)
    else
      v.setDistance(∞)
    v.setParent(∅)
    l ← Q.insert(v.getDistance(), v)
    v.setLocator(l)
  while ¬Q.empty()
    l ← Q.removeMin()
    u ← l.getValue()
    for all e ∈ u.incidentEdges()
      z ← e.opposite(u)
      r ← e.weight()
      if r < z.getDistance()
        z.setDistance(r)
        z.setParent(e)
        Q.replaceKey(z.getEntry(), r)

Algorithm DijkstraDistances(G, s)
  Q ← new heap-based priority queue
  for all v ∈ G.vertices()
    if v = s
      v.setDistance(0)
    else
      v.setDistance(∞)
    l ← Q.insert(v.getDistance(), v)
    v.setEntry(l)
  while ¬Q.empty()
    l ← Q.removeMin()
    u ← l.getValue()  {take out the closest node}
    for all e ∈ u.incidentEdges()  {relax e}
      z ← e.opposite(u)
      r ← u.getDistance() + e.weight()
      if r < z.getDistance()
        z.setDistance(r)
        Q.replaceKey(z.getEntry(), r)
Analysis
Graph operations
 Method incidentEdges is called once for each vertex
Label operations
 We set/get the distance, parent, and locator labels of vertex z O(deg(z)) times
 Setting/getting a label takes O(1) time
Priority queue operations
 Each vertex is inserted into and removed from the priority queue once, and each insertion or removal takes O(log n) time (O(n log n) total)
 The key of a vertex w in the priority queue is modified at most deg(w) times, and each key change takes O(log n) time (O(m log n) total)
Prim-Jarnik’s algorithm runs in O((n + m) log n) time provided the graph is represented by the adjacency list structure
 Recall that Σv deg(v) = 2m
 The running time is O(m log n) since the graph is connected
Questions?