6.
006- Introduction to
Algorithms
Lecture 12 – Graph Algorithms
Prof. Manolis Kellis
CLRS 22.2-22.3
Combinatorics
Ponytail
No
ponytail
Beard
Erik
?
No
Beard
Piotr
Manolis
Unit #4 – Games, Graphs, Searching, Networks
3
Unit #4 Overview: Searching
Today: Introduction to Games and Graphs
• Rubik’s cube, Pocket cube, Game space
• Graph definitions, representation, searching
Tuesday: Graph algorithms and analysis
• Breadth First Search, Depth First Search
• Queues, Stacks, Augmentation, Topological sort
Thursday: Networks in biology and real world
• Network/node properties, metrics, motifs, clusters
• Dynamic processes, epidemics, growth, resilience
Last time: Games and Graphs
Pocket Cube
• 2 × 2 × 2 Rubik’s cube
• Start with any colors
• Moves are quarter
turns of any face
• “Solve” by making
each side one color
Searching for a solution path
27 two-away
6 neighbors
1 turn
How big is the space?
• Graph algorithms allow us explore space
– Nodes: configurations
– Edges: moves between them
– Paths to ‘solved’ configuration: solutions
• G=(V,E)
Graphs • V a set of vertices
Usually number denoted by n
• E ⊆ V × V a set of edges (pairs of vertices)
Usually number denoted by m
Note m ≤ n(n-1) = O(n2)
Undirected example Directed example
b
a
a
d
c b c
• V={a,b,c,d}
• V = {a,b,c}
• E={{a,b}, {a,c}, {b,c},
{b,d}, {c,d}} • E = {(a,c), (a,b) (b,c), (c,b)}
Graph Representation
• Adjacency lists • Incidence lists
(a,c) (a,b) /
a c b / a
b c / b (b,c) /
c c (c,b) /
b /
• Adjacency matrix • Implicit representation
a (1) b (2) c (3)
a
Neighbors(a)
[c,b]
0 1 1 a
(1)
Neighbors(b)
[b]
0 0 1 b (2)
0 1 0 c (3)
Neighbors(c)
[b]
b
c
Today: Searching graphs
?
Searching Graph
• We want to get from current Rubik state to
“solved” state
• How do we explore?
Breadth First Search
• start with vertex v
• list all its neighbors (distance 1)
• then all their neighbors (distance 2)
• etc.
• algorithm starting at s:
define frontier F
initially F={s}
repeat F=all neighbors of vertices in F
until all vertices found
Depth First Search
• Like exploring a maze
• From current vertex, move to another
• Until you get stuck
• Then backtrack till you find a new place to
explore
• Exploring a maze • “left-hand” rule
How to handle cycles: BFS/DFS
• What happens if unknowingly revisit a vertex?
Will eventually happen if graph contains a cycle
• BFS: get wrong notion of distance
• DFS: may get in circles
• Solution: mark vertices
BFS: if you’ve seen it before, ignore
DFS: if you’ve seen it before, back up
Breadth First Search (BFS)
BFS algorithm outline
• Initial vertex s
Level 0
s
• For i=1,…
grow level i Level
3
Find all neighbors of level i-1
vertices
Level
2
(except those already seen)
Level
1
i.e. level i contains vertices
reachable via a path of i edges
and no fewer
BFS example
1 0 2
a s d f 3
1
2
z x c v 3
2
z
a
d
f
s
x
c
v
BFS algorithm outline
• Initial vertex s
Level 0
• For i=1,…
grow level i s
Find all neighbors of level i-1
(except those already seen) Level
3
i.e. level i contains vertices
reachable via a path of i edges Level
2
and no fewer Level
1
• Where can the other edges of the graph be?
They cannot jump a layer (otherwise v would be in Level 2)
But they can be between nodes in same or adjacent levels
The ‘frontier’ of BFS exploration
1 0 2
a s d f 3
1
2
z x c v 3
2
z
The
only
edges
a
d
f
not
traversed
by
BFS
s
link
verHces
x
c
v
within
the
same
level
BFS Algorithm
• BFS(V,Adj,s)
level={s: 0}; parent = {s: None}; i=1
frontier=[s] #previous level, i-1
while frontier
next=[] #next level, i
for u in frontier
for v in Adj[u]
if v not in level #not yet seen
level[v] = i #level of u+1
parent[v] = u
next.append(v)
frontier = next
i += 1
BFS Analysis: Runtime
• Naïve analysis: outer loop |V| * inner loop |V|
• Vertex v appears at the frontier at most once
Since then it has a level
And nodes with a level aren’t added again
Total time spent adding nodes to frontier O(n)
• Adj[v] only scanned once
Just when v is in frontier
Total time ∑vAdj[v]
| |
• This sum counts each “outgoing” edge
• So O(m) time spend scanning adjacency lists
• Total: O(m+n) time --- “Linear time”
For sparse graphs |V|+|E| is much better than |V|2
BFS Analysis: Correctness
i.e. why are all nodes reachable from s explored?
(we’ll actually prove a stronger claim)
• Claim: If there is a path of L edges from s to v,
then v is added to next when i=L or before
• Proof: induction
Base case: s is added before setting i=1
Inductive step when i=L:
• Consider path of length L from s to v
• This must contain: (1) a path of length L-1 from s to u
• (2) and an edge (u,v) from u to v
By inductive hypothesis, u was added to next
when i=L-1 or before
• If v has not already been inserted in next before i=L,
then it gets added during the scan of Adj[u] at i=L
So it happens when i=L or before. QED
Corrollary: BFSShortest Paths
• From correctness analysis, conclude more:
Level[v] is length of shortest sv path
• Parent pointers form a shortest paths tree
i.e. the union of shortest paths to all vertices
• To find shortest path from s to v
Follow parent pointers from v backwards
Will end up at s
Depth First Search (DFS)
DFS Algorithm Outline
• Explore a maze
Follow path until you get stuck
Backtrack along breadcrumbs till find new exit
i.e. recursively explore
DFS Algorithm
• parent = {s: None}
• call DFS-visit (V, Adj, s)
def DFS-visit (V, Adj, u)
for v in Adj[u]
if v not in parent #not yet seen
parent[v] = u
DFS-visit (V, Adj, v) #recurse!
DFS example run (starting from s)
1
(in
tree)
s
5
(forward
edge)
a
2
(in
tree)
3
(in
tree)
c
7
(cross
edge)
b
d
b
c
a
s
d
DFS Runtime Analysis
• Quite similar to BFS
• DFS-visit only called once per vertex v
Since next time v is in parent set
• Edge list of v scanned only once (in that call)
• So time in DFS-visit is:
1 per vertex + 1 per edge
• So time is O(n+m)
DFS Correctness?
• Trickier than BFS
• Can use induction on length of shortest path from
starting vertex
Inductive Hypothesis:
“each vertex at distance k is visited (eventually)”
Induction Step:
• Suppose vertex v at distance k.
Then some u at shortest distance k-1 with edge (u,v)
Can decompose into su at shortest distance k-1, and (u,v)
• By inductive hypothesis: u is visited (eventually)
• By algorithm: every edge out of u is checked
If v wasn’t previously visited, it gets visited from u (eventually)
Edge Classification
• Tree edge used to get to new child
• Back edge leads from node to ancestor in tree
• Forward edge leads to descendant in tree
• Cross edge leads to a different subtree
• To label what edge is of what type, keep global
time counter and store interval during which
vertex is on recursion stack
tree
edge
Back
edge
Cross
edge
Forward
edge
BFS vs. DFS
The ‘frontier’ of BFS exploration
1 0 2
a s d f 3
1
2
z x c v 3
2
z
The
only
edges
a
d
f
not
traversed
by
BFS
s
link
verHces
x
c
v
within
the
same
level
The tree of DFS exploration
1
(in
tree)
s
5
(forward
edge)
a
2
(in
tree)
3
(in
tree)
c
7
(cross
edge)
b
d
b
c
a
s
d
BFS/DFS Algorithm Summary
• Maintain “todo list” of vertices to be scanned
• Until list is empty
Take a vertex v from front of list
Mark it scanned
Examine all outgoing edges (v,u)
If u not marked, add to the todo list
• BFS: add to end of todo list (queue: FIFO)
• DFS: add to front of todo list (recursion stack: LIFO)
Data structures: Queues and Stacks
• BFS queue is explicit
Created in pieces
(level 0 vertices) . (level 1 vertices) . (level 2 vert…
the frontier at iteration i is piece i of vertices in queue
• DFS stack is implicit
It’s the call stack of the python interpreter
From v, recurse on one child at a time
But same order if put all children on stack, then pull
off (and recurse) one at a time
Runtime Summary
• Each vertex scanned once
When scanned, marked
If marked, not (re)added to todo list
Constant work per vertex
• Removing from queue
• Marking
O(n) total
• Each edge scanned once
When tail vertex of edge is scanned
Constant work per edge (checking mark on head)
O(m) total
• In all, O(n+m), linear in the ‘size’ of the graph
Back to our game graphs
So, how do we solve
the 2x2 Rubik’s cube?
Searching for a solution path
27 two-away
6 neighbors
1 turn
How big is the space?
• Graph algorithms allow us explore space
– Nodes: configurations
– Edges: moves between them
– Paths to ‘solved’ configuration: solutions
Tradeoffs and Applications
• BFS:
Solving Rubik’s cube?
BFS gives shortest solution
• DFS:
Robot exploring a building?
Robot can trace out the exploration path
Just drops markers behind
Unit #4 Overview: Searching
Today: Introduction to Games and Graphs
• Rubik’s cube, Pocket cube, Game space
• Graph definitions, representation, searching
Tuesday: Graph algorithms and analysis
• Breadth First Search, Depth First Search
• Queues, Stacks, Augmentation, Topological sort
Thursday: Networks in biology and real world
• Network/node properties, metrics, motifs, clusters
• Dynamic processes, epidemics, growth, resilience
Unit #4 – Games, Graphs, Searching, Networks
41