Lecture 25:
Placement and Routing
Slides courtesy of Deming Chen
Some slides courtesy of Prof. J. Cong and N. Carter
Outline
Overview
– The placement problem
– Partitioning-based placement
– The routing problem
– Global routing and detailed routing
Reference: Practical Problems in VLSI Physical
Design Automation, S. K. Lim, Springer, 2008.
ECE 425 Intro. VLSI System Design Slide 2
The Placement Problem
Placement is to find locations for the cells in a netlist
such that they
– Fit within the available space
– The router can make all of the necessary
connections between cells
Placement and routing are inherently connected
– Standard CAD tools do them in separate passes
to simplify the algorithms
– Algorithms that do placement and routing
simultaneously to improve performance are an
active research area
ECE 425 Intro. VLSI System Design Slide 3
What Makes a Placement Good?
Routability -- if you can’t route the required connections, design
won’t work
Performance -- placement affects wire delays, which affects
performance
Density -- what percentage of the available area can the placer
use while still maintaining routability
– ~70% for current tools, depending on design
Physical Issues -- May want to distribute power dissipation/heat
generation/clock load across the chip
Fortunately, designs that have good routability often have good
performance, because placing connected modules close
together improves both metrics
ECE 425 Intro. VLSI System Design Slide 4
Placement Methods
Constructive methods.
– Cluster growth algorithm
– Force-directed method
– Algorithm by Goto
– Min-cut based method
II. Iterative improvement approaches
– Pairwise exchange
– Simulated annealing- Timberwolf
– Genetic algorithm
III. Analytical methods
ECE 425 Intro. VLSI System Design Slide 5
Interconnection Cost
(a) Steiner Tree (b) Steiner Tree with Trunk (c) Minimum Spanning Tree
Rectilinear Length = 14 Rectilinear Length = 15 Rectilinear Length = 16
Approximation: half perimeter
of the bounding box
(d) Chain (e) Complete Graph
Rectilinear Length = 17 Rectilinear Length = 42
ECE 425 Intro. VLSI System Design Slide 6
Measure of Congestion (Routing
Area)
D B C A
E F G H
(a) Two Tracks Required. All connections routed
A B C D
E F G H
(b) Shorter Wire Length. Three Tracks Required.
A failure occurs if only two tracks are available
ECE 425 Intro. VLSI System Design Slide 7
Min-Cut Based Placement
“ A procedure for placement of standard-call VLSI circuits”
A.E. Dunlop, B.W. Kernighan, IEEE Trans. on CAD, vol CAD-4 No.1, Jan
1985
Min-cut with terminal propagation
½n ½n
minimize minimize
ECE 425 Intro. VLSI System Design Slide 8
Min-Cut Based Placement
(Cont’d)
This process continues until there are only a
few cells in each group( 6 )
each group
has 6 cells
Assign cells in each
group close together in
the same row or nearly
in adjacent rows
group: smallest partition
ECE 425 Intro. VLSI System Design Slide 9
Terminal Propagation
net s Prefer to have
all of them in R1
L1 L1 R1
R
L2 L2
R2
We should use the fact that s is in L1!
Fictitious cell
Assuming of net s
located at p p
p L1 R1 L1 R1
center L1 R1
L2 R2 L2 R2
L2 R2
higher cost lower cost
p will stay in R1 for the rest of partitioning
ECE 425 Intro. VLSI System Design Slide 10
Terminal Propagation
When not to use p to bias partitioning
p In this case, p should not be
L R used to bias the solution in
either direction
In general
p
p
h 1/3h 1/3h
Do not use p use p
Net s has cells in many groups:
P2
Minimum cost P2 should be ignored !
R
rectilinear sterner P3 too close to the partition line
tree L P3
Intro. VLSI System Design Slide 11
ECE 425
Terminal Propagation (Cont’d)
Terminal propagation reduce overall area by ~30%
Creating Rows
Row 1 C1 C2 C3 cells in C1 row1
Row 2
Row 3 cells in C3 row1
a
Row 4 cells in C2 C2 b
Row 1 Row 2 a+b=1
Choose a and b preferably to balance row to
balance row length (During re-arrangement )
ECE 425 Intro. VLSI System Design Slide 12
Creating Rows
1 1 1 1,2
1,2 1,2
1,2 2
2 2,3 2,3 a four-row
2,3
2,3 standard cell
3 3 3 design
3,4 3,4 3,4 3,4
4 4 4
4
5 5 4,5 4,5
5 5 5 5
Partitioning of circuit into 32 groups. Each group is
either assigned to a single row or divided into 2 rows
ECE 425 Intro. VLSI System Design Slide 13
Experimental Results
CMOS Chip with 453 nets and 412 cells
Manual Solution:
track density=147;
without terminal propagation: t.d.=313;
(t.d. reduced to 235 by iterative interchanges)
with terminal propagation: t.d.=186;
(t.d. reduced to 152 by iterative interchanges)\
CPU time=3230 secs VAX 11/780
Iterative Interchange Refinement is helpful
ECE 425 Intro. VLSI System Design Slide 14
Remarks on Min-Cut Based
Placement
Also implemented F-M partitioning method. Much faster
but solutions appeared to be not as good as K-L
1. Use S-A to do partitioning. Much slower. If restricted to
a reasonable CPU time, solutions are of similar quality
of those by F-M method. Easy to implement
2. Seeking an elegant way to force some cells to be in
particular positions
3. Investigate other algorithms for terminal propagation.
Terminal propagation is the bottleneck of CPU time
ECE 425 Intro. VLSI System Design Slide 15
The Routing Problem
Given a set of cells, a set of pin locations on each
cell, and a set of nets connecting cells, connect the
cell pins with wires to implement the nets
– May also want to achieve other goals:
• Minimize overall area
• Minimize length of each wire
This is a hard problem -- routing can consume more
area than actual cells
The routing problem on modern chips involves
managing several metal layers and assigning wires
to them
ECE 425 Intro. VLSI System Design Slide 16
Global Routing Formulation
Given (i) Placement of blocks/cells
(ii) channel capacities
Determine
Routing topology of each net
Optimize
(i) max # nets routed
(ii) min routing area
(iii) min total wirelength
In general cell design or standard cell
design, we are able to move blocks or
cell rows, so we can guarantee
connections of all the nets.
In gate array design, exceeding
channel capacity is not allowed.
ECE 425 Intro. VLSI System Design Slide 17
Routing
First, divide the chip into regions of the type that the
router can handle
Then, use a global router to assign each net to a set
of regions
Finally, use a detailed router to lay out wires within
the regions assigned to each net
ECE 425 Intro. VLSI System Design Slide 18
Routing Algorithms
Many different types of routers have been developed
ECE 425 Intro. VLSI System Design Slide 19
Global Routing
Can work on channel-graph
Edges in the channel graph represent channels or
subsections of channels
– Weights on edges represent the capacity of the
channel
Vertices represent intersections between channels
Goal of global router is to assign wires to channels
without exceeding any capacities
– Detailed router worries about wire position within
each channel
ECE 425 Intro. VLSI System Design Slide 20
One Example
ECE 425 Intro. VLSI System Design Slide 21
Global Routing Algorithms
The problem is to find a suitable path for each connection
through the weighted graph
Ideally, we would like to find minimum-cost trees, but:
– This is NP-complete
– Does not model net interactions, required to control
congestion.
Instead, heuristic techniques are used, usually consisting of
two phases:
– An Initial Global Route Technique: use shortest path
algorithms to find some routing (may be congested).
– An Iterative Improvement Global Route Technique: select
and re-route connections to reduce congestion.
ECE 425 Intro. VLSI System Design Slide 22
Detailed Routers
Once nets have been assigned to routing regions, use a
detailed router to find exact wire locations
We’ll look at three detailed routers
– Maze routers
– Line-probe routers
– Channel Routers
Maze and line-probe routers can handle arbitrary geometries
Channel routers can only handle rectangular channels
ECE 425 Intro. VLSI System Design Slide 23
Maze Routers
Oldest kind of router, first used for printed circuit boards
Operate on a grid graph, where the surface is represented as
an array of cells
ECE 425 Intro. VLSI System Design Slide 24
Maze Routers -- the Lee
Algorithm
Always finds a shortest path if one exits
Starting at the source node, explore out in a
breadth-first fashion, keeping a list of the leaf nodes
of the exploration tree
At each step, replace each leaf node with its
neighbors and update their distance to the source
node
Stop when you see the target cell (vertex)
Block cells used by this wire and go on to the next
ECE 425 Intro. VLSI System Design Slide 25
Lee Algorithm Illustration
ECE 425 Intro. VLSI System Design Slide 26
Issues with the Lee Algorithm
Lee algorithm is serial
– Only looks at one connection at a time
– Early connections can block the path that a later connection
needs
Uses lots of memory (proportional to number of cells)
Long search time
Improvements made by others:
– Akers developed a coding method to reduce memory use
– Soukup developed modified version that is 10-50 times
faster, at the cost of not always finding shortest paths
– Many people have looked at parallel versions
Because of the long run-times and memory requirements,
maze routers are mainly used today to cleanup-route any wires
that more specialized routers couldn’t route
ECE 425 Intro. VLSI System Design Slide 27
Line-Probe Routers
First proposed by Mikami/Tabuchi and Hightower
(independently)
Model routing surface as a list of lines
Basic idea: project line-probes from starting point until you
encounter an obstruction
ECE 425 Intro. VLSI System Design Slide 28
Line-Probe Routers
ECE 425 Intro. VLSI System Design Slide 29
Line-Probe Routers
ECE 425 Intro. VLSI System Design Slide 30
Channel Routers
By requiring that the routing region be a rectangle, we can reduce
routing effort
A channel is a rectangular routing region with no obstacles and with
pins only on two parallel sides
We’ll represent channels as follows:
ECE 425 Intro. VLSI System Design Slide 31
Channel Routers
Within a channel, the routing problem now becomes
simpler: ensure that the number of wires in the
channel doesn’t exceed its capacity at any point
One simple algorithm: the left-edge algorithm
1. Imposes simplifying restrictions
2. Each net can only use one trunk section
3. Trunks must be in one routing layer, branches in
another
4. With these constraints, the algorithm determines
the minimum number of tracks required to route
a set of wires
ECE 425 Intro. VLSI System Design Slide 32
Left-Edge Algorithm
1. Sort trunks by their left-edge coordinate
2. Place trunks in sequence, using free space before adding tracks
3. Add branches to the placed trunks
ECE 425 Intro. VLSI System Design Slide 33
Left-Edge Algorithm
Has problems if there are multiple terminals at one horizontal position
– In PCB environment, could handle this by jogging one of the
conflicting segments
This doesn’t work in an IC environment, making the problem NP-
complete
– Practical Left-Edge algorithms use heuristics that don’t guarantee
finding the absolute minimum number of channels
ECE 425 Intro. VLSI System Design Slide 34
Comparing Routing
Algorithms
ECE 425 Intro. VLSI System Design Slide 35
Summary
Both placement and routing are important for
interconnect optimization
– Placement, global routing, detailed routing
– Placement has an important impact on routing
Both placement and routing are NP-hard problems
still with ongoing research
Different algorithms offer different tradeoffs in terms
of performance, runtime and memory usage.
ECE 425 Intro. VLSI System Design Slide 36