We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF or read online on Scribd
You are on page 1/ 24
PGA Architecture
A Classical FFGA, contains three major parts:
logic modules called configurable logic
blocks, ClBs), routing resources, and
input/output (I/O) cells .
Q The logic modules consist of combinational
and sequential circuits to implement desired
logic functions
Q The routing resources are composed of pre- ;J== Oat et eet ee)
fabricated wire segments ond programmable fC) NCI] folod
switches, where “the ‘wing “among ocic FC at at for
modules and I/O cells is user-programmabie Co: iolosolololoo
Iyfction of horizontal channel and 0 ovovoroyorod Quo
vertical one Is referred to as a switch box
§=0=0=0=0=0= DcL8
0 OLOFO LONG Oss
/a chcuit is realized in an FPGA by partitioning JUL J
‘and packing its logic into’ individual logic i
modules and. then interconnecting the (a) Classical FPGA
ing to connect wire segments, which
quires programmable switches “inside “a
modules by wire segments with appropriately
programmed switches
ich box.= ALB mainly consists of flip-flops (FFs),
lookup tables (LUTs), and multiplexers
\UXs).
he FF is used fo store the state
| information of a circuit.
Lookup Table (LUT)
= A Lookup Table, is an actual table that
generates an output based on the
inputs. Here is an example for a lookup
table that is implementing the function
of apAND gate. A
‘onsider this table is stored in a small
RAM. Inputs A and B are the address
pins and C is the data pin. Every time
your address pins are changing they
are pointing at a different address
entry and they are "reading out" the
result which is 0 or | based on the
inputs
Output CFPGA CAD Flow
(Circuit description (VHDL, schematic, ...))
Synthesize to logic blocks
Place logic blocks in FPGA
Route connections between logic blocks
FPGA programming file
Physical designPlacement?
@=Placement is the process of
placing the cells,or searching
appropriate location within the
chip floor plan for each cell in
the netlist.Place-and-route
® Place-and-route (P&R) describes processes where the netlist elements are
physically places and mapped to the FPGA physical resources, fo create a
file that can be downloaded in the FPGA chip.
™ P&R can take a few seconds for a small FPGA, or a few hours for a big one.
= P&Ris always done by the FPGA software from the FPGA vendor, because
FPGA vendors do not publish enough information about the intemals of
their devices to allow any other company to create P&R software.
= After synthesis and P&R, you have a binary file that is ready to be
"downloaded" into the FPGA.> Design Implementation
® Partitioning:Breaking up the design to fit in to the
CLBs[LUTs & FFs]
™ Placement:Choosing the right CLB based on its
proximity to the |OBs,delay constraints,area
constraints etc & minimize wires
= Routing:Choice of Interconnectbased on the timing
contraintsCircuit Compilation
I. Partitioning / Technology Mapping
2. Placement.
Ooo
Cet Assign a logical
OO Ger LUT to a physical
Cet
Oo
location.
3.Routing
Select wire segments
And switches for
Interconnection.Place & route in Xilinx: example
™ After the logic circuit has been fully designed and merged into one circuit,
it is translated into a special format that is understood by Xilinx CAD tools.
This format is called the Xilinx Netlist Format, or XNF.
= The XNF circuit is next partitioned into Xilinx Logic Cells. Partition is also
called technology mapping. Technology mapping converts the XNF
circuit, which is a netlist of basic logic gates, into a netlist of Xilinx Logic
Cells.
® The Logic Cell used depends on which Xilinx product the circuit is to
implemented in. The mapping procedure attempts to optimize the
resulting circuit, either to minimize the total number of Logic Cells required,
or the number of stages of Logic Cells in time-critical circuitry.»> FPGA placement
FPGA placement Placement is the process of finding a suitable physical location for each
cell in the block
Placement and routing is an interconnecting stage in the design of PCB, integrated circuits, and
FPGAs. Placement is a process that decides where to place all electronic components, circuitry, and
logic elements in a generally limited amount of space. Then the following routing process decides the
routes of all the wires needed to connect the placed components.
Placement does not just place the standard cell available in the synthesized netlist, it also
optimizes the design.
oor Placement requires large area & results in performance degradation.
Objective of Placement in FPGA:
Minimize the required wiring (wire-length driven placement) [Congestion].
Q Balance the wiring density across the FPGA (routability-driven placement).
O Maximize circuit speed (timing-driven placement).
Minimize power dissipation
O Minimize crosstalk between signals & net delays> An example of a good placement (left) versus a bad placement (right). In
the good placement the wires are shorter and there is almost no
congestion. In this example, the area of both placements is the same.Placement algorithms can be classified into three
major categories:
> Partitioning-based method[Min-cut Partitioning]
>» Simulated annealing
> Analytical placement.1. PLACEMENT BY PARTITIONING
Steps: Partitioning-based method[Min-cut
Partitioning]
.
The given circuit is subcircuits.
Meanwhile, the chip area is partitioned altemate e horizontal
and vertical directions into subsections.
Each subcircuit is assigned to a subsection.
> The process is repeated until each subcircuit consists of a single
logic block and has a unique location on the chip area.
During partitioning, the number of nets that are cut by the partition
is usually minimized based on the intuition that densely connected
subcircuils should be placed closely
v
i
vPartition Algorithms
+ A commonly used partition approach is the iterative improvement
approach that makes local changes to an initial partition. K-means
clustering . K-means reassigns each data in the dataset to only one of
the new clusters formed
+ AKemighan and Lin (K-L) proposed a graph bisectioning algorithm which
starts with a random initial partition and then uses pairwise swapping of
vertices between partitions, until no improvement is possible.
+ A simple greedy algorithm will accept a move only if it provides
immediate benefit, but this often leads to a local minimum from which
the algorithm cannot escape.
The K-L algorithm was designed to be able to escape from local minimaTypes of Min -cut algorithm
1.Rein-Cut Placement Algorithm: a
i. Begin by cutting a series of cut lines around the entire
chip.
ii, Allow the chip to be partitioned into two blocks using the
rst cut line. Additionally, divide the circuit into two sub-
frcuits to reduce the net cut. Parifion all of the blocks
intersected by the second cut line, and then partition the
circuit in the same way.
Carry on in this manner with all cut lines.
We must partition blocks A and B generated by cl
simultaneously while processing cut line C2Block-Oriented Min-Cut Placement
Algorithm:
|. We choose a cut line to divide the
chip into two regions in_ this
lgorithm.
Then, for each region, we choose a
separate cut line and partition the
regions further.
. This procedure is repeated until
each block has just one slot.Quadrature Placement Procedure:
|. The partitioning procedure is carried out
breadth-first, with alternate vertical and
horizontal cuts in this algorithm.
Il, Aregion is divided into two equal subregions
/ with each cut. Where there is a high routing
density in the middle, this approach is
appropriate.
A,
The congestion in the middle is minimized by
first cutting through the center and minimizing
the net cut.
IV. This is the most famous cut line sequence for
min-cut algorithms .Bisection Placement Procedure:
The chip is bisected (divided into two equal
subregions) by horizontal cut lines until
each subregion consists of one row in this
process.
Each element is assigned to a row in this
method, but its location is not fixed.
Then each row is bisected again and again
until each subregion has just one space
6a 5a 6b 4
6c 5b 6dSlice Bisection Procedure:
|. Rows are sliced by horizontal cut
lines.
Il. This procedure is repeated until all
of the modules have been
allocated to a row.
ll” The modules in each row are then
bisected and allocated
columns using vertical cut lines
10a 9a 10b 8 10c 9b 10d
1 OO RoR=> Simulated Annealing
The name came from heating and cooling metal to increase the size and quality of
| its crystals to make it harder/stronger
Find the best placement at low cost
The basic idea of simulated annealing is to search for a configuration with low cost
by iteratively moving from the current configuration to a neighbor configuration. If
the cost of the neighbor configuration is lower than that of the current
ration, the move will be taken. Otherwise (i.e., the move causes an increase
cost), the move may still be taken with a probability that is decreasing over
according to a cooling schedule. This probabilistic move helps the search
cedure to get out of a local minimum.
mulated annealing based algorithms are usually comparatively slow, especially
for large problem instances. The main challenge of designing a simulated
annealing based algorithm is to make it efficient without compromising the solution
quality=> Simulates annealing process for placement
Dinitial placement -Random
positions Jigsaw puzzles — Intuitive usage of Simulated
Annealing
O Perturb by block exchanging
+ Given jigsaw puzzle such
that one has to cain the
Dif Acost < 0, accept final shape using al pieces
together:
‘Starting witha random
Dothetwise, may still accept if cost eS atiguinioes oa eae
incyéase is not too much - Brain mncondtonly
remental improvements can be fend othe sokiion.
ade by exchanging Seren reso ta
‘may or may not lead tothe
solution are accepted or
‘jected with a certain small
probability
+ The final shape is obtained
as a result of a large number
Moves which decrease cost are accepted-> of iterations.
Just ike solvina a jiasaw
/blocks displacing blocks etc.Simulated annealing based placement algorithm was popularized in the mid1980s by the
TimberWolf place-and-route package .
The TimberWolf standard-cell placement algorithm consists of two stages.
Stage 19/dllows overlaps among cells and movement of cells between rows.
Stage/2 eliminates all overlaps and only performs interchange of adjacent cells.
The best algorithms for placement on FPGAs are based on Simulated Annealing
However, Simulated Annealing is not scalable due to the long convergence timeAnalytical method
Analytic placers make use of numerical technique to compute the
“coordinates” of the logic blocks. Itis offen based on a quadratic wirelength
objective.
. Ifa connectivity matrix represents the connection between the logic blocks of
the circuit, then minimizing the quadratic wirelength is to minimize the
quadratic wirelength
Numerical technique is well-suited for optimization of the above form of3 Stages in FPGA Placement
Modern FPGA placement generally consists of three major stages: global
placement, legalization, and detailed placement.
Global placement computes the best position for each module to minimize some
cost metric (e.g.. wirelength), ignoring the module non-overlapping constraint.
Legalization then removes module overlaps and places the modules into their
corresponding sites to minimize the displacement from global placement
etalled placement further refines the solution quality.FPGA-Specific Placement Issues
Q The number of routing tracks in routing channels are fixed on a
FPGA.
Q A necessary condition for any feasible placement solution is the
channel density in every channel cannot exceed the number of
routing tracks available in the channel.
Q In order to calculate the channel density accurately, some SA-
based placement algorithms also perform global routing
iteratively for every move.
QA FPGA contains routing tracks of various lengths. Simple
interconnection delay estimation model based on net length are
not accurate enough for use in timing-driven FPGA placement
algorithms.
Q Fast and accurate interconnection delay computation methods
are needed for timing driven FPGA placement