0% found this document useful (0 votes)
134 views24 pages

Placement and Partitioning

a document explaining partitioning and Placement in FPGA

Uploaded by

shreyach367
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF or read online on Scribd
0% found this document useful (0 votes)
134 views24 pages

Placement and Partitioning

a document explaining partitioning and Placement in FPGA

Uploaded by

shreyach367
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF or read online on Scribd
You are on page 1/ 24
PGA Architecture A Classical FFGA, contains three major parts: logic modules called configurable logic blocks, ClBs), routing resources, and input/output (I/O) cells . Q The logic modules consist of combinational and sequential circuits to implement desired logic functions Q The routing resources are composed of pre- ;J== Oat et eet ee) fabricated wire segments ond programmable fC) NCI] folod switches, where “the ‘wing “among ocic FC at at for modules and I/O cells is user-programmabie Co: iolosolololoo Iyfction of horizontal channel and 0 ovovoroyorod Quo vertical one Is referred to as a switch box §=0=0=0=0=0= DcL8 0 OLOFO LONG Oss /a chcuit is realized in an FPGA by partitioning JUL J ‘and packing its logic into’ individual logic i modules and. then interconnecting the (a) Classical FPGA ing to connect wire segments, which quires programmable switches “inside “a modules by wire segments with appropriately programmed switches ich box. = ALB mainly consists of flip-flops (FFs), lookup tables (LUTs), and multiplexers \UXs). he FF is used fo store the state | information of a circuit. Lookup Table (LUT) = A Lookup Table, is an actual table that generates an output based on the inputs. Here is an example for a lookup table that is implementing the function of apAND gate. A ‘onsider this table is stored in a small RAM. Inputs A and B are the address pins and C is the data pin. Every time your address pins are changing they are pointing at a different address entry and they are "reading out" the result which is 0 or | based on the inputs Output C FPGA CAD Flow (Circuit description (VHDL, schematic, ...)) Synthesize to logic blocks Place logic blocks in FPGA Route connections between logic blocks FPGA programming file Physical design Placement? @=Placement is the process of placing the cells,or searching appropriate location within the chip floor plan for each cell in the netlist. Place-and-route ® Place-and-route (P&R) describes processes where the netlist elements are physically places and mapped to the FPGA physical resources, fo create a file that can be downloaded in the FPGA chip. ™ P&R can take a few seconds for a small FPGA, or a few hours for a big one. = P&Ris always done by the FPGA software from the FPGA vendor, because FPGA vendors do not publish enough information about the intemals of their devices to allow any other company to create P&R software. = After synthesis and P&R, you have a binary file that is ready to be "downloaded" into the FPGA. > Design Implementation ® Partitioning:Breaking up the design to fit in to the CLBs[LUTs & FFs] ™ Placement:Choosing the right CLB based on its proximity to the |OBs,delay constraints,area constraints etc & minimize wires = Routing:Choice of Interconnectbased on the timing contraints Circuit Compilation I. Partitioning / Technology Mapping 2. Placement. Ooo Cet Assign a logical OO Ger LUT to a physical Cet Oo location. 3.Routing Select wire segments And switches for Interconnection. Place & route in Xilinx: example ™ After the logic circuit has been fully designed and merged into one circuit, it is translated into a special format that is understood by Xilinx CAD tools. This format is called the Xilinx Netlist Format, or XNF. = The XNF circuit is next partitioned into Xilinx Logic Cells. Partition is also called technology mapping. Technology mapping converts the XNF circuit, which is a netlist of basic logic gates, into a netlist of Xilinx Logic Cells. ® The Logic Cell used depends on which Xilinx product the circuit is to implemented in. The mapping procedure attempts to optimize the resulting circuit, either to minimize the total number of Logic Cells required, or the number of stages of Logic Cells in time-critical circuitry. »> FPGA placement FPGA placement Placement is the process of finding a suitable physical location for each cell in the block Placement and routing is an interconnecting stage in the design of PCB, integrated circuits, and FPGAs. Placement is a process that decides where to place all electronic components, circuitry, and logic elements in a generally limited amount of space. Then the following routing process decides the routes of all the wires needed to connect the placed components. Placement does not just place the standard cell available in the synthesized netlist, it also optimizes the design. oor Placement requires large area & results in performance degradation. Objective of Placement in FPGA: Minimize the required wiring (wire-length driven placement) [Congestion]. Q Balance the wiring density across the FPGA (routability-driven placement). O Maximize circuit speed (timing-driven placement). Minimize power dissipation O Minimize crosstalk between signals & net delays > An example of a good placement (left) versus a bad placement (right). In the good placement the wires are shorter and there is almost no congestion. In this example, the area of both placements is the same. Placement algorithms can be classified into three major categories: > Partitioning-based method[Min-cut Partitioning] >» Simulated annealing > Analytical placement. 1. PLACEMENT BY PARTITIONING Steps: Partitioning-based method[Min-cut Partitioning] . The given circuit is subcircuits. Meanwhile, the chip area is partitioned altemate e horizontal and vertical directions into subsections. Each subcircuit is assigned to a subsection. > The process is repeated until each subcircuit consists of a single logic block and has a unique location on the chip area. During partitioning, the number of nets that are cut by the partition is usually minimized based on the intuition that densely connected subcircuils should be placed closely v i v Partition Algorithms + A commonly used partition approach is the iterative improvement approach that makes local changes to an initial partition. K-means clustering . K-means reassigns each data in the dataset to only one of the new clusters formed + AKemighan and Lin (K-L) proposed a graph bisectioning algorithm which starts with a random initial partition and then uses pairwise swapping of vertices between partitions, until no improvement is possible. + A simple greedy algorithm will accept a move only if it provides immediate benefit, but this often leads to a local minimum from which the algorithm cannot escape. The K-L algorithm was designed to be able to escape from local minima Types of Min -cut algorithm 1.Rein-Cut Placement Algorithm: a i. Begin by cutting a series of cut lines around the entire chip. ii, Allow the chip to be partitioned into two blocks using the rst cut line. Additionally, divide the circuit into two sub- frcuits to reduce the net cut. Parifion all of the blocks intersected by the second cut line, and then partition the circuit in the same way. Carry on in this manner with all cut lines. We must partition blocks A and B generated by cl simultaneously while processing cut line C2 Block-Oriented Min-Cut Placement Algorithm: |. We choose a cut line to divide the chip into two regions in_ this lgorithm. Then, for each region, we choose a separate cut line and partition the regions further. . This procedure is repeated until each block has just one slot. Quadrature Placement Procedure: |. The partitioning procedure is carried out breadth-first, with alternate vertical and horizontal cuts in this algorithm. Il, Aregion is divided into two equal subregions / with each cut. Where there is a high routing density in the middle, this approach is appropriate. A, The congestion in the middle is minimized by first cutting through the center and minimizing the net cut. IV. This is the most famous cut line sequence for min-cut algorithms . Bisection Placement Procedure: The chip is bisected (divided into two equal subregions) by horizontal cut lines until each subregion consists of one row in this process. Each element is assigned to a row in this method, but its location is not fixed. Then each row is bisected again and again until each subregion has just one space 6a 5a 6b 4 6c 5b 6d Slice Bisection Procedure: |. Rows are sliced by horizontal cut lines. Il. This procedure is repeated until all of the modules have been allocated to a row. ll” The modules in each row are then bisected and allocated columns using vertical cut lines 10a 9a 10b 8 10c 9b 10d 1 OO RoR => Simulated Annealing The name came from heating and cooling metal to increase the size and quality of | its crystals to make it harder/stronger Find the best placement at low cost The basic idea of simulated annealing is to search for a configuration with low cost by iteratively moving from the current configuration to a neighbor configuration. If the cost of the neighbor configuration is lower than that of the current ration, the move will be taken. Otherwise (i.e., the move causes an increase cost), the move may still be taken with a probability that is decreasing over according to a cooling schedule. This probabilistic move helps the search cedure to get out of a local minimum. mulated annealing based algorithms are usually comparatively slow, especially for large problem instances. The main challenge of designing a simulated annealing based algorithm is to make it efficient without compromising the solution quality => Simulates annealing process for placement Dinitial placement -Random positions Jigsaw puzzles — Intuitive usage of Simulated Annealing O Perturb by block exchanging + Given jigsaw puzzle such that one has to cain the Dif Acost < 0, accept final shape using al pieces together: ‘Starting witha random Dothetwise, may still accept if cost eS atiguinioes oa eae incyéase is not too much - Brain mncondtonly remental improvements can be fend othe sokiion. ade by exchanging Seren reso ta ‘may or may not lead tothe solution are accepted or ‘jected with a certain small probability + The final shape is obtained as a result of a large number Moves which decrease cost are accepted-> of iterations. Just ike solvina a jiasaw /blocks displacing blocks etc. Simulated annealing based placement algorithm was popularized in the mid1980s by the TimberWolf place-and-route package . The TimberWolf standard-cell placement algorithm consists of two stages. Stage 19/dllows overlaps among cells and movement of cells between rows. Stage/2 eliminates all overlaps and only performs interchange of adjacent cells. The best algorithms for placement on FPGAs are based on Simulated Annealing However, Simulated Annealing is not scalable due to the long convergence time Analytical method Analytic placers make use of numerical technique to compute the “coordinates” of the logic blocks. Itis offen based on a quadratic wirelength objective. . Ifa connectivity matrix represents the connection between the logic blocks of the circuit, then minimizing the quadratic wirelength is to minimize the quadratic wirelength Numerical technique is well-suited for optimization of the above form of 3 Stages in FPGA Placement Modern FPGA placement generally consists of three major stages: global placement, legalization, and detailed placement. Global placement computes the best position for each module to minimize some cost metric (e.g.. wirelength), ignoring the module non-overlapping constraint. Legalization then removes module overlaps and places the modules into their corresponding sites to minimize the displacement from global placement etalled placement further refines the solution quality. FPGA-Specific Placement Issues Q The number of routing tracks in routing channels are fixed on a FPGA. Q A necessary condition for any feasible placement solution is the channel density in every channel cannot exceed the number of routing tracks available in the channel. Q In order to calculate the channel density accurately, some SA- based placement algorithms also perform global routing iteratively for every move. QA FPGA contains routing tracks of various lengths. Simple interconnection delay estimation model based on net length are not accurate enough for use in timing-driven FPGA placement algorithms. Q Fast and accurate interconnection delay computation methods are needed for timing driven FPGA placement

You might also like