advgpcgen is a tool that generates Generalized Parallel Counters (GPC) from scratch, which serve as the core of multi-input adders in Xilinx FPGAs.
Addition of multiple values is used in almost all arithmetic operations, such as multiplication and multiply-accumulate operations. In ASICs, the method of constructing multiplier trees from full adders as the basic elements has been known since the 1960s. However, full adders do not map well onto an FPGA's LUTs and carry logic, so this approach is not always efficient there. Therefore, methods have been proposed that use as basic elements either adders expanded to have more inputs and outputs (parallel counters) or adders whose inputs have weights other than 1 (2, 4, 8, ...). Such expanded adders are called generalized parallel counters (GPCs), and an adder tree built from GPCs is called a compressor tree.
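As a rough illustration of what a GPC computes, the following Python sketch sums input bits grouped by weight and derives the output width from the number of inputs per digit. The function names `gpc_value` and `output_width` are hypothetical and are not part of advgpcgen; the digit order (least significant place first) is assumed to match the `shape` field of the JSON described later.

```python
def gpc_value(columns):
    """Sum the input bits of a GPC, weighting column i by 2**i.

    columns[0] holds the bits of the least significant place.
    """
    return sum(sum(bits) << i for i, bits in enumerate(columns))


def output_width(shape):
    """Number of output bits needed when every input bit is 1.

    shape[i] is the number of input bits with weight 2**i.
    """
    max_value = sum(count << i for i, count in enumerate(shape))
    return max_value.bit_length()


# A full adder: three inputs of weight 1, two output bits.
print(gpc_value([[1, 1, 1]]), output_width([3]))  # 3 2

# A counter with 4, 3, 3 and 1 inputs at weights 1, 2, 4 and 8.
columns = [[1, 0, 1, 1], [0, 1, 1], [1, 0, 0], [1]]
print(gpc_value(columns), output_width([4, 3, 3, 1]))  # 19 5
```

The second case needs five output bits because the largest possible sum is 4 + 3*2 + 3*4 + 1*8 = 30.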
A GPC is represented as follows:
For example, a full adder is represented as
So far, three types of GPCs are known to be implementable in a single slice:
In this project, five new GPCs that are implementable in a single slice have been discovered:
The Verilog HDL implementations of the GPCs are located in the hdl directory.
They require LUT1~5, LUT6_2, and CARRY4 modules.
cargo build --release
solve <shape>
- Determines whether a GPC exists for the specified input shape and, if so, prints it as JSON.
$ cargo run --release --bin solve 1334 > gpc1334.json
Finished `release` profile [optimized] target(s) in 0.01s
Running `target/release/solve 1334`
$ cat gpc1334.json
{"shape":[4,3,3,1],"lut":[[[1,2,3],null,644245094496],[[1,2,4,5,6],null,8685059358021126272],[[4,5,6,8,9],7,1722882046844934120],[[4,5,6,8,9],10,6500312741898240]],"cin":0}
$
enum <width>
- Enumerates all GPCs of the given input width.
- The default width is 4.
$ cargo run --release --bin enum 2
...
[4, 7] total over
[5, 7] total over
[6, 7] total over
[7, 7] total over
max_feasibles
{"shape":[7,0],"lut":[[[2,3,4,5,6],1,7608434000728254870],[[2,3,4,5,6],null,1692930048736133120]],"cin":0}
{"shape":[5,1],"lut":[[[1,2,3,4],null,116092966049280],[[1,2,3,5],null,26285199910912]],"cin":0}
{"shape":[3,2],"lut":[[[1,2],null,25769803784],[[3,4],null,25769803784]],"cin":0}
{"shape":[1,3],"lut":[[[1],null,2],[[2,3],null,25769803784]],"cin":0}
min_infeasibles
$
script/codegen.py <JSON>
- Generates a Verilog HDL module and testbench from JSON (generated by enum or shape).
  - The testbench tries every input bit pattern.
- GPCs require modules that are logically equivalent to the intrinsics LUT1, LUT2, LUT3, LUT4, LUT5, LUT6_2 and CARRY4.
  - They are available in hdl/env/lut.v and hdl/env/carry.v.
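The exhaustive check can be pictured with a small Python reference model. This is only a sketch and not what codegen.py emits: the helper names are hypothetical, and the bit packing (digit 0 in the lowest bits of the pattern) is an assumption based on the concatenation order seen in the generated testbench.

```python
def unpack(shape, pattern):
    """Split a packed input word into per-digit bit lists, LSB digit first."""
    columns = []
    for count in shape:
        columns.append([(pattern >> k) & 1 for k in range(count)])
        pattern >>= count
    return columns


def expected_sum(shape, pattern):
    """Weighted sum that a correct GPC should output for this input pattern."""
    return sum(sum(bits) << i for i, bits in enumerate(unpack(shape, pattern)))


def all_cases(shape):
    """Every input pattern paired with its expected output."""
    n_inputs = sum(shape)
    return [(p, expected_sum(shape, p)) for p in range(1 << n_inputs)]


cases = all_cases([4, 3, 3, 1])
print(len(cases), cases[-1])  # 2048 (2047, 30)
```

For a shape with 11 input bits this yields 2048 patterns, ending with the all-ones pattern 0x7ff.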
$ script/codegen.py --help
usage: codegen.py [-h] [--test] [--avoidlsb7] source
positional arguments:
source A JSON file name of GPC specification.
options:
-h, --help show this help message and exit
--test, -t When this option represented, it generates a testbench for the represented JSON of GPC.
--avoidlsb7, -a When this option represented in the GPC generation mode (not in the testbench generation mode),
GPC that have 7 inputs at the least significant place and outputs of 4 digits or less, avoid
using LUTA to deal the lsb bits.
$ script/codegen.py gpc1334.json | tee gpc1334.v | tail
.CO(carryout[3:0]),
.O(out[3:0]),
.CYINIT(1'h0),
.CI(src0[0]),
.DI(gene[3:0]),
.S(prop[3:0])
);
assign dst = {carryout[3], out[3], out[2], out[1], out[0]};
endmodule
$ script/codegen.py gpc1334.json --test | tee gpc1334_test.v | tail
#1
{src3[0], src2[2], src2[1], src2[0], src1[2], src1[1], src1[0], src0[3], src0[2], src0[1], src0[0]} <= 11'h7fd;
#1
{src3[0], src2[2], src2[1], src2[0], src1[2], src1[1], src1[0], src0[3], src0[2], src0[1], src0[0]} <= 11'h7fe;
#1
{src3[0], src2[2], src2[1], src2[0], src1[2], src1[1], src1[0], src0[3], src0[2], src0[1], src0[0]} <= 11'h7ff;
#1
$finish();
end
endmodule
$ iverilog hdl/env/* gpc1334.v gpc1334_test.v -o gpc1334
$ ./gpc1334 | grep 'test:0'
$ # PASS
{
"shape":[4,3,3,1],
"lut":[
[[1,2,3],null,644245094496],
[[1,2,4,5,6],null,8685059358021126272],
[[4,5,6,8,9],7,1722882046844934120],
[[4,5,6,8,9],10,6500312741898240]
],
"cin":0
}
- The shape field represents the number of inputs of each digit.
  - The GPC represented by the JSON above has 4 input bits in the least significant place.
- The lut field represents the connection patterns from input bits to each LUT port and the values of the LUT truth tables.
  - This field consists of a list of LUT entries.
  - Each LUT entry is a tuple of 3 values:
    - Symmetric ports (I0 ~ I4) input wire indices.
    - Asymmetric port (I5) input wire index.
      - If this field is null, the I5 port is not used and is set to 0.
    - Value of the truth table memory.
      - The upper 32 bits represent the propagate side.
      - The lower 32 bits represent the generate side.
- The cin field represents the wire index provided to the CIN (CYINIT) of CARRY4.
  - If this field is null, CIN is not used and is set to 0.
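The fields described above can be read with a few lines of Python. This is a minimal sketch, assuming only the layout described in this section; the `decode` helper and its dictionary keys are hypothetical and are not part of script/codegen.py.

```python
import json

# The example specification shown above.
SPEC = """
{"shape":[4,3,3,1],
 "lut":[[[1,2,3],null,644245094496],
        [[1,2,4,5,6],null,8685059358021126272],
        [[4,5,6,8,9],7,1722882046844934120],
        [[4,5,6,8,9],10,6500312741898240]],
 "cin":0}
"""


def decode(spec_text):
    """Split each LUT entry into its ports and its two 32-bit truth tables."""
    spec = json.loads(spec_text)
    luts = []
    for symmetric, i5, table in spec["lut"]:
        luts.append({
            "symmetric": symmetric,           # wire indices for I0 ~ I4
            "i5": i5,                         # wire index for I5, or None
            "propagate": table >> 32,         # upper 32 bits
            "generate": table & 0xFFFFFFFF,   # lower 32 bits
        })
    return {"inputs": sum(spec["shape"]), "cin": spec["cin"], "luts": luts}


decoded = decode(SPEC)
for lut in decoded["luts"]:
    print(lut["symmetric"], lut["i5"],
          f'{lut["propagate"]:08x}', f'{lut["generate"]:08x}')
```

For the first LUT entry, the value 644245094496 splits into 0x96 on the propagate side and 0x60 on the generate side.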
- Mugi Noda
- GPLv3
  - This does not apply to code generated by enum, solve and script.py.
  - It also does not apply to the Verilog HDL and JSON code in the hdl directory.