Synthesis I
ROHIT KHANNA
Advanced Computing Training School (ACTS)
C-DAC, Pune
2015 Centre for Development of Advanced Computing
Synthesis
HDL Design Flow
HDL CODE
HDL Simulation
SYNTHESIS
Gate Level Simulation
PLACE & ROUTE
Post Layout Simulation
HDL Simulation
Test Inputs are applied to HDL code
module gate (input a , b,
output c);
assign c = a & b;
endmodule
[Figure: test bench applying inputs to the HDL module and observing output C]
Gate Level Simulation
Test inputs are applied to the gate-level netlist.
[Figure: test bench driving the gate-level netlist (inputs A, B; output C)]
Post Layout Simulation
Test inputs are applied to the logic on the device, with delays.
[Figure: test bench driving a LUT (inputs A, B; output C) with interconnect + logic delay]
Comparing Synthesis and Simulation Results
Timing Statements
HDLs were initially meant for documentation and simulation, so many simulation constructs are not supported by synthesis.
Wait for statement (VHDL)
This statement halts the execution of a process for a specified time period.
This construct is not supported by synthesis tools.
wait for 10 ns;
When it is used in code meant for synthesis, the tool reports an error.
Timing Statements
After clause (VHDL) or # delay (Verilog)
This construct delays an assignment by a specified time.
It is ignored by synthesis tools.
a <= b after 10 ns;   (VHDL)
or
#10 a = b;            (Verilog)
HDL simulation and gate-level simulation results will not match.
If-else and Case
Synthesis tools support both statements, but implement them differently.
Simulation results are identical for both.
An if-else statement generates a priority-based structure.
A case statement generates a parallel structure.
When an if-else ladder is large, it can result in slower circuits.
So the simulation results match, but the synthesized structures do not.
If-else and Case
If-else
always @ (*)
  if (sel == 2'b00)
    op = a;
  else if (sel == 2'b01)
    op = b;
  else if (sel == 2'b10)
    op = c;
  else
    op = d;
case
always @ (*)
  case (sel)
    2'b00   : op = a;
    2'b01   : op = b;
    2'b10   : op = c;
    default : op = d;
  endcase
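If-else and Case
[Figure: the if-else ladder becomes a chain of three 2:1 muxes controlled by Sel=10, Sel=01 and Sel=00, with inputs d, c, b and a entering along the chain and op at the end; the case statement becomes a single mux with inputs a, b, c, d, select sel and output op]
If-else: the input-to-output delay depends upon the sel line.
Case: the input-to-output delay is the same for every input.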
Initial Value
Don't assign initial values, as they are ignored by synthesis tools (as per synthesis standards).
This is true for an ASIC. Synthesis tools for FPGAs may not ignore initial values.
The functionality of the simulated design may not match that of the synthesized design.
Example
VHDL
signal a : integer := 7;
Verilog
reg [3:0] a = 4'b1011;
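One common way to avoid relying on an initializer is to give the register its start value through an explicit reset. The following is only a minimal sketch under stated assumptions: a synchronous, active-high reset is assumed, and the module and port names (init_by_reset, rst, load, din) are hypothetical and do not come from the slides.

```verilog
// Sketch: define the start value with a reset instead of an initializer.
// All names here are illustrative assumptions.
module init_by_reset (
  input            clk,
  input            rst,    // synchronous, active-high (assumed)
  input            load,
  input      [3:0] din,
  output reg [3:0] a
);
  always @ (posedge clk)
    if (rst)
      a <= 4'b1011;        // same start value as the initializer example
    else if (load)
      a <= din;
endmodule
```

Because the start value is now part of the logic, simulation and synthesis see the same behaviour on both ASIC and FPGA targets, at the cost of a reset input.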
Sensitivity List
An incomplete sensitivity list results in a mismatch between synthesis and simulation results.
The sensitivity list is ignored by synthesis tools but not by simulation tools.
Example
always @ (sel)          // incomplete: a and b are missing
  if (sel == 1)
    op = a;
  else
    op = b;
process (sel) begin     -- incomplete: a and b are missing
  if (sel = '1') then
    op <= a;
  else
    op <= b;
  end if;
end process;
Result
Synthesis: MUX
Simulation: won't be a MUX, because op is not updated when a or b changes
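For comparison, here is a sketch of the same multiplexer with a complete sensitivity list, so that simulation follows the synthesized MUX. The module name mux2 is hypothetical.

```verilog
// Sketch: every signal read inside the block appears in the list.
// Writing always @ (*) has the same effect.
module mux2 (input sel, a, b, output reg op);
  always @ (sel or a or b)
    if (sel == 1)
      op = a;
    else
      op = b;
endmodule
```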
Coding Guidelines
Use shorthand expressions
Long form (VHDL)
process (A, B)
begin
  C(3) <= A(3) and B(3);
  C(2) <= A(2) and B(2);
  C(1) <= A(1) and B(1);
  C(0) <= A(0) and B(0);
end process;
Long form (Verilog)
and (C[3], A[3], B[3]);
and (C[2], A[2], B[2]);
and (C[1], A[1], B[1]);
and (C[0], A[0], B[0]);
Shorthand (VHDL)
process (A, B)
begin
  for i in 0 to 3 loop
    C(i) <= A(i) and B(i);
  end loop;
end process;
Shorthand (Verilog)
and g [3:0] (C, A, B);   // array of instances; an instance name (here g) is required
Avoid use of Buffer port
Buffer ports are used when you want to read your output port.
Buffer ports are not considered good design practice.
Instead, create an internal signal that is fed back to the logic and assigned to the output port.
Avoid use of Buffer port
Using a buffer port
entity xor_feed is
  port(a : in std_logic;
       b : buffer std_logic);
end entity;
architecture ach of xor_feed is
begin
  b <= a xor b;
end architecture;
Using an internal signal
entity xor_feed is
  port(a : in std_logic;
       b : out std_logic);
end entity;
architecture ach of xor_feed is
  signal temp : std_logic;
begin
  temp <= a xor temp;
  b <= temp;
end architecture;
Unnecessary loop calculation
Avoid placing non-changing expressions inside a loop.
This prevents the tool from spending extra time optimizing redundant logic.
Example
Before
for (i = 0; i < 5; i = i + 1)
begin
  a = b;               // unchanging expression
  data[i] = din[i];
end
After
a = b;
for (i = 0; i < 5; i = i + 1)
begin
  data[i] = din[i];
end
Optimizing Arithmetic Expressions
Synthesis tools try to rearrange an expression in order to achieve an optimized implementation.
There are three types of arithmetic optimization:
Merging Cascaded Adders
Arranging Expression Trees
Sharing Common Expressions
Merging Cascaded Adders
Assume that the design has two cascaded adders and one of the inputs is a single bit.
module add ( input c,
input [1:0] a, b,
output [2:0] z);
wire [2:0] t;
assign t= a + b;
assign z= t + c;
endmodule
Merging Cascaded Adders
In such a case the tool optimizes the design to one adder with a carry input (full adder).
assign t = a + b;
assign z = t + c;
or
assign t = a + c;
assign z = t + b;
or
assign z = a + b + c;
[Figure: a single adder with C as the carry input, producing Z]
Determine Number of Adders
Apply the Merging Cascaded Adders concept and determine the number of adders required.
module add ( input [1:0] a, b, d,
output [3:0] z);
wire [2:0] t;
assign t= a + d;
assign z= t + b;
endmodule
Answer: 2 adders, since there is no 1-bit input.
Arranging Expression Trees
Tools try to rearrange expressions to meet speed requirements by minimizing delay.
The adders should be arranged depending upon the arrival time of each signal.
module add ( input [1:0] c, d,
input [1:0] a, b,
output [3:0] z);
assign z= a + b + c + d;
endmodule
Arranging Expression Trees
The tool treats the code as if brackets were present in the HDL code, as given below:
Z = a + b + c + d;
Z = ( (a + b ) + c ) + d;
[Figure: three adders in series, 5 ns each, producing Z]
Arranging Expression Trees
The adder arrangement can be modified to achieve minimum delay by applying brackets according to the arrival time of the signals.
z = a + b + c + d;   Assume that a, b, c and d arrive at the same time.
In this case the delay can be reduced by replacing the 3-stage adder with a 2-stage adder.
This can be achieved by applying brackets wisely.
Arranging Expression Trees
Z = (a + b ) + (c + d);
[Figure: (a + b) and (c + d) computed in parallel, 5 ns each, followed by a final 5 ns adder producing Z]
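The two bracketings can be written side by side as a sketch. The module names add_chain and add_tree are hypothetical, and the 2-bit inputs and 4-bit output are borrowed from the earlier four-input adder example; whether a tool preserves the bracketing depends on its settings.

```verilog
// Three adders in series: ((a + b) + c) + d
module add_chain (input [1:0] a, b, c, d, output [3:0] z);
  assign z = ((a + b) + c) + d;
endmodule

// Two adder stages: (a + b) and (c + d) side by side, then one final adder
module add_tree (input [1:0] a, b, c, d, output [3:0] z);
  assign z = (a + b) + (c + d);
endmodule
```

Both modules compute the same sum; only the depth of the adder structure differs (three stages against two).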
Arranging Expression Trees
Z = a + b + c + d;
Assume that a, b and d arrive at the same time and c is the last one to arrive.
In this case c should be added last.
Rearranging:
Z = a + b + d + c;
Ideally a, b and d would be added at the same time, but this is not possible since an adder has only two inputs.
Z = ( ( a + b ) + d ) + c;
Arranging Expression Trees
Z = ( ( a + b ) + d ) + c;
or
Z = ( ( d + b ) + a ) + c;
or
Z = ( ( d + a ) + b ) + c;
[Figure: chain of three 5 ns adders producing Z, with the late-arriving input C feeding the last adder]
Arranging Expression Trees
Arrangement considering overflow in adders
Example 1
module add ( input  [5:0] c,      // 6 bits
             input  [3:0] a, b,   // 4 bits
             output [6:0] z);     // 7 bits
  wire [3:0] t;                   // wire declared to store the result of a + b
  assign t = a + b;               // 5-bit result truncated to 4 bits
  assign z = t + c;
endmodule
Arranging Expression Trees
Synthesis Outcome
[Figure: A [4 bits] + B [4 bits] gives T [4 bits]; T + C [6 bits] gives Z [7 bits]]
Arranging Expression Trees
Example 2
module add ( input  [5:0] c,      // 6 bits
             input  [3:0] a, b,   // 4 bits
             output [6:0] z);     // 7 bits
  assign z = a + b + c;           // no temporary variable declared
endmodule
The tool understands that the sum of a and b may produce a 5-bit result, so it automatically uses a 5-bit temporary variable.
Arranging Expression Trees
Synthesis Outcome. The addition result is different from Example 1.
[Figure: A [4 bits] + B [4 bits] gives T [5 bits]; T + C [6 bits] gives Z [7 bits]]
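The difference between Example 1 and Example 2 can be checked with a small sketch. The module names add_trunc and add_full are hypothetical; the port widths follow the two examples.

```verilog
// Example 1 style: the 4-bit wire t drops the carry of a + b
module add_trunc (input [5:0] c, input [3:0] a, b, output [6:0] z);
  wire [3:0] t;
  assign t = a + b;
  assign z = t + c;
endmodule

// Example 2 style: no intermediate wire, so the carry is kept
module add_full (input [5:0] c, input [3:0] a, b, output [6:0] z);
  assign z = a + b + c;
endmodule
```

With a = 15, b = 1 and c = 0, add_trunc produces 0 (the carry out of a + b is lost in t), while add_full produces 16.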
Arranging Expression Trees
Example 3
module add ( input  [5:0] c,      // 6 bits
             input  [3:0] a, b,   // 4 bits
             output [6:0] z);     // 7 bits
  assign z = a + b + c;
endmodule
Assume A is a late-arriving signal, and B and C arrive at the same time.
Arranging Expression Trees
Synthesis Outcome. The addition result is the same as in Example 2.
Z = (b + c) + a;
[Figure: B [4 bits] + C [6 bits] gives T [7 bits]; T + A [4 bits] gives Z [7 bits]]
Sharing Common Expressions
If the same expression appears in more than one equation, the user will want to share the expression to reduce area.
One option is to assign the common expression to a temporary variable.
Example
Z = a + b + c;
Y = a + c + d;
a + c is a common expression.
temp = a + c;
Z = temp + b;
Y = temp + d;
Sharing Common Expressions
Tools are intelligent enough to identify a common expression and share it automatically if it is specified in the same order.
Example
Z = a + b + c;
Y = a + b + d;
The tool uses a common adder for a + b.
Z = a + b + c;
Y = d + a + b;
The tool uses different adders for a + b.
Sharing Common Expressions
How to avoid such a problem?
Solution: use brackets (parentheses) around the common expression.
Z = ( a + b ) + c;
Y = d + ( a + b );
The tool uses the same adder for a + b.
Sharing Common Expressions
In some cases sharing a common expression may result in more logic resources.
Example
if (sel1)
  Y <= a + b;
else
  Y <= c + d;
if (sel2)
  Z <= e + f;
else
  Z <= a + b;
Sharing enabled: three adders required (a + b, c + d, e + f).
Sharing disabled: two adders required (one computing a + b or c + d, the other computing a + b or e + f).
Sharing Common Expressions
Synthesis result when sharing is enabled
[Figure: adder outputs feeding two multiplexers controlled by the select lines]
Sharing Common Expressions
Constant Propagation
If the outcome of an expression is a constant, no logic is generated for it.
The constant value is provided directly.
Example
input  [31:0] c;
output [31:0] d;
integer a = 3;
wire [31:0] b;
assign b = a + 2;    // constant result: logic not generated
assign d = b + c;    // adder
Sharing Common Expressions
Logic resources required by an adder
[Figure: adder with carry input Cin and outputs Sum and Carry]
Gate count = 5
Sharing Common Expressions
Logic resources required by a multiplexer
[Figure: mux with data inputs D0 and D1 and select S]
Gate count = 4
Conclusion: adders require more area and hence are costly.
Sharing Common Expressions
Sharing Complex Operators
assign z = x ? (a + b) : (c + d);
[Figure: two adders, a + b and c + d, feeding a mux selected by X that produces Z]
The output is either (a + b) or (c + d), so only one addition is needed at a time. The other adder is still present and active, which costs additional area and power.
Sharing Common Expressions
Sharing Complex Operators
assign t1 = x ? a : c;
assign t2 = x ? b : d;
assign z  = t1 + t2;
[Figure: two muxes selected by x produce t1 and t2, which feed a single adder producing Z]
This results in a reduction of cost, area and power requirements.
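As a sketch, the two forms can be wrapped in modules and compared. The module names and the 4-bit operand and 5-bit result widths are assumptions made for illustration; the slides do not give widths.

```verilog
// Two adders followed by a mux
module sel_after_add (input x, input [3:0] a, b, c, d, output [4:0] z);
  assign z = x ? (a + b) : (c + d);
endmodule

// Two muxes followed by a single adder
module sel_before_add (input x, input [3:0] a, b, c, d, output [4:0] z);
  wire [3:0] t1, t2;
  assign t1 = x ? a : c;
  assign t2 = x ? b : d;
  assign z  = t1 + t2;
endmodule
```

Both modules produce the same z for every input combination; the second trades one adder for one extra multiplexer.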
Signal and Variable
Signals / non-blocking assignments update at the end of the process / always block, whereas variables / blocking assignments update immediately.
Example 1
process (a, b, c, d)
begin
  d <= a;
  x <= c xor d;
  d <= b;          -- overrides d <= a
  y <= c xor d;
end process;
always @ (*)
begin
  d <= a;
  x <= c ^ d;
  d <= b;          // overrides d <= a
  y <= c ^ d;
end
Signal and Variable
Synthesis Result
[Figure: a single XOR gate with inputs B and C driving the output Y; input A is shown but not used by the gate]
Signal and Variable
Example 2
process (a, b, c)
variable d : std_logic;
begin
d := a;
x <= c xor d;
d := b;
y <= c xor d;
end process;
always @ (*)
begin
d = a;
x <= c ^ d;
d = b;
y <= c ^ d;
end
Signal and Variable
Synthesis Result
[Figure: two XOR gates, one using A and one using the reassigned value of d]
High Performance Coding Techniques
Multiplexer using Tristate
In the Spartan family (XC4000), a 4:1 multiplexer can be implemented using a single CLB.
If the user wants to implement a 16:1 multiplexer, how many CLBs are required? Five: four 4:1 multiplexers for the inputs plus one to select among their outputs.
As the number of CLBs increases, area requirements and delay increase.
Xilinx recommends using internal tristate buffers to implement a multiplexer that requires more than one CLB.
Multiplexer using Tristate
One Hot Encoded
assign out = sel[0] ? a : 1'bz;
assign out = sel[1] ? b : 1'bz;
assign out = sel[2] ? c : 1'bz;
assign out = sel[3] ? d : 1'bz;
assign out = sel[4] ? e : 1'bz;
[Figure: five tristate buffers enabled by Sel[0] to Sel[4], driving the shared line out]
Multiplexer using Tristate
Binary Encoded
assign out = (sel == 2'b00) ? a : 1'bz;
assign out = (sel == 2'b01) ? b : 1'bz;
assign out = (sel == 2'b10) ? c : 1'bz;
assign out = (sel == 2'b11) ? d : 1'bz;
[Figure: four tristate buffers enabled by Sel=00, Sel=01, Sel=10 and Sel=11, driving the shared line out]
I->O delay will remain the same if the number of inputs is increased.
Multiplexer using CLB
assign out = sel[1] ? (sel[0] ? d : c) : (sel[0] ? b : a);
[Figure: two CLBs select between A/B and between C/D using S[0]; a third CLB selects between their outputs using S[1]]
I->O delay will increase if the number of inputs is increased.
Multiplexer using Tristate
Features of a mux using tristate buffers
Selection lines can be one-hot encoded or binary encoded.
The number of inputs can be any number, limited only by the number of internal tristate buffers (5:1 mux, 9:1 mux, etc.).
CLBs become available for placing other relevant logic.
The size of the multiplexer has minimal effect on area and delay.
Multiplexer using Tristate
Issues with using tristate buffers
If more than one tristate buffer is ON at a time, the result is a multiple-driver condition, which increases power consumption and reduces chip reliability.
Since the use of tristate buffers is specific to some device families, porting the design becomes difficult.
The designer has to take care that the bus is not left floating; a weak keeper has to be added.
Pipelining
Pipelining is an approach in which long combinational paths are broken by placing flip-flops in between.
A pipelined design tends to increase operating frequency at the expense of more logic resources (area).
Without Pipelining
module addition (input clk, a, b, c, d, e, f, g, h, output reg result);
  always @ (posedge clk)
  begin
    result <= ((a | b) & (c ^ d)) ^ ((e ~^ f) | (g & h));
  end
endmodule
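Pipelining
Synthesis Results
[Figure: unpipelined logic: OR (A, B) and XOR (C, D) feed an AND; XNOR (E, F) and AND (G, H) feed an OR; both results feed a final XOR into a flip-flop clocked by CLK; annotated gate delays range from 5 ns to 7 ns]
Operating Frequency = 1/(20 ns) = 50 MHz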
Pipelining
With Pipelining
module addition (input clk, a, b, c, d, e, f, g, h, output reg result);
  reg temp1, temp2, temp3, temp4, temp5, temp6;   // pipeline registers
  always @ (posedge clk)
  begin
    temp1  <= a | b;
    temp2  <= c ^ d;
    temp3  <= e ~^ f;
    temp4  <= g & h;
    temp5  <= temp1 & temp2;
    temp6  <= temp3 | temp4;
    result <= temp5 ^ temp6;
  end
endmodule
Pipelining
Synthesis Results
[Figure: the same logic with a flip-flop after every gate, all clocked by CLK; annotated gate delays range from 5 ns to 7 ns]
Operating Frequency = 1/(7 ns), approximately 143 MHz
Reduce Complex Operation
Arithmetic and relational operators require more logic resources and hence are expensive.
Avoid them where possible in order to achieve better resource utilization.
Example 1
reg [15:0] count;
always @ (posedge clk)
begin
  count = count + 1;
  if (count > 16'b1010_1001_1011_0110)   // relational operator
    count = 0;
end
Number of 4-input LUTs: 60 out of 21504
Reduce Complex Operation
Example 2
reg [15:0] count;
always @ (posedge clk)
begin
  count = count + 1;
  if (count == 16'b1010_1001_1011_0111)  // equality operator
    count = 0;
end
Number of 4-input LUTs: 54 out of 21504
== requires less logic than >.
The tool does not understand that count will never go beyond 1010_1001_1011_0110.
Constant Propagation
Constants are pushed into the logic to reduce area requirements.
CoolRunner-II CPLDs
Example 1
[Figure: MUX with a constant 1 and signals B and C]
Constant Propagation
[Figure: Example 1 after constant propagation, showing the constant 1, input B and output Z]
Constant Propagation
Example 2: Determine the optimized outcome of this mux.
[Figure: MUX with a constant 0 on one input and A on another]
Constant Propagation
[Figure: optimized logic with signals A, X and Y and output Z]
Instead of using an optimized mux from the library, the logic is created out of discrete gates.
To prevent this, we use a no-boundary-optimization attribute so that timing is not affected.
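The effect of a constant on a mux input can be sketched as below. This is not a reconstruction of the slide figures: which mux input carries the constant is an assumption, and all module names are hypothetical.

```verilog
// Mux with constant 1 on the input chosen when sel is high (assumed wiring)
module mux_const1 (input sel, b, output z);
  assign z = sel ? 1'b1 : b;
endmodule

// What constant propagation can reduce it to: a single OR gate
module mux_const1_opt (input sel, b, output z);
  assign z = sel | b;
endmodule

// Mux with constant 0 on the input chosen when sel is low (assumed wiring)
module mux_const0 (input sel, a, output z);
  assign z = sel ? a : 1'b0;
endmodule

// What constant propagation can reduce it to: a single AND gate
module mux_const0_opt (input sel, a, output z);
  assign z = sel & a;
endmodule
```

In each pair the two modules have the same truth table, so the multiplexer can be replaced by one discrete gate.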
Register Duplication
Register duplication is an approach to reduce fan-out and improve timing.
If register duplication is disabled and a register drives a large number of loads, the timing of the system may suffer.
To achieve better timing, it is recommended to enable register duplication, which distributes the load.
Register Duplication
Example 1
Register Duplication disabled
module reg_duplicate1(input clk, din, output [63:0] x);
reg q;
assign x={64{q}};
always @ (posedge clk)
begin
q<=din;
end
endmodule
Register Duplication
Synthesis Results
[Figure: a single flip-flop, Reg A, with inputs din and clk, drives all outputs X[0] to X[63]]
Reg A drives all 64 loads.
Imagine the amount of delay between x[0] and x[63].
Because of this delay, timing will suffer.
Register Duplication
Example 2
Register Duplication enabled
module reg_duplicate2(input clk, din, output [63:0] x);
reg q;
assign x={64{q}};
always @ (posedge clk)
begin
q<=din;
end
endmodule
Register Duplication
Synthesis Result 1
[Figure: two flip-flops with the same din and clk: Reg A drives X[0] to X[31] and Reg B drives X[32] to X[63]]
Loads are distributed between Reg A and Reg B.
Delay is lower than in Example 1.
This is the ideal case of register duplication.
Register Duplication
Synthesis Result 2
[Figure: several flip-flops sharing din and clk, each driving part of the outputs X[0] to X[63]]
In an FPGA the number of flip-flops is high.
The tool will try to place a flip-flop at each and every load.
In this case the timing results achieved would be the best.
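The duplication can also be written by hand. The following is a sketch only: the module name reg_duplicate_manual and the register names qa and qb are hypothetical, and the even 32/32 split mirrors Synthesis Result 1.

```verilog
// Sketch: duplicate the register manually so each copy drives 32 loads
module reg_duplicate_manual (input clk, din, output [63:0] x);
  reg qa, qb;                        // two copies of the same register
  assign x = {{32{qb}}, {32{qa}}};   // qa drives x[31:0], qb drives x[63:32]
  always @ (posedge clk)
  begin
    qa <= din;
    qb <= din;
  end
endmodule
```

A synthesis tool may merge the two identical registers again unless its equivalent register removal option is turned off, so the tool setting and the code have to agree.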
Operator in If Statement
If any signal in the conditional expression is a late-arriving signal, it should be moved closer to the output.
Example 1
process (A, B, C, D)
begin
if (A + B < 24) then
Z<=C;
else
Z<=D;
end if;
end process;
always @ (*)
begin
  if (A + B < 24)
    Z <= C;
  else
    Z <= D;
end
Operator in If Statement
Synthesis Result
[Figure: an adder (ADD) feeds a comparator (COMP) against the constant 24; the comparator output selects between C and D in a MUX]
Operator in If Statement
Assume that A is a late-arriving signal. In that case, perform the calculation on the other operands first and then compare the result with A.
Example 2
process (A, B, C, D)
begin
if (A < 24 - B) then
Z<=C;
else
Z<=D;
end if;
end process;
always @ (*)
begin
  if (A < 24 - B)
    Z <= C;
  else
    Z <= D;
end
Operator in If Statement
Synthesis Result
[Figure: a subtractor (SUB) computes 24 - B; a comparator (COMP) compares the result with A; the comparator output selects between C and D in a MUX]
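The two conditions can be compared with a short sketch. The module names and the 8-bit unsigned operand widths are assumptions for illustration; the slides do not give the widths of A and B.

```verilog
// Original form: A passes through the adder and then the comparator
module cmp_sum (input [7:0] a, b, input c, d, output z);
  assign z = (a + b < 24) ? c : d;
endmodule

// Rearranged form: 24 - b is computed first, a only meets the comparator
module cmp_late_a (input [7:0] a, b, input c, d, output z);
  assign z = (a < 24 - b) ? c : d;
endmodule
```

With unsigned operands the two forms agree only while 24 - b does not wrap around, that is for b <= 24. For example, with a = 0 and b = 25 the first condition is false but the second is true, so the rearrangement is safe only when the range of B is known.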
Fan-out Control
The larger the fan-out, the greater the delay.
If the fan-out is beyond the limit, the tool adds buffers, which further reduce the speed.
The code has to be modified so that the load on any one signal is reduced, which results in better timing.
Fan-out Control
Example 1
(* max_fanout = "2" *) reg a;
assign e=a ? 1 : 0;
always @ (posedge clk)
a= b & c;
always @ (*)
begin
d= a ^ b;
f= (a ^ c) | (a & b);
end
Fan-out Control
Synthesis Result (Spartan 3)
Register Duplication: No
Fan-out Control
Example 2
(* max_fanout = "2" *) reg a, a1;
assign e=a1 ? 1 : 0;
always @ (posedge clk)
begin
a = b & c;
a1= b & c;
end
always @ (*)
begin
d= a ^ b;
f= (a ^ c) | (a1 & b);
end
Fan-out Control
Synthesis Result (Spartan 3)
Register Duplication: No
Equivalent register removal: No
Data path Duplication
module path_dup (
  input  [7:0]  a, b,
  input  [15:0] address,
  input         control,
  output reg [15:0] out);
  parameter [7:0] base1 = 8'h35;
  parameter [7:0] base2 = 8'h47;
  reg [7:0]  temp, c;
  reg [15:0] value;
  always @ (*)
  begin
    if (control == 1)
      temp = a;
    else
      temp = b;
    c     = base1 - temp;
    // operator lost in the source; subtraction assumed from the SUB block in the diagram
    value = address - {8'b0, c};
    out   = value + base2;
  end
endmodule
Data path Duplication
[Figure: Control selects the MUX output Temp; Base1 and Temp feed a SUB; Address and that result feed a second SUB giving Value; Value and Base 2 feed an ADD giving Out]
Now assume that control is a late-arriving signal. To meet timing requirements, control has to be moved closer to out.
Data path Duplication
module path_dup (
  input  [7:0]  a, b,
  input  [15:0] address,
  input         control,
  output reg [15:0] out);
  parameter [7:0] base1 = 8'h35;
  parameter [7:0] base2 = 8'h47;
  reg [7:0]  c1, c2;
  reg [15:0] value1, value2, out1, out2;
  always @ (*)
  begin
    c1     = base1 - a;
    c2     = base1 - b;
    // operator lost in the source; subtraction assumed from the SUB block in the diagram
    value1 = address - {8'b0, c1};
    value2 = address - {8'b0, c2};
    out1   = value1 + base2;
    out2   = value2 + base2;
    if (control == 1)
      out = out1;
    else
      out = out2;
  end
endmodule
Data path Duplication
[Figure: two parallel data paths, one producing C1, Value1 and Out1 and the other producing C2, Value2 and Out2, each through SUB, SUB and ADD stages; a final MUX controlled by Control selects Out]
Physical Synthesis and Optimization with ISE 9.1i
Logic Duplication
A LUT or flip-flop may drive multiple loads, with one or more of those loads placed far from it.
In such a case the tool duplicates the logic and places the copy near that group of loads, which helps in meeting timing requirements.
This is called logic duplication.
Logic Duplication
L3 -> L2 -> L4 is a critical path.
[Figure: LUTs L1, L2 and L3 in CLB 1, CLB 2 and CLB 3, and L4 with a flip-flop in CLB 4; the critical path is marked]
Logic Duplication
L2 is duplicated and placed in CLB 4.
[Figure: the same layout with a copy of L2 placed in CLB 4 next to L4 and the flip-flop driving Y]
Logic Recombination
If a critical path travels through multiple LUTs, the logic can be reassembled using fewer CLBs (slices) by combining LUTs and muxes in a more effective way.
This is called logic recombination.
Logic Recombination
[Figure: before recombination: LUTs L1 to L4 and a flip-flop spread across CLB 1 to CLB 5]
Logic Recombination
[Figure: after recombination: LUTs L1 to L4 and the flip-flop regrouped across CLB 1, CLB 2, CLB 3 and CLB 5, with output Y]
Basic Element Switching
If a function is implemented using the mux and LUTs inside a slice, the tool can rearrange the function to achieve better timing.
This is called basic element switching.
Basic Element Switching
Assume that L3 -> L1 -> Y is a critical path.
[Figure: before switching: LUTs L1 to L4 and a flip-flop across CLB 1, CLB 2, CLB 3 and CLB 5]
Basic Element Switching
[Figure: the same logic after basic element switching]
Pin Swapping
Each input pin of a LUT has a different delay.
The tool (Map) can swap LUT pins so that critical paths use the LUT pins that offer faster speed.
This is called pin swapping.
Pin Swapping
Assume that pin 2 and pin 1 have minimum delay.
[Figure: before pin swapping: LUT L1 with input pins 2, 1 and 0, together with L2, L3, L4 and a flip-flop across CLB 1, CLB 2, CLB 3 and CLB 5]
Pin Swapping
[Figure: the same logic after pin swapping on the inputs of L1]