PHYSICAL DESIGN is the process of transforming a circuit description into
the physical representation for manufacturing.
SYNTHESIS:
Synthesis is the process of converting RTL into a technology-specific gate-level netlist.
Input files required
.lib- timing info of standard cells & macros.
.v- RTL code.
SDC- Timing constraints.
UPF- power intent of the design.
Scan config- Scan related info like scan chain length, scan IO, which flops
are to be considered in the scan chains.
RC co-efficient file (tluplus).
LEF/FRAM- abstract view of the cell.
Floorplan DEF- locations of IO ports and macros.
Steps involved in Synthesis
Translation: The RTL code and arithmetic operators are converted into GTECH and DW
(DesignWare) components. These are technology-independent libraries.
GTECH- contains basic logic gates & flops.
DesignWare- contains complex cells like FIFOs and counters.
Optimization: The Boolean equations are optimized using SoP (sum-of-products) or PoS
(product-of-sums) methods.
Technology mapping: Technology-independent Boolean logic equations are mapped to
technology-dependent library logic gates based on the design constraints and the library of
available technology gates. This produces an optimized gate-level representation, which is
generally written out in Verilog.
These three steps are done internally in the logic synthesis tool (e.g. Cadence Genus or
Synopsys Design Compiler) and are not visible to the designer.
3. Import constraints and UPF
SDC – for timing constraints
If the design has multiple power domains, a UPF file is needed.
4. Clock gating
Because the clock has high switching activity, it consumes a lot of dynamic power. To lower the
dynamic power, the clock gating technique is used.
In its simplest form, a clock gating circuit consists of an AND gate in the clock path with one
input as the enable.
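A minimal behavioural sketch of the AND-gate scheme, assuming the clock and enable are sampled as lists of 0/1 values. The names `gate_clock` and `count_toggles` and the sample data are illustrative only. It shows just that the gated clock toggles less often; it does not model glitches, which practical designs typically avoid with a latch-based clock gating cell.

```python
def gate_clock(clk, enable):
    """AND each clock sample with the matching enable sample."""
    return [c & e for c, e in zip(clk, enable)]


def count_toggles(signal):
    """Count 0->1 and 1->0 transitions in a sampled signal."""
    return sum(1 for a, b in zip(signal, signal[1:]) if a != b)


# Hypothetical data: 8 clock cycles, block idle during the middle 4 cycles.
clk = [0, 1] * 8
enable = [1] * 4 + [0] * 8 + [1] * 4
gated = gate_clock(clk, enable)

print("free-running toggles:", count_toggles(clk))
print("gated toggles:", count_toggles(gated))
```

Fewer toggles on the gated clock net means less switching activity in the logic it drives, which is where the dynamic power saving comes from.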
5. DFT (Design for Testability) insertion
DFT circuits are used for testing each and every node in the design.
9. Outputs of Synthesis
netlist
SDC
UPF
ScanDEF- information of scan flops and their connectivity in a scan chain
Checks to be done after synthesis (sanity checks before floorplan)
RTL and netlist are logically equivalent (LEC/FM)
Floating pins
multi driven inputs
un-driven inputs
un-driven outputs
normal cells in clock path
pin direction mismatch
don't-use cells
Setup timing
CLP check: whether always-on buffers are placed or not
Cell profiling
Buffer count
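The logical-equivalence check in the list above can be illustrated, under heavy simplification, by exhaustively comparing two Boolean functions over all input combinations. The functions `rtl_mux` and `netlist_mux` are hypothetical stand-ins for an RTL expression and its mapped gates; real LEC tools rely on formal methods rather than brute-force enumeration.

```python
from itertools import product


def rtl_mux(a, b, sel):
    """Reference behaviour: 2-to-1 multiplexer."""
    return b if sel else a


def netlist_mux(a, b, sel):
    """Gate-level form: (a AND NOT sel) OR (b AND sel)."""
    return (a & (1 - sel)) | (b & sel)


def equivalent(f, g, n_inputs):
    """True if f and g agree on every combination of n_inputs bits."""
    return all(f(*bits) == g(*bits)
               for bits in product((0, 1), repeat=n_inputs))


print("equivalent:", equivalent(rtl_mux, netlist_mux, 3))
```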
Floor Planning:
Floorplanning is the process of placing blocks/macros in the chip/core area.
The floorplan determines the size of the die and the I/O pin/pad placement, and creates
the power/ground (PG) connections.
Inputs required
1. Gate level netlist
2. LEF,LIB
3. Timing constraints (SDC)
4. Power Intent (UPF / CPF)
5. FP DEF & Scan DEF
Steps involved in floor plan:
— Initialize with Chip & Core Aspect Ratio (AR)
— Initialize with Core Utilization
— Initialize Row Configuration & Cell Orientation
— Provide the Core to Pad/ IO spacing (Core to IO clearance)
— Pins/ Pads Placement
— Macro Placement by Fly-line Analysis
— Macro Placement requirements also need to be considered
— Blockage Management (Placement/ Routing)
1. Initialize Core Aspect Ratio (AR)
AR = horizontal routing resources / vertical routing resources (AR = 1 gives a square core)
2. Initialize with Core Utilization:
Percentage of the core area used for cell placement (typically 60-70%); the rest is left for routing.
Core utilization = (Total standard cell area + Macro area) / Total core area x 100 %
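As a worked example of the utilization formula, the sketch below computes utilization and, going the other way, sizes a core for a target utilization. The function names and area numbers are hypothetical, and `aspect_ratio` is taken here as core height / core width, which is an assumption made for the geometry only.

```python
import math


def core_utilization(std_cell_area, macro_area, core_area):
    """Utilization in percent: (std cells + macros) / core area * 100."""
    return (std_cell_area + macro_area) / core_area * 100.0


def core_dimensions(std_cell_area, macro_area, target_util_pct, aspect_ratio):
    """Core (width, height) for a target utilization.

    aspect_ratio is assumed to mean height / width.
    """
    core_area = (std_cell_area + macro_area) / (target_util_pct / 100.0)
    width = math.sqrt(core_area / aspect_ratio)
    return width, width * aspect_ratio


# Hypothetical areas in square microns.
util = core_utilization(600_000, 100_000, 1_000_000)
width, height = core_dimensions(600_000, 100_000, 70, 1.0)
print(f"utilization = {util:.1f} %, core = {width:.0f} x {height:.0f} um")
```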
3. Initialize Row Configuration & Cell Orientation
The core is divided into individual rows, and the row area is used by the standard cells.
Cell orientation
Fly/flight lines are virtual connections between macros and also macros to I/O pads.
flight lines are of three types.
1. Macro to macro fly lines
2. pin to pin fly lines
3. macro to I/O fly lines
4. Providing space between the Core to Pad/ IO spacing
5.IO placements
6.Add
— End Caps to prevent DRC violations
— Well Taps to prevent latch-up
6.Power Planning
Grid is created to distribute power to all the cells
Width of the metal is available in LEF
• Rings (Vertical and horizontal)
— VDD and VSS Rings are formed around the Core and Macro
• Stripes
— Carries VDD and VSS around the chip
• Rails (Special Route)
— Connect VDD and VSS to the standard cell
7.Macro placement
Guidelines will be provided for macro placement
— Reserve enough room around Macros for IO Routing
— Provide necessary Blockages around the Macro
8. Blockages
— Placement Blockage & Routing Blockage
— Both of the Blockages can again be classified as-
• Hard, Soft and Partial Blockages
— Hard Blockage
• Complete Standard Cell Blockage
—Soft Blockage
• Non-Buffering Blockage
— Partial Blockage
• Partial Standard Cell Blockage and is used to avoid congestion
• We can Block Standard Cells as per the required percentage value
— Keep-out/ Halo
• Halo is similar to Soft Blockage (Terminology in Cadence EDI)
• It is basically a keep-out margin around a macro
• A halo is tied to its macro, while other blockages are tied to a location,
i.e., if the macro is moved, the halo moves along with it
9.Create Power Domain
Checks to be done:
How to qualify Floorplan?
1. Max density
2. Check PG connections (For macros & pre-placed cells only)
3. Check the power connections to all Macros,
4. All the macros should be placed at the boundary
5. Remove all unnecessary placement blockages & routing blockages (which might
be put during floor-plan & pre-placing)
6. Check power connection to power switches
7. Check pin placements
8. Power-related shorts and opens in the design, and IR drop
Placement
In this stage, all the standard cells are placed in the design.
• Placement Stages
— Global Placement
— Detail Placement
— Placement Legalization
— In-Place Optimizations
• Global/ Coarse Placement
— Approximately place the cells
— Cells are not legally placed and there can be overlaps
• Detail/ Legal Placement
— Cells are given legalized locations to avoid cell overlaps
• Placement Legalization
— Placed Macros are legally oriented with Standard Cell Rows
• In-Place Optimizations
— Scan Chain Reordering
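The step from coarse to legal placement can be sketched for a single row. This is a greedy illustration only, assuming cell widths are multiples of the site width. The function name and data are hypothetical, and real legalizers work across many rows while minimizing displacement.

```python
def legalize_row(cells, site_width, row_width):
    """Greedy single-row legalization.

    cells: list of (name, desired_x, width) from global placement.
    Returns {name: legal_x}, snapped to site boundaries with no overlaps.
    """
    placed = {}
    next_free = 0
    for name, desired_x, width in sorted(cells, key=lambda c: c[1]):
        # Snap the desired location to the nearest placement site.
        snapped = round(desired_x / site_width) * site_width
        # Push the cell right if it would overlap the previous one.
        x = max(snapped, next_free)
        if x + width > row_width:
            raise ValueError(f"row overflow while placing {name}")
        placed[name] = x
        next_free = x + width
    return placed


# Hypothetical global-placement result: u1 and u2 overlap before legalization.
coarse = [("u1", 2.9, 4), ("u2", 4.2, 4), ("u3", 20.3, 2)]
print(legalize_row(coarse, site_width=2, row_width=30))
```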
Checks:
1.placement congestion
2.timing checks
3.don't-use and don't-touch cells
4.max trans and max cap
5.cell profiling
6.CLP check
7.High fan out
8.Secondary PG connection
CTS:
So far an ideal clock has been used. In CTS, a physical clock tree structure is built from
the clock source to the sinks.
The clock should be distributed evenly to all sequential elements in the design.
Goal:
Meet the clock tree DRC.
Max. Transition.
Max. Capacitance.
Max. Fanout.
Minimal skew.
Minimum insertion delay.
The DRC limits are present in the .lib file.
Checks to be done before CTS
1. Check legality.
2. verify PG connections.
3. Timing QoR (setup should be under control).
4. Timing DRVs: max tran, max cap, max fanout.
5. Congestion hotspots.
6. Check & qualify don't_touch, don't_size attributes on clock components.
Clock buffers and clock inverters, which have balanced rise and fall times, are used to
maintain a 50% duty cycle.
Several structures are used for the clock tree:
H-Tree
X-Tree
Multi level clock tree
Fish bone
Before CTS all Clock Pins are driven by a single Clock Source
After CTS the buffer tree is built to balance the loads and minimize the skew
To meet the insertion delay (ID) target, the number of tap points can be increased or decreased,
or transport buffers/inverters or higher metal layers can be used.
An H-tree minimizes skew; NDR (non-default rules, e.g. DWDS: double width, double spacing) can also be used.
Because many clock buffers are added, congestion may increase, which can cause setup and hold
violations.
Setup Fixing:
i. Upsizing the cells (increasing the drive strength) in the data path.
ii. Removing buffers from the data path.
iii. Replacing a buffer with two inverters placed some distance apart, which adjusts the
delay.
iv. Swapping to LVT cells.
v. Clock pushing (in the capture path) by adding buffers in the clock path.
Hold Fixing:
Hold slack improves when the data path has more delay, so hold violations are fixed by
adding delay to the data path.
i. Downsizing the cells (decrease the drive strength) in data path.
ii. By adding buffers/Inverter pairs/delay cells to the data path.
iii. Selecting a larger wire load model (WLM) also adds delay and can fix the hold violation.
The WLM contains the details of wire resistance and capacitance.
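The direction of these fixes follows from the usual slack expressions. The sketch below assumes a single-cycle register-to-register path with all times in picoseconds; the function names and numbers are illustrative, not taken from any tool.

```python
def setup_slack(period, t_clk_launch, t_clk_capture, t_clkq, t_data, t_setup):
    """Setup slack = data required time - data arrival time."""
    arrival = t_clk_launch + t_clkq + t_data
    required = period + t_clk_capture - t_setup
    return required - arrival


def hold_slack(t_clk_launch, t_clk_capture, t_clkq, t_data, t_hold):
    """Hold slack = data arrival time - data required time (same edge)."""
    arrival = t_clk_launch + t_clkq + t_data
    required = t_clk_capture + t_hold
    return arrival - required


# Hypothetical path: a short data path violates hold.
before = hold_slack(t_clk_launch=100, t_clk_capture=150, t_clkq=50,
                    t_data=20, t_hold=30)
# Adding a 40 ps delay cell to the data path fixes it.
after = hold_slack(t_clk_launch=100, t_clk_capture=150, t_clkq=50,
                   t_data=60, t_hold=30)
print("hold slack before/after:", before, after)
```

The same extra data-path delay reduces setup slack by the same amount, which is why hold fixing is done only after checking the setup margin on that path.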
Transition violation (it occurs only on input pins)
If a signal takes too long to transition from one logic level to another, a transition violation is
caused. The violation can be due to node resistance and capacitance.
i. By upsizing the driver cell.
ii. reducing long routed net.
iii. By adding Buffers.
Cap violation
The capacitance on a node is a combination of the fan-out of the output pin and
capacitance of the net. This check ensures that the device does not drive more
capacitance than the device is characterized for.
i. The violation can be removed by increasing the drive strength of the cell.
ii. By buffering some of the fan-out paths to reduce the capacitance seen by the output
pin.
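A small sketch of the max-capacitance check described above, with capacitances in fF. The function names, pin capacitances and the limit are made-up values for illustration; in practice the limit comes from the library characterization.

```python
def load_capacitance(net_cap, sink_pin_caps):
    """Total load on the driver: net capacitance plus all sink pin caps."""
    return net_cap + sum(sink_pin_caps)


def max_cap_violation(net_cap, sink_pin_caps, max_capacitance):
    """Amount by which the load exceeds the limit (0 if within limit)."""
    return max(0, load_capacitance(net_cap, sink_pin_caps) - max_capacitance)


# Hypothetical driver with a 50 fF limit driving 8 sinks of 5 fF each.
before = max_cap_violation(net_cap=20, sink_pin_caps=[5] * 8,
                           max_capacitance=50)
# After buffering half the fan-out: 4 direct sinks plus one 3 fF buffer input
# on a shorter net.
after = max_cap_violation(net_cap=12, sink_pin_caps=[5] * 4 + [3],
                          max_capacitance=50)
print("violation before/after buffering (fF):", before, after)
```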
Max fanout (it occurs only on output pins)
Routing
Checklist Before Routing
Placement completed
CTS completed
Power and ground nets routed
Estimated congestion - acceptable
Estimated timing – acceptable (~0 ns slack)
Estimated max cap/trans – no violations
Different Types of Delays in ASIC or VLSI design
Source Delay/Latency
Network Delay/Latency
Insertion Delay
Transition Delay/Slew: Rise time, fall time
Path Delay
Net delay, wire delay, interconnect delay
Propagation Delay
Phase Delay
Cell Delay
Intrinsic Delay
Extrinsic Delay
Input Delay
Output Delay
Exit Delay
Latency (Pre/post CTS)
Uncertainty (Pre/Post CTS)
Unateness: Positive unateness, negative unateness
Jitter: PLL jitter, clock jitter
Gate delay
Transistors within a gate take a finite time to switch. This means that
a change on the input of a gate takes a finite time to cause a change on the
output.[Magma]
Gate delay = function of (input transition time, Cnet + Cpin).
Cell delay is the same as gate delay.
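One way to picture this dependence is a two-dimensional lookup table indexed by input transition and output load, interpolated bilinearly, as NLDM-style .lib tables are commonly evaluated. The axes and delay values below are invented for illustration, and `cell_delay` is a hypothetical helper rather than any tool's API.

```python
from bisect import bisect_right


def _bracket(axis, value):
    """Index i of the table segment [axis[i], axis[i+1]] used for value."""
    i = bisect_right(axis, value) - 1
    return min(max(i, 0), len(axis) - 2)


def cell_delay(slew_axis, load_axis, table, in_slew, load):
    """Bilinear interpolation of table[slew_index][load_index].

    Points outside the table are linearly extrapolated from the edge segment.
    """
    i = _bracket(slew_axis, in_slew)
    j = _bracket(load_axis, load)
    s0, s1 = slew_axis[i], slew_axis[i + 1]
    c0, c1 = load_axis[j], load_axis[j + 1]
    ts = (in_slew - s0) / (s1 - s0)
    tc = (load - c0) / (c1 - c0)
    low = table[i][j] * (1 - tc) + table[i][j + 1] * tc
    high = table[i + 1][j] * (1 - tc) + table[i + 1][j + 1] * tc
    return low * (1 - ts) + high * ts


# Invented characterization data: slew in ps, load (Cnet + Cpin) in fF.
slew_axis = [10, 50, 100]
load_axis = [1, 5, 10]
delay_ps = [[20, 40, 65],
            [25, 45, 70],
            [32, 52, 77]]

print(cell_delay(slew_axis, load_axis, delay_ps, in_slew=30, load=3))
```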
Source Delay (or Source Latency)
It is defined as "the delay from the
clock origin point to the clock definition point in the design".
Delay from clock source to beginning of clock tree (i.e. clock
definition point).
The time a clock signal takes to propagate from its ideal waveform
origin point to the clock definition point in the design.
Network Delay(latency)
It is also known as insertion delay. It is defined as
"the delay from the clock definition point to the clock pin of the register".
The time clock signal (rise or fall) takes to propagate from the clock
definition point to a register clock pin.
Insertion delay
The delay from the clock definition point to the clock pin of the
register.
Transition delay
It is also known as "slew". It is defined as the time taken to change
the state of the signal: the time taken for the transition from logic 0 to logic 1
and vice versa, or the time taken by the input signal to rise from 10% (20%) to
90% (80%) and vice versa.
Transition is the time it takes for the pin to change state.
Slew
Rate of change of logic. See Transition delay.
Slew rate is the speed of transition measured in volt / ns.
Rise Time
Rise time is the difference between the time when the signal crosses
a low threshold to the time when the signal crosses the high threshold. It
can be absolute or percent.
Low and high thresholds are fixed voltage levels around the mid
voltage level or it can be either 10% and 90% respectively or 20% and 80%
respectively. The percent levels are converted to absolute voltage levels at
the time of measurement by calculating percentages from the difference
between the starting voltage level and the final settled voltage level.
Fall Time
Fall time is the difference between the time when the signal crosses
a high threshold to the time when the signal crosses the low threshold.
The low and high thresholds are fixed voltage levels around the mid
voltage level or it can be either 10% and 90% respectively or 20% and 80%
respectively. The percent levels are converted to absolute voltage levels at
the time of measurement by calculating percentages from the difference
between the starting voltage level and the final settled voltage level.
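The threshold-based definitions above can be sketched on a sampled waveform. The percent thresholds are converted to absolute voltages from the starting and final levels, and the crossing times are found by linear interpolation. The function names and the ramp data are assumptions for illustration.

```python
def crossing_time(times, volts, level, rising=True):
    """First time the waveform crosses `level`, by linear interpolation."""
    points = list(zip(times, volts))
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        crosses = (v0 < level <= v1) if rising else (v0 > level >= v1)
        if crosses:
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("waveform never crosses the requested level")


def rise_time(times, volts, low_pct=10, high_pct=90):
    """Time between the low and high threshold crossings of a rising edge."""
    v_start, v_final = volts[0], volts[-1]
    swing = v_final - v_start
    v_low = v_start + swing * low_pct / 100
    v_high = v_start + swing * high_pct / 100
    return crossing_time(times, volts, v_high) - crossing_time(times, volts, v_low)


def fall_time(times, volts, low_pct=10, high_pct=90):
    """Time between the high and low threshold crossings of a falling edge."""
    v_start, v_final = volts[0], volts[-1]  # v_start is the high level here
    swing = v_start - v_final
    v_low = v_final + swing * low_pct / 100
    v_high = v_final + swing * high_pct / 100
    return (crossing_time(times, volts, v_low, rising=False)
            - crossing_time(times, volts, v_high, rising=False))


# Hypothetical linear ramps sampled every 25 ps between 0 V and 1 V.
t = [0, 25, 50, 75, 100]
rising_edge = [0.0, 0.25, 0.5, 0.75, 1.0]
falling_edge = [1.0, 0.75, 0.5, 0.25, 0.0]
print(rise_time(t, rising_edge), rise_time(t, rising_edge, 20, 80))
print(fall_time(t, falling_edge))
```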
For an ideal square wave with 50% duty cycle, the rise time will be
0. For a symmetric triangular wave, this is reduced to just 50%.
The rise/fall definition is set on the meter to 10% and 90% based on
the linear power in Watts. These points translate into the -10 dB and -0.5
dB points in log mode (10 log 0.1) and (10 log 0.9). The rise/fall time values
of 10% and 90% are calculated based on an algorithm, which looks at the
mean power above and below the 50% points of the rise/fall times.
Path delay
Path delay is also known as pin to pin delay. It is the delay from the
input pin of the cell to the output pin of the cell.
Net Delay (or wire delay)
The difference between the time a signal is first applied to the net
and the time it reaches other devices connected to that net.
It is due to the finite resistance and capacitance of the net. It is also
known as wire delay.
Wire delay =fn(Rnet , Cnet+Cpin)
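One common first-order choice for this function is the Elmore delay of a uniform distributed RC wire driving a single pin load. The sketch below assumes that model and uses hypothetical values; sign-off tools work from extracted parasitics and more accurate delay models.

```python
def elmore_wire_delay(r_net, c_net, c_pin):
    """Elmore delay (seconds) of a uniform RC wire with one pin load.

    r_net in ohms, c_net and c_pin in farads. The distributed wire
    capacitance contributes half its value, the pin load its full value.
    """
    return r_net * (c_net / 2 + c_pin)


# Hypothetical net: 200 ohm, 40 fF of wire capacitance, 5 fF sink pin.
delay = elmore_wire_delay(200.0, 40e-15, 5e-15)
print(f"wire delay = {delay * 1e12:.2f} ps")
```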
Propagation delay
For any gate it is measured between 50% of input transition to the
corresponding 50% of output transition.
This is the time required for a signal to propagate through a gate or
net. For gates it is the time it takes for a event at the gate input to affect the
gate output.
For net it is the delay between the time a signal is first applied to the
net and the time it reaches other devices connected to that net.
It is taken as the average of the high-to-low and low-to-high propagation delays,
i.e. Tpd = (Tphl + Tplh)/2.
Phase delay
Same as insertion delay
Cell delay
For any gate it is measured between 50% of input transition to the
corresponding 50% of output transition.
Intrinsic delay
Intrinsic delay is the delay internal to the gate. Input pin of the cell to
output pin of the cell.
It is defined as the delay between an input and output pair of a cell,
when a near zero slew is applied to the input pin and the output does not
see any load condition. It is predominantly caused by the internal
capacitance associated with its transistor.
This delay is largely independent of the size of the transistors forming
the gate because increasing size of transistors increase internal capacitors.
Extrinsic delay
Same as wire delay, net delay, interconnect delay, flight time.
Extrinsic delay is the delay associated with the interconnect, from the
output pin of the cell to the input pin of the next cell.
Input delay
Input delay is the time at which the data arrives at the input pin of the
block from the external circuit, with respect to the reference clock.
Output delay
Output delay is the time required by the external circuit, before which the
data has to arrive at the output pin of the block, with respect to the reference
clock.
Exit delay
It is defined as the delay in the longest path (critical path) between
clock pad input and an output. It determines the maximum operating
frequency of the design.
Latency (pre/post cts)
Latency is the sum of the source latency and the network
latency. Pre-CTS, an estimated latency is used (for example during synthesis);
after CTS, the propagated latency is used.
Uncertainty (pre/post cts)
Uncertainty is the amount of skew and the variation in the arrival of the
clock edge. Pre-CTS uncertainty covers clock skew and clock jitter. After CTS,
a margin for skew + jitter can still be kept.
Unateness
A function is said to be unate if a rising transition on a positive
unate input variable causes the output to rise or not change, and vice versa.
Negative unateness means the cell output logic is the inverted version of the
input logic, e.g. in an inverter with input A and output Y, Y is -ve unate w.r.t.
A. Positive unate means the cell output logic is the same as that of the input.
These +ve and -ve unateness values are defined in the library file and
are defined for an output pin w.r.t. some input pin.
A clock signal is positive unate if a rising edge at the clock source
can only cause a rising edge at the register clock pin, and a falling edge at
the clock source can only cause a falling edge at the register clock pin.
A clock signal is negative unate if a rising edge at the clock source
can only cause a falling edge at the register clock pin, and a falling edge at
the clock source can only cause a rising edge at the register clock pin. In
other words, the clock signal is inverted.
A clock signal is not unate if the clock sense is ambiguous as a result
of non-unate timing arcs in the clock path. For example, a clock that passes
through an XOR gate is not unate because there are nonunate arcs in the
gate. The clock sense could be either positive or negative, depending on
the state of the other input to the XOR gate.
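The unateness definitions can be checked by brute force for small Boolean functions: toggle one input from 0 to 1 under every combination of the other inputs and record whether the output can rise, fall, or both. The function name and the string labels below are illustrative choices, not library attribute values.

```python
from itertools import product


def unateness(func, n_inputs, index):
    """Classify input `index` of a Boolean function.

    Returns "positive", "negative" or "non-unate". An input the output
    does not depend on is reported as "positive".
    """
    can_rise = can_fall = False
    for bits in product((0, 1), repeat=n_inputs - 1):
        lo = bits[:index] + (0,) + bits[index:]
        hi = bits[:index] + (1,) + bits[index:]
        out_lo, out_hi = func(*lo), func(*hi)
        if out_hi > out_lo:
            can_rise = True   # rising input caused rising output
        elif out_hi < out_lo:
            can_fall = True   # rising input caused falling output
    if can_rise and can_fall:
        return "non-unate"
    return "negative" if can_fall else "positive"


print(unateness(lambda a: 1 - a, 1, 0))        # inverter
print(unateness(lambda a, b: a & b, 2, 0))     # AND gate
print(unateness(lambda a, b: a ^ b, 2, 0))     # XOR gate
```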
Jitter
The short-term variations of a signal with respect to its ideal position
in time.
Jitter is the variation of the clock period from edge to edge. It can
vary by +/- the jitter value.
From cycle to cycle the period and duty cycle can change slightly due
to the clock generation circuitry. This can be modeled by adding uncertainty
regions around the rising and falling edges of the clock waveform.
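As a numeric illustration, jitter can be estimated from a list of measured rising-edge times. The sketch reports the worst deviation of a period from nominal and the worst change between consecutive periods; the edge timestamps (in ps) and function names are hypothetical.

```python
def periods(rising_edges):
    """Measured periods between consecutive rising edges."""
    return [b - a for a, b in zip(rising_edges, rising_edges[1:])]


def period_jitter(rising_edges, nominal_period):
    """Largest deviation of any measured period from the nominal period."""
    return max(abs(p - nominal_period) for p in periods(rising_edges))


def cycle_to_cycle_jitter(rising_edges):
    """Largest change between two consecutive periods."""
    p = periods(rising_edges)
    return max(abs(b - a) for a, b in zip(p, p[1:]))


# Hypothetical rising-edge timestamps for a nominal 1000 ps clock.
edges = [0, 1000, 2010, 2995, 4000]
print(period_jitter(edges, 1000), cycle_to_cycle_jitter(edges))
```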
Sources of Jitter
Common sources of jitter include:
Internal circuitry of the phase-locked loop (PLL)
Random thermal noise from a crystal
Other resonating devices
Random mechanical noise from crystal vibration
Signal transmitters
Traces and cables
Connectors
Receivers
Skew
The difference in the arrival of clock signal at the clock pin of different
flops.
Two types of skews are defined: Local skew and Global skew.
Local skew
The difference in the arrival of clock signal at the clock pin of related
flops.
Global skew
The difference in the arrival of clock signal at the clock pin of non
related flops.
Skew can be positive or negative.
When data and clock are routed in the same direction, it is positive
skew.
When data and clock are routed in opposite directions, it is negative skew.
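A short sketch of how these quantities could be computed from clock arrival times at flop clock pins. Global skew is taken here as the worst difference over all sinks, which is a common convention and an assumption in this example. The flop names, arrival times (ps) and related pairs are made up.

```python
def global_skew(arrivals):
    """Largest arrival-time difference over all clock sinks."""
    return max(arrivals.values()) - min(arrivals.values())


def signed_skew(arrivals, launch, capture):
    """Capture arrival minus launch arrival; positive if capture is later."""
    return arrivals[capture] - arrivals[launch]


def local_skew(arrivals, related_pairs):
    """Largest skew magnitude between related (launch, capture) flops."""
    return max(abs(signed_skew(arrivals, l, c)) for l, c in related_pairs)


# Hypothetical clock arrival times at four flop clock pins.
arrivals = {"ff1": 500, "ff2": 520, "ff3": 470, "ff4": 510}
pairs = [("ff1", "ff2"), ("ff3", "ff4")]
print(global_skew(arrivals), local_skew(arrivals, pairs))
```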
Recovery Time
Recovery specifies the minimum time that an asynchronous control
input pin must be held stable after being de-asserted and before the next
clock (active-edge) transition.
Recovery time specifies the time the inactive edge of the
asynchronous signal has to arrive before the closing edge of the clock.
Recovery time is the minimum length of time an asynchronous
control signal (eg.preset) must be stable before the next active clock edge.
The recovery slack time calculation is similar to the clock setup slack time
calculation, but it applies to asynchronous control signals. If the
asynchronous control is registered, the equations shown in Equation 1 are
used to calculate the recovery slack time.
Equation 1:
Recovery Slack Time = Data Required Time – Data Arrival Time
Data Arrival Time = Launch Edge + Clock Network Delay to Source
Register + Tclkq + Register to Register Delay
Data Required Time = Latch Edge + Clock Network Delay to
Destination Register – Tsetup
If the asynchronous control is not registered, the equations shown in Equation
2 are used to calculate the recovery slack time.
Equation 2:
Recovery Slack Time = Data Required Time – Data Arrival Time
Data Arrival Time = Launch Edge + Maximum Input Delay + Port to
Register Delay
Data Required Time = Latch Edge + Clock Network Delay to
Destination Register – Tsetup
If the asynchronous reset signal is from a port (device I/O), you must
make an Input Maximum Delay assignment to the asynchronous reset pin
to perform recovery analysis on that path.
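The two recovery cases can be written out as small functions. Because recovery is a setup-like check, the setup-type requirement is subtracted when forming the data required time. Parameter names and the picosecond values are hypothetical.

```python
def recovery_slack_registered(launch_edge, clk_to_src, t_clkq, reg_to_reg,
                              latch_edge, clk_to_dst, t_setup):
    """Recovery slack when the async control comes from a register."""
    arrival = launch_edge + clk_to_src + t_clkq + reg_to_reg
    required = latch_edge + clk_to_dst - t_setup
    return required - arrival


def recovery_slack_port(launch_edge, max_input_delay, port_to_reg,
                        latch_edge, clk_to_dst, t_setup):
    """Recovery slack when the async control comes from an input port."""
    arrival = launch_edge + max_input_delay + port_to_reg
    required = latch_edge + clk_to_dst - t_setup
    return required - arrival


# Hypothetical 1000 ps clock, all values in ps.
print(recovery_slack_registered(0, 100, 50, 300, 1000, 120, 40))
print(recovery_slack_port(0, 200, 350, 1000, 120, 40))
```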
Removal Time
Removal specifies the minimum time that an asynchronous control
input pin must be held stable before being de-asserted and after the
previous clock (active-edge) transition.
Removal time specifies the length of time the active phase of the
asynchronous signal has to be held after the closing edge of clock.
Removal time is the minimum length of time an asynchronous control
signal must be stable after the active clock edge. The calculation is similar to
the clock hold slack calculation, but it applies to asynchronous control signals.
If the asynchronous control is registered, the equations shown in Equation 3 are
used to calculate the removal slack time.
If the recovery or removal minimum time requirement is violated, the
output of the sequential cell becomes uncertain. The uncertainty can be
caused by the value set by the resetbar signal or the value clocked into the
sequential cell from the data input.
Equation 3
Removal Slack Time = Data Arrival Time – Data Required Time
Data Arrival Time = Launch Edge + Clock Network Delay to Source
Register + Tclkq of Source Register + Register to Register Delay
Data Required Time = Latch Edge + Clock Network Delay to
Destination Register + Thold
If the asynchronous control is not registered, the equations shown in
Equation 4 are used to calculate the removal slack time.
Equation 4
Removal Slack Time = Data Arrival Time – Data Required Time
Data Arrival Time = Launch Edge + Input Minimum Delay of Pin +
Minimum Pin to Register Delay
Data Required Time = Latch Edge + Clock Network Delay to
Destination Register +Thold
If the asynchronous reset signal is from a device pin, you must
specify the Input Minimum Delay constraint to the asynchronous reset pin
to perform a removal analysis on this path.
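Equations 3 and 4 can be sketched the same way. Removal is a hold-like check, so the slack is arrival minus required and Thold is added to the required time. Parameter names and the picosecond values are hypothetical.

```python
def removal_slack_registered(launch_edge, clk_to_src, t_clkq, reg_to_reg,
                             latch_edge, clk_to_dst, t_hold):
    """Removal slack when the async control comes from a register."""
    arrival = launch_edge + clk_to_src + t_clkq + reg_to_reg
    required = latch_edge + clk_to_dst + t_hold
    return arrival - required


def removal_slack_port(launch_edge, min_input_delay, min_pin_to_reg,
                       latch_edge, clk_to_dst, t_hold):
    """Removal slack when the async control comes from a device pin."""
    arrival = launch_edge + min_input_delay + min_pin_to_reg
    required = latch_edge + clk_to_dst + t_hold
    return arrival - required


# Launch and latch on the same edge (time 0), all values in ps.
print(removal_slack_registered(0, 100, 50, 60, 0, 120, 30))
print(removal_slack_port(0, 20, 80, 0, 120, 30))
```

A negative result, as in the second call with these made-up numbers, indicates a removal violation on that path.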