
1.signoff Semi Blog

IC Design & Flow Overview

A System on Chip (SoC) is an integrated circuit that integrates all components of an electronic
system. It may contain digital, analog, mixed-signal, and radio-frequency modules, all on a
single substrate. SoCs are very common in the mobile computing market because of their low
power consumption.
SoC designs usually consume less power and have a lower cost and higher reliability than the
multi-chip systems that they replace. With fewer packages in the system, assembly costs
are reduced as well.
Advantages of SoC
1. Compact system size (chip size is much smaller than board size)
2. Lower power consumption (fewer components, fewer I/Os and fewer passive components help to
reduce power)
3. High performance
4. Lower system cost
PCB – SoC
SiP (System in Package):
Advantages
1. Lower development cost
2. Faster turnaround time (development time is shorter)
3. Chips built in different technologies can be mounted in the same package
4. Higher yield, as the individual chips are small

General CHIP design flow


All semiconductor giants follow a robust SoC/IC design flow to reduce the time to market (TTM)
in this competitive market. The development cost of any SoC/IC is very high, and hence everyone
targets first-pass silicon. A working chip is not enough; it has to meet many criteria such as
Power, Performance, Area, Schedule (PPAS), Yield and Cost. All of these can be achieved with a
systematic, flawless flow.
A general IC design flow is shown in the figure.
Detailed IC Design Flow
CMOS Basics & Process Overview
Why CMOS?
1. The output of all CMOS cells is very close to rail-to-rail (this may not hold for pass
transistors)
2. With a constant input to any cell, power dissipation is only due to leakage currents. Power
dissipation increases with the activity factor (short-circuit current plus charging and
discharging of the load)
3. Analog, RF and memory can be integrated on a single chip in a CMOS process
4. High-impedance inputs that are triggered by voltage. Very little current passes through the
input.
5. CMOS provides full swing: VDD through PMOS and GND through NMOS
6. Good temperature stability (-60 C – 130 C)
7. High noise immunity (switching happens close to VDD/2), i.e. higher noise margins
Before deep-diving into CMOS basics, let’s discuss and understand the basics of RC. A thorough
understanding of RC is very much required in the VLSI domain and it helps a lot in complex
analyses. I would say that if we understand all the basic concepts very well, we can analyze and
solve any problem very easily.
RC (Why RC is important?)
Definitely we cannot build a complete & accurate model of CMOS using R & C. Hundreds of
parameters are required to model a CMOS accurately (LEVEL 54 BSIM4.0 Model). But for a
digital design/PD engineer, an abstract level understanding of MOSFET will definitely help in
understanding & analyzing complex circuits & issues. So I would recommend you to
understand the RC model of CMOS thoroughly.
Resistance effect (Width of MOSFET)
Consider an RC circuit with R = 100 kOhm and C = 10 pF. The current I is inversely proportional
to the resistance R. If R is high, the current is small and hence more time is required to charge
the load capacitor. In the circuit below, V(n001) [green] is the supply and V(n002) [blue] is the
voltage across the capacitor. Observe the voltage graph across the capacitor: the charging time
is long because of the high resistance.
Let’s reduce the resistance R to 10 kOhm. As R is reduced, I increases and hence the charging and
discharging times of capacitor C are reduced. Refer to the circuit below.
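The two RC cases above can be checked numerically. The following is a minimal sketch that treats the circuit as an ideal first-order RC stage; the function names and the 1 V step used in the usage note are assumptions made for illustration, not values from the original simulation.

```python
import math

def rc_step_response(v_supply, r_ohm, c_farad, t_sec):
    """Capacitor voltage t_sec after a step of v_supply is applied (ideal RC)."""
    tau = r_ohm * c_farad
    return v_supply * (1.0 - math.exp(-t_sec / tau))

def rise_time_10_90(r_ohm, c_farad):
    """10%-90% rise time of a first-order RC stage: tau * ln(9)."""
    return r_ohm * c_farad * math.log(9.0)

C_LOAD = 10e-12  # 10 pF, as in the text

for r in (100e3, 10e3):  # 100 kOhm and 10 kOhm, as in the text
    tau = r * C_LOAD
    print(f"R = {r / 1e3:.0f} kOhm: tau = {tau * 1e6:.2f} us, "
          f"10-90% rise time = {rise_time_10_90(r, C_LOAD) * 1e6:.2f} us")
```

The time constant is 1 us for R = 100 kOhm and 0.1 us for R = 10 kOhm, so reducing R by 10x shortens the charging time by 10x. For example, `rc_step_response(1.0, 100e3, 10e-12, 1e-6)` gives the capacitor voltage one time constant after an assumed 1 V step.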

Now let’s extend this understanding to the MOSFET. The drain current of a MOSFET is

$I_D = \frac{1}{2}\,\mu_n C_{ox}\,\frac{W}{L}\,(V_{GS} - V_{Th})^2$

and the on-resistance of the MOSFET depends on W (width) and L (length).
The higher the value of W, the lower the resistance and the higher the current-carrying
capability of the MOSFET.
“Hence increasing the width of the MOSFET reduces the charging and discharging time of the
load capacitor (charging and discharging time are nothing but rise and fall time)”. In digital
circuit implementation (synthesis and PnR), the width can be varied by using cells of different
drive strengths (like BUF-D1 (w=1u), BUF-D2 (w=2u), BUF-D3 (w=3u)…).
“The current through a MOSFET can be increased by increasing the width of the device, hence
reducing the rise time / fall time (charge / discharge time of the load capacitor)”
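The effect of drive strength can be sketched with the square-law current equation. This is a rough illustration only: the process gain `k_prime` (mu*Cox), channel length, threshold voltage, supply and load capacitance below are assumed values, and the charge time uses the crude constant-current estimate t = C*V/I rather than a real cell delay model.

```python
def drain_current(w_um, l_um, k_prime=200e-6, vgs=1.0, vth=0.3):
    """Square-law drain current: 0.5 * mu*Cox * (W/L) * (Vgs - Vth)^2.
    k_prime, vgs and vth are assumed, illustrative values."""
    return 0.5 * k_prime * (w_um / l_um) * (vgs - vth) ** 2

def charge_time(w_um, l_um=0.1, c_load=10e-15, vdd=1.0):
    """Rough time to charge c_load to vdd at constant current: t = C * V / I."""
    return c_load * vdd / drain_current(w_um, l_um, vgs=vdd)

# Widths follow the BUF-D1 / D2 / D3 example in the text (1u, 2u, 3u).
for name, width in (("BUF-D1", 1.0), ("BUF-D2", 2.0), ("BUF-D3", 3.0)):
    print(f"{name}: Id = {drain_current(width, 0.1) * 1e6:.0f} uA, "
          f"charge time = {charge_time(width) * 1e12:.1f} ps")
```

Doubling the width doubles the current and halves the estimated charge time, which is the trend the drive-strength variants of a cell exploit.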
Capacitance effect (Load Cap / Fan-Out)
RC Model of a Buffer
CMOS Basics

By Reza Mirhosseini – originally uploaded to en.wikipedia (file log), Public Domain, Link
Why PMOS is slower than NMOS?
Most answers would stop at “The mobility of holes is lower than the mobility of electrons in a
semiconductor”. Obviously the next question would be “Why?”
To understand this concept, let’s go back to the basics of energy band diagrams.

Conduction electrons (free electrons) travel in the conduction band and valence electrons
(holes) travel in the valence band. In an applied electric field, valence electrons cannot move
as freely as the free electrons because their movement is restricted. The mobility of a particle
in a semiconductor is larger if its effective mass is smaller and the time between scattering
events is larger.
Hence the mobility of electrons is higher than that of holes. “Now we understand why PMOS
devices are slower than NMOS devices”
In short, ‘A hole’s movement in the valence band requires more energy (voltage) compared to
a free electron in the conduction band’
Because of this mobility mismatch, the beta ratio comes into the picture while designing standard
logic cells. We also require special cells for the clock network, whose rise and fall times are
very well balanced. Why do clock cells need this balance? We will discuss this in the CTS
training session.
PROCESS OVERVIEW
Most of the semiconductor giants do a lot of research in process technology. The increasing need
for ultra-low-power, high-performance, small-form-factor and low-cost ICs is making fabs and
companies put more money into this research. The traditional planar bulk process has many issues
with further scaling (like leakage and short-channel effects). As of now, the two most popular
solutions to this problem are 3D FinFET technology and FD-SOI. We will briefly discuss these
three process technologies.
Different types of process technologies:
1. Bulk CMOS
2. FD-SOI (Fully Depleted Silicon on Insulator)
3. FinFET
BULK CMOS
It is a planar process. A cross-sectional view of devices in bulk CMOS is shown below:

By Reza Mirhosseini – originally uploaded to en.wikipedia (file log), Public Domain, Link
One of the main limitations of bulk planar transistors is that the channel region underneath the
gate is deep, and much of the channel is too far from the gate to be well controlled. The result
is higher leakage power (static/stand-by power), and the transistor is never truly turned off.
FD-SOI (UTBB-FD-SOI: Ultra-Thin Body and Buried oxide, Fully Depleted Silicon On Insulator)
Key advantages of fully depleted SOI:
1. Better electrostatic control of the channel
2. No channel doping required
3. Limited short-channel effects compared to bulk CMOS
4. Minimum junction capacitance and diode leakage
5. Low leakage power

FinFET
Scaling became very difficult in planar bulk CMOS (below 28 nm). In a FinFET device the gate
wraps around a thin, fin-shaped channel. This form of gate structure provides improved electrical
control over the channel conduction, which helps reduce leakage current levels and overcomes some
other short-channel effects.
Key advantages of FinFET:
1. Low power, hence allowing high integration levels.
2. FinFETs operate at a lower voltage as a result of their lower threshold voltage.
3. Because of reduced short-channel effects, further device shrinking is possible.
4. Very low leakage current
5. Improved operating speed
Process modelling
Process modelling predicts the geometries and material properties of the wafer structures and
semiconductor devices that result from the manufacturing process. We will be discussing
the use of process models (SPICE models) for timing of the entire IC. The SPICE models of the
process are the source for all further timing models of the devices/cells/memories/IPs/sub
systems. Interconnect models are also used during the simulations (which may generate
timing models).

Temperature Inversion
Inter Connect Variations & Parasitic Corners

The five parasitic corners are:
1. Cbest – minimum capacitance, minimum delay (hold analysis)
2. Cworst – maximum capacitance, maximum delay (setup analysis)
3. RCbest – minimum RC product (long interconnects)
4. RCworst – maximum RC product (long interconnects)
5. Typical – nominal values of RC
Standard Cell Library
Standard Cell Architecture
• Standard cells are designed based on power, area and performance.
• The first step is cell architecture. Cell architecture is all about deciding the cell height
based on pitch and library requirements. We first have to decide the track count, pitch, β ratio,
and the possible PMOS and NMOS widths.
• Track: The track is generally used as a unit to define the height of the standard cell. Tracks
can be related to lanes: a 4-lane road implies that 4 vehicles can run in parallel.
Similarly, a 9-track library implies that 9 routing tracks are available for routing 9 wires in
parallel at minimum pitch.
• Pitch: The distance between two tracks is called the pitch.
• Via: Vias are used to connect two different metal layers, as shown in Fig. 1(a). In
Fig. 1(b), we are connecting M1 and M2 using a via. We don’t place tracks at the minimum
metal spacing, because any via overhang would then cause a DRC error.

Fig. 1(a) Via connecting metal 1 and metal 2.


Fig. 1(b) Pitch calculation including via overhang
• Let us see how to calculate the standard cell height, the pitch, and the PMOS and NMOS sizes
for a 9-track library.
  • Let the metal width be 4 units, the minimum metal-to-metal spacing be 3 units and
  the via overhang be 2 units.
  • Pitch = 2 × [(1/2)(metal width) + via overhang] + metal-to-metal spacing. Using this
  formula, Pitch = 2 × (2 + 2) + 3 = 11 units.
  • Standard cell height = Pitch × (N − 1), where N is the number of tracks.
  For N = 9 this gives 11 × 8 = 88 units.
  • In a layout, the cells are arranged one above the other in such a way that
  they can share a common VDD and VSS. Fig. 2 depicts two cells (can be any
  cells) abutted so that they share the same VDD.
Fig. 2 Calculation of Standard cell height

• Let us take the β ratio as 1.5. Hence Wp = 1.5 Wn. The variables used for calculating the
standard cell height are given below:
  • p = poly overhang, here 2 units.
  • x = minimum well-to-well spacing required between the two cells, here 12 units.
  • y = half of the minimum spacing of the corresponding layer, left at the cell edge to avoid
  a DRC violation between two different cells abutted on VDD and VSS. This
  comes to 1.5 units.
  • Wp = width of PMOS.
  • Wn = width of NMOS.
• Height of the standard cell: Wp + Wn + x + 2y + 2p = 88 units.
• With x = 12, y = 1.5 and p = 2 this gives Wp + Wn = 69 units, so Wn = 27.6 units and
Wp = 41.4 units. Similarly, we can calculate the Wn and Wp values for different libraries.
• If we compare 7T and 11T, 11T is faster and gives better performance because the taller
11T cell has more area, so higher-drive-strength transistors can be placed in it.
• Using an 11T library we can achieve higher utilization.
• 11T libraries are used for better performance.
• 7T libraries are used for higher density and low power.
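The 9-track calculation above can be reproduced with a few lines of arithmetic. This is a minimal sketch: the function and parameter names are chosen here for illustration, and all numeric values are the ones given in the text (arbitrary layout units).

```python
def track_pitch(metal_width, via_overhang, metal_spacing):
    """Pitch = 2 * (metal_width / 2 + via_overhang) + metal-to-metal spacing."""
    return 2 * (metal_width / 2 + via_overhang) + metal_spacing

def cell_height(pitch, tracks):
    """Standard cell height = pitch * (N - 1) for an N-track library."""
    return pitch * (tracks - 1)

def transistor_widths(height, beta, well_spacing, half_space, poly_overhang):
    """Solve Wp + Wn + x + 2y + 2p = height with Wp = beta * Wn."""
    available = height - well_spacing - 2 * half_space - 2 * poly_overhang
    wn = available / (1 + beta)
    return wn, beta * wn

pitch = track_pitch(metal_width=4, via_overhang=2, metal_spacing=3)
height = cell_height(pitch, tracks=9)
wn, wp = transistor_widths(height, beta=1.5, well_spacing=12,
                           half_space=1.5, poly_overhang=2)
print(f"pitch = {pitch}, height = {height}, Wn = {wn:.1f}, Wp = {wp:.1f}")
```

Changing `tracks` to 7 or 11 with the same rules shows how the library height changes the room available for Wn and Wp.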
Cells in generic library
1. Basic gates (AND, OR, NAND, NOR, INV, EXOR, EXNOR)
2. MUX
3. HA, FA
4. Special cells (Fillers, Tap cells, End Cap, De Caps)
5. Tie Cells
6. Metal Eco-able cells
7. AOI
8. OAI
9. Boolean function cells
10. Flops (Normal D flip flop, Scan-able flop with set / reset)
11. Clock gate
Power management cells
Isolation cell
• Used to isolate the output of an OFF domain.
• Allowing the floating output of an OFF domain (in the off state) to drive the ON domain
results in:
  • flow of crowbar current, which increases power consumption;
  • improper functioning of the ON domain, which may cause metastability.
• Also known as clamp cells, because they clamp intermediate voltage levels to either 0 or 1.
• Isolation cells are designed using either an OR gate (clamp 1) or an AND gate (clamp 0).
• In a microcontroller, when the processor goes into off mode, we use isolation cells
to isolate the processor core from the other modules.
• Isolation cells can be placed either in the OFF domain or in the ON domain.
• When there are multiple fanouts from the OFF domain, placing one isolation cell in the
OFF domain isolates multiple sinks. Its power must then be provided from the always-ON
supply or the sink domain's power supply, which is challenging.
• Isolation cells placed in the ON domain don’t require a secondary power supply.

Fig2: Isolation cell
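The clamp behaviour of the two isolation cell types can be sketched as simple logic functions. This is a behavioural illustration only: the enable polarities (active-low `iso_n` for the AND type, active-high `iso` for the OR type) and the use of the string "X" for a floating input are assumptions made for this example.

```python
FLOATING = "X"  # assumed marker for an undriven output of the OFF domain

def iso_clamp0(data, iso_n):
    """AND-type isolation cell: output is forced to 0 while iso_n == 0."""
    if iso_n == 0:
        return 0      # isolation active: known 0 regardless of data
    return data       # isolation inactive: data passes through

def iso_clamp1(data, iso):
    """OR-type isolation cell: output is forced to 1 while iso == 1."""
    if iso == 1:
        return 1      # isolation active: known 1 regardless of data
    return data       # isolation inactive: data passes through

# OFF domain powered down: its output floats, but the ON domain sees a clean value.
print(iso_clamp0(FLOATING, iso_n=0), iso_clamp1(FLOATING, iso=1))
# OFF domain powered up again: data passes through unchanged.
print(iso_clamp0(1, iso_n=1), iso_clamp1(0, iso=0))
```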


Level Shifter
• A level shifter cell is used to shift a signal's voltage from one voltage domain to another.
• These cells are required when the chip operates with multiple voltage domains.
• The difference in voltage range may cause unreliable functioning of the destination domain;
hence level shifter cells are inserted at voltage domain crossings.

Fig3: Level shifter


Power gate / switch
• Power gating is a technique used in IC designs to reduce power consumption by
shutting off the power to blocks of the circuit that are not in use.
• Power gates are used for power gating.
• Power gates are designed with the help of multi-threshold CMOS.
• When they are ON, their Vt is low, whereas when they are OFF, their Vt is high.
• The factors to be considered while designing a power switch network are:
  • Rush current: Rush current is the current drawn by a component during its initial
  power-up to charge its internal capacitors. When a power domain is powered up
  from shutdown, all the capacitors in the power domain start to charge at once. The
  current drawn is huge, resulting in a sudden rush of current. This rush current can
  damage the power switch network. For this reason we usually design the power
  switch network in a daisy-chain fashion.
  • Leakage current: The number of power switches used to implement the power
  switch network should be optimal, because more power switches mean more
  leakage current.
  • Ramp-up time: This is the time required to power up an off component. The power
  switch network should be designed so that the ramp-up time is short.
  This can be achieved by increasing the number of power switches.
Retention flop
• Retention flops are always-ON flops that are used to retain data when a power
domain goes into OFF mode.
• A secondary power supply is used to power these flops.
• A retention flop is a combination of a regular flop and a state-saving latch.
Special cells
Tap cells
• Tap cells are used to provide the substrate connection.
• They are used to avoid latch-up.
• They connect the n-well to VDD and the p-substrate to VSS.
• They are inserted in the layout at regular intervals based on tap rules (tap-to-gate distance)
defined in the technology DRC file.
Filler cells
• Filler cells are used to provide rail continuity, thereby reducing the DRC violations
created by the base layers.
• Filler cells are designed in such a way that they contain n-well and p-substrate.
Metal eco-able cells
• Filler cells that can be converted to attain a functionality are called metal eco-able
cells.
• The base layers of filler cells and metal eco-able cells are the same. Some extra metal
connections are added in metal eco-able cells to attain the functionality.
• These cells are larger than normal cells of the same functionality.
• For example, consider a design that has a hold violation after fabrication. One way to
overcome the violation is to delay the data path. In this case we can convert metal eco-able
cells into buffers to add the delay (generally done during a re-spin of the chip).
Antenna diode
• During fabrication, stray charges accumulate in the metal layers. The
gate oxide ruptures when the amount of charge exceeds a threshold. This
effect is called the antenna effect. The threshold is decided by the ratio of metal layer area
to gate area.
• To overcome the antenna effect we use antenna diodes.
• Diodes (reverse-biased in normal operation) are connected to the metal layers to remove the
excess charges.

Fig4: Antenna diode


• Another way to overcome the antenna effect is to add jumpers, i.e. to use higher metal layers
for the connection.

Fig5: Jumper
De cap cells (Decoupling capacitor cells)
• De cap cells are capacitors added to the design between the power and ground rails.
• When there is a drop on the power rail, these cells act like a battery and maintain the
voltage across the rails.
• These cells help with IR drop issues and remove glitches on the power supply.
• In a design, most of the power is consumed by the clock circuits. If all
the clock blocks are clustered in one area, they consume more power there, i.e. they
draw more current, which increases IR drop. In this case de cap cells can be used.
End cap cell
• End cap cells are added at the ends of rows to terminate the rows properly.
• The n-wells of end cap cells are properly terminated within the cell.
Tie cell
• Tie cells are used to avoid a direct gate connection to the power or ground network,
thereby protecting the cell from damage.
• In your design, some cell inputs may require a logic 0 or logic 1 value. Instead of
connecting these to the VDD/VSS rails/rings, you connect them to special cells
available in your library called TIE cells.
• In a tie-high cell, the NMOS is diode-connected and gives logic 0 to the gate of the PMOS, so
we get logic 1 at the output, whereas in a tie-low cell the PMOS is diode-connected and
gives logic 1 to the gate of the NMOS, so we get logic 0 at the output.

Fig6: Tie cell


Spare cell
• Spare cells are normal standard cells, but they act as redundant cells: they are evenly
distributed across the chip in anticipation of a future ECO, i.e. after tape-out.
• After tape-out, we may sometimes have to make changes to the design to
resolve a bug. In these cases we use the pre-existing spare cells in the design.
• If we carry out the design changes with minimal layer changes, it saves a lot of
fabrication cost, as each mask layer has a significant cost of its own.
• Spare cell inputs are connected to VDD/GND when they are placed in the design, and
their outputs are left floating.
• If they need to be used, their inputs are disconnected from VDD/GND and
connected to functional logic in ECO mode.

Fig7: Spare Cell


Characterization
• Characterization is the generation of .lib files, done with respect to PVT corners.
• Typically, characterization is done for six different loads and six different
transitions (slews).
• The models used to generate .lib files are NLDM and CCS. CCS is more accurate than
NLDM.
PVT, RC Variation & OCV
PVT:
PVT is the abbreviation for Process, Voltage and Temperature. To make our chip work
in all possible conditions (it should work on the Siachen Glacier at -40°C and also in the Sahara
Desert at 60°C), we simulate it at the different extremes of process, voltage and temperature
that the IC may face after fabrication. These conditions are called corners. All three parameters
affect the delay of a cell. We will look at each parameter and its effect on delay in
detail.
Process:
Process variation is the deviation in the attributes of transistors during fabrication.
When a die is manufactured, the area at the centre and the area at the boundary will have
different process variation. This happens because the layers being fabricated cannot be
uniform all over the die. As we move away from the centre of the die, layers can differ in their
sizes.
Process variation is gradual. It cannot be abrupt.
Process variation differs between technologies but is more dominant in
lower-node technologies (<65nm).
Below are a few important factors that can cause process variation:
1. Wavelength of the UV light
2. Manufacturing defects
The effects of process variation are listed below:
1. Oxide thickness variation
2. Dopant and mobility fluctuation
3. Transistor width, length etc.
4. RC Variation
These variations cause parameters like the threshold voltage to deviate from their expected
values. Threshold voltage depends on oxide thickness, source-to-body voltage and implant
impurities. Consider the drain current equation for NMOS:

$I_D = \frac{1}{2}\,\mu_n C_{ox}\,\frac{W}{L}\,(V_{GS} - V_{Th})^2$

Process variation deals with the physical properties of the MOSFET. The current flowing
through the channel depends directly on the mobility (μn), the oxide capacitance
Cox (and hence the oxide thickness tox) and the ratio of width to length.
If any of these parameters changes, the current changes. In other words, it
affects the delay of the circuit. Delay decreases as the current increases.
The relation between process and delay can be better understood with the curve
shown in Figure 1.
From this relation, we say that delay is higher for slow-process MOSFETs and lower for
fast-process MOSFETs.
There are separate model files for every process corner.

Figure 1: Process Vs Delay Graph


Voltage:
Nowadays, the supply voltage of a chip is very low. Let’s say a chip operates at 1V. There is
a chance that at a certain instant this voltage varies; it can go to 1.1V or 0.9V. To
take care of this scenario, we consider voltage variation.
There are multiple reasons for voltage variation. These are discussed below.
An important reason for supply voltage fluctuation is IR drop. IR drop is caused by the current
flowing through the parasitic resistance of the power grid. IR drop reduces the supply voltage
below the required value.
The second important reason for voltage variation is supply noise caused by parasitic
inductance in combination with resistance and capacitance. The current through the parasitic
inductance causes voltage bounce. Together, these effects can lead not only to voltage
drops but also to voltage overshoot.
The supply voltage of a chip is provided externally. It can come from a DC source or a
voltage regulator. A voltage regulator will not deliver exactly the same voltage over time. It
can go above or below the expected voltage, which changes the current and makes the
circuit slower or faster than before.
Because of all these factors, we have to consider voltage variation. Figure 2 shows the
relation between supply voltage and delay.

Figure 2: Voltage Vs Delay Graph


Temperature:
The temperature variation is with respect to the junction temperature, not the ambient
temperature. The temperature at the junction inside the chip can vary over a wide range, which
is why temperature variation needs to be considered. Figure 3 shows the variation of delay with
temperature. The delay of a cell increases with increasing temperature. But this is not true for
all technology nodes. For deep sub-micron technologies the behaviour is the opposite. This
phenomenon is called temperature inversion.

Figure 3: Temperature Vs Delay Graph


Temperature inversion: The delay depends on the output capacitance and the drain current ID
(directly proportional to Cout and inversely proportional to ID). When the temperature
increases, the delay also increases (due to the variation in carrier concentration and
mobility). But when the temperature decreases, the delay shows different behaviour for submicron
technologies. For technology nodes below 65nm, the delay increases as temperature decreases,
and it is maximum at -40°C. This phenomenon is known as “temperature
inversion”.
Why does temperature inversion happen?
As temperature increases, both mobility and threshold voltage decrease. The delay is
inversely proportional to the mobility and increases with the threshold voltage.
So the net effect of mobility and threshold voltage decides the value of the delay.
Consider the current equation of a MOSFET for better understanding:

$I_D = \frac{1}{2}\,\mu_n C_{ox}\,\frac{W}{L}\,(V_{GS} - V_{Th})^2$

At higher technology nodes, where the supply voltage is high, the effect of VTh is
small because (VGS – VTh) is large. Hence mobility plays the major role in deciding the current.
So at higher technology nodes, when the temperature increases, mobility decreases and as a
result the delay increases.
At lower technology nodes (specifically, below 65nm), the supply voltage is very low, so
(VGS – VTh) is small. At low temperature VTh rises, which shrinks (VGS – VTh) further; its
square becomes very small, reducing ID and increasing the delay at low temperature. At the other
end, above 65nm, the delay decreases at lower temperature.
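The competition between mobility and threshold voltage can be illustrated with a toy model built on the square-law equation. Everything numeric here is an assumption chosen only to show the trend: mobility is taken to scale as (T/T0)^-1.5, VTh to drop by 2 mV per kelvin, and delay to be proportional to VDD / ID. It is a sketch, not a characterized device model.

```python
def relative_delay(temp_k, vdd, vth0=0.45, t0=300.0, vth_tc=-2e-3, mu_exp=-1.5):
    """Relative gate delay ~ VDD / ID, with ID ~ mu(T) * (VDD - VTh(T))^2.
    vth0, vth_tc and mu_exp are assumed, illustrative parameters."""
    mu_rel = (temp_k / t0) ** mu_exp           # mobility falls as temperature rises
    vth = vth0 + vth_tc * (temp_k - t0)        # threshold falls as temperature rises
    i_rel = mu_rel * (vdd - vth) ** 2
    return vdd / i_rel

COLD, HOT = 233.15, 398.15  # -40 C and 125 C in kelvin

for label, vdd, vth0 in (("high-VDD node", 3.3, 0.7), ("low-VDD node", 0.8, 0.45)):
    d_cold = relative_delay(COLD, vdd, vth0=vth0)
    d_hot = relative_delay(HOT, vdd, vth0=vth0)
    slower = "hot" if d_hot > d_cold else "cold"
    print(f"{label}: slowest at the {slower} corner")
```

With the large overdrive of the high-VDD case the mobility term dominates and the hot corner is slowest; with the small overdrive of the low-VDD case the threshold term dominates and the cold corner becomes slowest, which is the inversion described above.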
RC Variation:
RC variation is also considered when defining corners for the setup and hold checks. It arises
from the fabrication process, because the width of a metal layer can deviate from the desired
value. To read more on RC variation, check our blog on CMOS basics and process overview.
Critical corners for Setup and Hold check
We always check that our chip works in the worst scenarios. We should be very pessimistic about
setup and hold checks, so we consider worst-case scenarios.
A setup violation can occur if data arrives too late. So the condition where the process is
slow, the voltage is minimum and the temperature is maximum is the worst case for the setup
check. However, because of temperature inversion at lower technology nodes, delay increases as
temperature decreases, so the lowest temperature can result in more delay. The delay
at the lowest temperature is therefore not always less than the delay at the highest temperature.
A hold violation occurs if data arrives too early. So the process should be fast, the voltage
maximum and the temperature minimum.
If setup and hold are checked in the worst corners, the chip should work in every
scenario. We still check the typical corners because we need to analyse power
consumption. Refer to the following table for the worst-case scenarios for setup and hold.
Table: Worst Scenarios for setup and hold
On Chip Variations (OCV):
Variations are of two types:
1. Global variations:
These are PVT variations that depend on external factors like process, supply voltage and
temperature. ICs are fabricated in batches and hence exhibit die-to-die variations. Some dies
exhibit a strong process (fast switching) and others a weak process (slow switching). These are
known as inter-chip variations.
2. Local variations:
Local variations are also variations in PVT, but these are intra-chip variations known as OCV.
Process:
All the transistors in a chip cannot be expected to have the same process. There can be
variations in channel length, oxide thickness, doping concentration, metal thickness etc due to
imperfections in manufacturing process like mask print, etching etc.
Voltage:
The supply voltage reaching the power pins will not be the same for all standard cells. The
power network has a finite resistance. Consider two cells, one placed close to the power source
and the other placed far from it. As the interconnect length to the farther cell is greater, it
has more resistance and a higher IR drop, which reduces the supply voltage reaching the cell. As
the voltage is lower, this cell has more delay than the cell placed closer.
Temperature:
The transistor density within a chip is not uniform. Some regions of the chip have higher
density and higher switching activity, resulting in higher power dissipation. Hence the junction
temperature in these regions is higher, forming localized hot spots. This variation in
temperature across the chip can result in different delays.
How do we account for these variations? Derates.
As a result of OCV, some cells may be faster or slower than expected. If these variations are
not accounted for, the results may be optimistic and real silicon can show setup or hold
violations. To model them, we introduce derates. Timing derates are multiplied with the net
delays and cell delays of the launch and capture clock paths. A derate is given as, say, x%. Let
us consider a timing derate of 8% and see how it is accounted for in setup and hold analysis.
Setup analysis:
The setup check is done in the worst case. The setup check is most pessimistic when the launch
clock arrives late and the capture clock arrives early. Here we multiply the launch path delays
by a late derate of 1.08 and the capture path delays by an early derate of 0.92.
Hold analysis:
The hold check is done in the best case. The hold check is most pessimistic when the launch
clock arrives early and the capture clock arrives late. Here we multiply the launch path delays
by an early derate of 0.92 and the capture path delays by a late derate of 1.08.
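The way the 8% derate enters the two checks can be sketched as follows. This is a simplified single-path model: the function names and the path delays used in the example (clock period, clock and data delays, setup and hold times, all in ns) are hypothetical values chosen for illustration, and the data path is derated together with the launch clock path.

```python
def setup_slack(period, launch_clk, data, capture_clk, t_setup,
                late=1.08, early=0.92):
    """Setup: launch side made late, capture side made early."""
    arrival = (launch_clk + data) * late
    required = period + capture_clk * early - t_setup
    return required - arrival

def hold_slack(launch_clk, data, capture_clk, t_hold,
               late=1.08, early=0.92):
    """Hold: launch side made early, capture side made late."""
    arrival = (launch_clk + data) * early
    required = capture_clk * late + t_hold
    return arrival - required

# Hypothetical path: 2.0 ns period, 0.5 ns clock paths, 1.0 ns data path.
print(f"setup slack = {setup_slack(2.0, 0.5, 1.0, 0.5, 0.1):.2f} ns")
print(f"hold slack  = {hold_slack(0.5, 1.0, 0.5, 0.05):.2f} ns")
```

Setting `late=1.0` and `early=1.0` gives the underated slacks, which makes the margin consumed by the derates easy to see.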

Figure 4: Timing analysis with CRPR


Consider the above path from FF1 to FF2. For setup analysis, as per the previous example, a
late derate of 1.08 is applied to the launch path and an early derate of 0.92 is applied to the
capture path. But what is the problem here?
The launch clock path and the capture clock path share a portion of the clock tree (B1, B2) and
then diverge from the common point. The delays of this common path are multiplied by different
derates (early and late), resulting in different delays: these cells have max delay in the launch
path and min delay in the capture path. The same cell cannot have different delays at the same
time. This results in additional pessimism, which has to be removed. Hence the need for Clock
Re-convergence Pessimism Removal (CRPR), also called Common Path Pessimism Removal (CPPR).
The pessimism value is the difference between the max and min delay of the common clock
path. To reduce pessimism, the CRPR value is added to the required time in setup analysis and
subtracted from the required time in hold analysis.
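The pessimism on the common clock segment can be computed directly from the derates. In this minimal sketch the function names and the 0.3 ns common-path delay (standing in for B1 + B2) are hypothetical; the derates are the 1.08 / 0.92 values from the text.

```python
def crpr_credit(common_path_delay, late=1.08, early=0.92):
    """Pessimism = (max - min) delay of the shared clock segment."""
    return common_path_delay * late - common_path_delay * early

def setup_required_with_crpr(required_time, credit):
    """Setup analysis: the credit is added to the required time."""
    return required_time + credit

def hold_required_with_crpr(required_time, credit):
    """Hold analysis: the credit is subtracted from the required time."""
    return required_time - credit

credit = crpr_credit(0.3)  # hypothetical B1 + B2 delay of 0.3 ns
print(f"CRPR credit = {credit:.3f} ns")
print(f"setup required time: 2.360 -> {setup_required_with_crpr(2.36, credit):.3f} ns")
print(f"hold required time:  0.590 -> {hold_required_with_crpr(0.59, credit):.3f} ns")
```

In both checks the credit moves the required time in the direction that increases slack, since it only removes delay difference that cannot physically exist on a shared cell.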
LEF, DEF & LIB
Library Exchange Format (LEF)
The LEF file is the abstract view of cells. It only gives information about the PR boundary,
pin positions and metal layers of a cell. To get the complete information about the cell, the
DEF (Design Exchange Format) file is required. A LEF file defines three sections: technology,
site and macros. In the technology part, layers, design rules, via definitions and metal
capacitance are defined. In the site part, the site extension is defined, and in the macros part,
the cell description, dimensions, layout of pins and blockages, and capacitance are defined.
For every technology the layer and via statements are different. For the layers and vias,
the type of the layer (routing, masterslice or overlap), width/pitch and
spacing, direction, resistance, capacitance and antenna factor are defined.
Unit Definition
UNITS
DATABASE MICRONS 1000 ;
END UNITS
Values defined in the file are multiplied by the UNITS value. For example, if a spacing is
defined as 0.6, the actual value is 600 database units (0.6 * 1000).
Manufacturing grid
MANUFACTURINGGRID 0.1 ;
This defines the grid for geometry alignment. Once it is specified, cells are placed at
locations that are aligned to the manufacturing grid.
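The unit conversion and the grid alignment check can be sketched together. This is an illustration, not a LEF parser: the helper names are made up for this example, and the defaults mirror the values shown above (1000 database units per micron, 0.1 micron manufacturing grid).

```python
def to_dbu(microns, dbu_per_micron=1000):
    """Convert a LEF value in microns to integer database units."""
    return int(round(microns * dbu_per_micron))

def on_grid(microns, grid_microns=0.1, dbu_per_micron=1000):
    """True if a coordinate lies on the manufacturing grid.
    The comparison is done in integer database units to avoid float error."""
    grid_dbu = to_dbu(grid_microns, dbu_per_micron)
    return to_dbu(microns, dbu_per_micron) % grid_dbu == 0

print(to_dbu(0.6))    # spacing of 0.6 um expressed in database units
print(on_grid(0.6))   # 0.6 um is a multiple of the 0.1 um grid
print(on_grid(0.65))  # 0.65 um falls between grid points
```

Working in integer database units is a deliberate choice here: comparing floating-point micron values against the grid directly can give false off-grid results.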
Implant Layer definition
This syntax defines the Implant layer in the design. For each layer, name, space and width are
defined. Space and width are the factors that affect the legal cell placement.
Masterslice or Overlap Layer definition
LAYER layerName
  TYPE {MASTERSLICE | OVERLAP} ;
END layerName
This defines the masterslice (non-routing) or overlap layers in the design. Masterslice layers
are basically polysilicon layers. These layers are used whenever macro pins are present on
polysilicon.
VIA
For signal routers, the VIA statement defines vias. By default, a via uses three layers:
1. Cut layer
2. Routing layer
3. Masterslice layer
The routing and masterslice layers touch the cut layer.
Via Rule Generator
To generate via arrays, the via rule generator defines the formulas. The VIARULE
GENERATE statement can be used to cover special wiring that is not explicitly defined
in a VIARULE statement.
Same-Net Spacing
This rule determines the minimum spacing between geometries on the same net. It is defined only
if the same-net spacing is less than the different-net spacing.
SITE
SITE specifies the placement site for a region of the block, such as PAD or CORE. The symmetry
of the site is also defined under this syntax with respect to X, Y or R90 (rotation by 90°).
Macro (Attributes of Macros are defined)
This syntax defines the details of macros such as name, PAD detail, class, size, location of
endcap cells (top right, bottom, etc.), symmetry, site name and obstruction detail.
Macro Pin Statement
Defines the pins of the macros. Every macro needs PIN statements for all of its I/O pins and
for VDD and VSS.
The following pins are required:
• Power and ground pins
• Input, output and inout pins, and netlist pins
• Must-join pins
Must Join pin
This specifies the names of the pins that must be connected together.
Macro Obstruction statement
The OBS statement defines a group of obstructions on a macro. Normally an obstruction blocks
routing, but an obstruction that lies on a pin still allows routing to that pin.
DEF (Design Exchange Format)
The DEF file basically contains the placement information of macros, standard cells, I/O pins
and other physical entities. It passes the logical design data to the place and route tool and
carries the physical design data back from the place and route tool. The logical design data
contains the internal connectivity, grouping information and physical constraints; the physical
design data contains routing geometry, placement locations and orientations. DEF is used as an
input at various stages:
• Floorplan DEF is given at the import design stage to provide information about macro
locations, IO ports and block shape.
• SCANDEF is given at the import design stage for scan chain reordering. It contains the
connectivity information of the scan flops and is also an input to the scan tracing stage.
• The DEF generated by PnR is used in StarRC extraction.
In detail it contains:
• Die Area
• Tracks
• Components (macros)
• I/O Pins
• Nets
• Blockages
• Halo
• Scan Chain
• Vias
• Slots
• Fills
• Region
• Row
• Metal layers
Liberty Timing File (LIB)
.lib is basically a timing model that contains cell delays, transition times, and setup and hold
time requirements. Two modelling techniques, CCS and NLDM, are used to generate .lib files. CCS
(composite current source) models the driver as a current source and uses 20 variables to
account for input slew and output load data, whereas NLDM (non-linear delay model) models the
driver as a voltage source and uses only 2 variables, which are not sufficient to model the
nonlinearity of a circuit. So CCS is more accurate than NLDM. Because of the difference in the
number of variables, a CCS file is about 10 times larger than the NLDM file, and the run time
with CCS is also longer than with NLDM.
The design needs to be verified at several PVT (process, voltage and temperature) corners, and
the timing of the cells is different at every corner. Hence there is a separate .lib file for
every PVT corner.
The following unit attributes are present in a .lib file:
• Time unit
• Voltage unit
• Current unit
• Leakage power unit
• Capacitive load unit
• Slew rate: lower and upper threshold values are defined as percentages for both rise and
fall time
• Input threshold for rise and fall
• Output threshold for rise and fall
Lookup table templates with different matrix sizes are defined for parameters such as delay,
hold, passive energy, recovery, removal and setup.
For each cell (AND, NAND, OR, etc.) the following attributes are defined:
• Area of the cell
• Leakage power
• Capacitance
• Rise and fall capacitance
• Properties of each pin (input and output), such as capacitance and direction. Further values
are characterised in matrix form, as shown in the example below.
fall_transition(delay_template_5x5) {
  index_1 ("0.015, 0.04, 0.08, 0.2, 0.4");
  index_2 ("0.06, 0.18, 0.42, 0.6, 1.2");
  values ( \
    "0.0606, 0.0624, 0.0744, 0.0768, 0.09", \
    "0.1146, 0.1152, 0.1164, 0.1212, 0.1314", \
    "0.201, 0.2004, 0.2052, 0.2058, 0.2148", \
    "0.48, 0.4806, 0.4812, 0.4824, 0.4866", \
    "0.9504, 0.9504, 0.9504, 0.951, 0.9534");
}
The output fall transition is characterized as a function of output capacitance and input
transition. Index_1 represents output capacitance and index_2 represents input transition. In
the above example, 5 values are specified for each index. If a required value is not in the
list, the fall transition is calculated by interpolation or extrapolation: interpolation when
the value lies between two given values of an index, extrapolation when it lies outside the
range.
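The lookup can be sketched in Python as a bilinear interpolation over the fall_transition table above. This is only an illustration under the assumption that a simple linear scheme is used inside each table segment and that the nearest edge segment is extended linearly for extrapolation; real timing tools may use different schemes. The helper names `_segment` and `lookup` are hypothetical.

```python
from bisect import bisect_right

INDEX_1 = [0.015, 0.04, 0.08, 0.2, 0.4]   # output capacitance (per the text)
INDEX_2 = [0.06, 0.18, 0.42, 0.6, 1.2]    # input transition (per the text)
VALUES = [
    [0.0606, 0.0624, 0.0744, 0.0768, 0.09],
    [0.1146, 0.1152, 0.1164, 0.1212, 0.1314],
    [0.201, 0.2004, 0.2052, 0.2058, 0.2148],
    [0.48, 0.4806, 0.4812, 0.4824, 0.4866],
    [0.9504, 0.9504, 0.9504, 0.951, 0.9534],
]


def _segment(axis, x):
    """Index i of the segment [axis[i], axis[i+1]] used for x (edge segment if outside)."""
    i = bisect_right(axis, x) - 1
    return max(0, min(i, len(axis) - 2))


def lookup(x1, x2):
    """Bilinear interpolation; values outside the table extrapolate the edge segment."""
    i = _segment(INDEX_1, x1)
    j = _segment(INDEX_2, x2)
    t1 = (x1 - INDEX_1[i]) / (INDEX_1[i + 1] - INDEX_1[i])
    t2 = (x2 - INDEX_2[j]) / (INDEX_2[j + 1] - INDEX_2[j])
    low = VALUES[i][j] + t2 * (VALUES[i][j + 1] - VALUES[i][j])
    high = VALUES[i + 1][j] + t2 * (VALUES[i + 1][j + 1] - VALUES[i + 1][j])
    return low + t1 * (high - low)


print(lookup(0.04, 0.18))     # exact table entry
print(lookup(0.0275, 0.12))   # interpolated between four entries
print(lookup(0.5, 0.06))      # extrapolated beyond the last index_1 value
```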
Like “fall transition”, the following parameters are also characterized:
• Rise transition
• Internal power
• Fall power
• Rise power
• Cell fall
• Cell rise
Below is another example, a characterization table of a D flip-flop, which defines the hold
falling and setup falling constraints in addition to the above attributes. Index_1 corresponds
to the related pin transition and index_2 corresponds to the constrained pin transition.
timing_type : hold_falling;
rise_constraint(hold_template_3x5) {
  index_1 ("0.06, 0.3, 0.6");
  index_2 ("0.06, 0.18, 0.42, 0.6, 1.2");
  values ( \
    "-0.09375, -0.0875, -0.075, -0.1125, -0.175", \
    "-0.2, -0.19375, -0.18125, -0.21875, -0.1875", \
    "-0.16875, -0.25625, -0.24375, -0.28125, -0.25");
}
timing_type : setup_falling;
rise_constraint(setup_template_3x5) {
  index_1 ("0.06, 0.3, 0.6");
  index_2 ("0.06, 0.18, 0.42, 0.6, 1.2");
  values ( \
    "0.28125, 0.275, 0.2625, 0.3, 0.3625", \
    "0.29375, 0.2875, 0.36875, 0.3125, 0.375", \
    "0.35625, 0.35, 0.3375, 0.375, 0.4375");
}
UPF
Power is one of the major concerns at lower technology nodes because systems operate at higher
frequencies, with complex functionality, wireless applications and portability. Power
dissipation has become a critical issue because it heats up the device, which in turn affects
the operation of the chip. External heat sinks and software-based methods are provided with the
system, but there is also scope to save power during operation of the chip. Saving power is
eco-friendly and improves the lifetime of the system.
Before going to power saving techniques, let's look at the reasons for power dissipation in a
MOSFET based design. Power dissipation is classified into two categories:
• Static power dissipation
• Dynamic power dissipation
1. Static power dissipation:
In this class, power is dissipated irrespective of the frequency and switching of the system.
It is continuous and has become more dominant at lower technology nodes. The structure and size
of the device result in various leakage currents. A few reasons for static power dissipation
are:
a. Sub-threshold current
b. Gate oxide leakage
c. Diode reverse bias current
d. Gate induced leakage
It is hard to find the exact amount of leakage current, but it mainly depends on the supply
voltage (VDD), threshold voltage (Vth), transistor size (W/L) and the doping concentration.
2. Dynamic power dissipation:
There are two causes of dynamic power dissipation: switching of the device and the short-circuit
path from supply (VDD) to ground (VSS). Both occur during operation of the device.
a) Short-circuit power dissipation:
Because of a slow input transition, there is a certain duration “t” for which both devices
(PMOS and NMOS) are turned ON (input between Vtn and VDD-Vtp). During this time there is a
short-circuit path from VDD to VSS. The energy dissipated in one such transition is:
Eshort-circuit = VDD · Isc · t
and the short-circuit power is this energy multiplied by the number of transitions per second,
where VDD – Supply voltage, Isc – Short-circuit current,
t – Short-circuit time
b) Switching power dissipation:
This is the power dissipated while charging and discharging the total load [output capacitance
+ net capacitance + input capacitance of driven cell(s)]. The switching power is given by:
Pswitch = α · VDD² · Cload · f
where α – Switching activity factor, f – Operating frequency,
VDD – Supply voltage & Cload – Load capacitance
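As a quick numerical illustration of the switching power formula, the Python sketch below evaluates it for two supply voltages. The activity factor, load and frequency are made-up example numbers, and the function name `switching_power` is hypothetical.

```python
def switching_power(alpha: float, c_load: float, vdd: float, freq: float) -> float:
    """Switching power in watts: alpha * Cload * VDD^2 * f."""
    return alpha * c_load * vdd ** 2 * freq


# Assumed example values: 20% activity, 10 fF load, 1 GHz clock.
p_high = switching_power(alpha=0.2, c_load=10e-15, vdd=1.2, freq=1e9)
p_low = switching_power(alpha=0.2, c_load=10e-15, vdd=0.8, freq=1e9)

print(p_high, p_low)
print(p_low / p_high)   # power scales with the square of the supply voltage
```

Because VDD enters quadratically, lowering the supply is the most effective of the four knobs, which is why multi-voltage design and DVFS appear in the list of reduction methods.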
Common power reduction methods are:
• Reduce VDD, Cload, f, α
• Multi-voltage design.
• Multi-Vth cells (LVT, RVT, HVT cells etc).
• Cells with different drive strengths.
• Dynamic Voltage & Frequency Scaling (DVFS).
• Clock gating (switching power reduction).
• Multi-track cells can be used in a design.
• Multi-bit flip-flops can be used.
Power management techniques start at the design specification stage and are employed at every
step of the physical design flow. The chart below gives an overview of power consumption at
each stage.
A design has sub-systems with various functionalities. While the system is operating, the
sub-functional blocks that do not need to function during a particular period of time can be
turned OFF. Similarly, blocks that do not require high speed of operation can be slowed down by
reducing the supply voltage. Sometimes the performance requirement of a sub-system varies over
time, in which case its voltage and frequency can be scaled (DVFS). All these power reduction
methods add complexity to the design.
UPF (Unified Power Format) provides a universal low power design specification, usually written
in Tcl. Low power techniques primarily focused on dynamic power consumption (which is dominant
at 90nm). This is where multi-voltage designs are required (which need level shifters between
different voltage domains).
As technology shrinks below 90nm, static power consumption has also become prominent. This is
where power gating is required (which needs isolation cells to isolate a switched domain from
an always-on domain).
To control all of these, a power management unit is used, which drives the control signals of
the low power cells as required.
The logical intent of the design is completely described by the RTL code, but it is complicated
to describe power information in RTL. Hence the power intent of the design is specified in UPF.
The power management file is built at the architecture level of the design stage. Together with
the RTL, it forms a complete description of the design. Various methods used for power
management are:
• Clock gating method (ICG) [logic intent of the design]
• Multiple height cells
• Multi-voltage design (MVD)
• Power shut-off (PSO) or power gating
• Multi-Vth design (MV)
• Dynamic voltage and frequency scaling (DVFS)
1. Clock gating method:
This is logical intent of the design and is provided in the RTL code. Suppose there is a group
of flops that meets “min_bit_width” and shares the same load enable. While the enable is
inactive, the data in these flops is constant, so clock switching can be disabled during that
time, saving dynamic power to a great extent. The clock is made available only when the data
changes. Clock gating is implemented using an ICG cell. Read more on clock gating in our
synthesis blog.
2. Multi-Voltage design:
As per the equation P = α · C · V² · f, power reduces to a great extent as the supply voltage
is scaled down. Hence sub-systems that do not require a high speed of operation can be operated
at lower voltages, saving dynamic power. The design can have multiple voltages as per the
performance requirement.
Sub-systems that operate at different voltages have separate power domains, each with separate
supply ports and nets. This technique requires a level shifter when a signal passes from one
domain to another, based on requirement. There are two types of level shifters:
• Low to high
• High to low
Whenever a signal from a low voltage domain drives an input in a high voltage domain, the full
output swing is not available at the output of the high domain cell. This is because the
low-domain signal changes the region of operation of the devices in the high domain. So a
low-to-high level shifter is used.
Whenever a signal passes from a high to a low domain and the destination cell cannot withstand
the high voltage, an H-L level shifter is inserted in that path. The level shifter can be placed
in the source or destination power domain or in the default domain, and it needs both voltages
(source domain voltage and destination domain voltage) for its operation.
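The rule described above can be written down as a small decision function. This Python sketch is only an illustration of the text; the function name, the `dst_tolerates_high` flag and the returned labels are hypothetical.

```python
from typing import Optional


def level_shifter_type(src_v: float, dst_v: float,
                       dst_tolerates_high: bool = False) -> Optional[str]:
    """Return the kind of level shifter needed on a crossing, or None if none is needed."""
    if src_v < dst_v:
        return "LS_LH"   # low to high: always needed to restore full swing
    if src_v > dst_v and not dst_tolerates_high:
        return "LS_HL"   # high to low: needed only if the receiver cannot take the high voltage
    return None


print(level_shifter_type(0.8, 1.2))
print(level_shifter_type(1.2, 0.8))
print(level_shifter_type(1.2, 0.8, dst_tolerates_high=True))
```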
3. Power Gating :
Whenever the operation of a sub-block is not required, there is scope to shut down its voltage
domain. This technique uses power switches to disable the power. The power switches are MTCMOS
cells: the gated logic uses low-Vth (LVT) devices for speed during normal operation, while the
switch itself is a high-Vth (HVT) device that keeps leakage power low in off mode. Power
switches are controlled by the power management unit.
If the load is large, a huge in-rush current flows at power-up to charge the internal
capacitances. To reduce this, the power switches are enabled in a daisy chain fashion.
Isolation Cell :
When a source domain (PD1) is in off mode, its output pins have to be isolated from the
destination domain (PD2) to prevent invalid logic values from propagating to PD2. Along with
isolation, this saves short-circuit power dissipation in the receiver cell.
There are 2 types of isolation cell as per logic requirement:
• “Clamp to 0” cell (AND gate)
• “Clamp to 1” cell (OR gate)
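The behaviour of the two isolation cell types can be sketched in Python. The sketch assumes an active-high isolation enable and 0/1 integer signal values; the function names are hypothetical.

```python
def clamp_to_0(data: int, iso_en: int) -> int:
    """AND-type isolation: output is forced to 0 while iso_en is 1."""
    return data & (1 - iso_en)


def clamp_to_1(data: int, iso_en: int) -> int:
    """OR-type isolation: output is forced to 1 while iso_en is 1."""
    return data | iso_en


for data in (0, 1):
    for iso_en in (0, 1):
        print(data, iso_en, clamp_to_0(data, iso_en), clamp_to_1(data, iso_en))
```

With isolation disabled both cells are transparent; with isolation enabled the output no longer depends on the (possibly floating) value coming from the powered-down domain.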
Retention Flop :
Whenever a gated domain is turned off, the state of its flops needs to be retained with low
leakage power. When the gated domain is powered back on, the stored data can be used rather
than initializing again.
This is achieved by using data retention flops. A retention flop contains a DFF and a latch. It
requires a low-power always-on supply to retain the data.
This feature comes at a cost: the area of the cell is larger than that of a normal flop, and an
additional low-voltage always-on power supply has to be provided.
Always ON cell :
Always-on cells are special cells that stay powered irrespective of their placement in a
switching domain. They are used to drive nets that pass through from an always-on domain.
Generally always-on buffers and inverters are used. The always-on cells need to be defined in
the UPF file.
4. Multi-Vth design (MV)
In a design, standard cells are provided in different flavours based on the threshold voltage.
Variation in threshold voltage affects power consumption and timing, hence these flavours are
used to optimize power and timing. These cells are usually named as:
• HVT cells
• RVT cells
• LVT cells
HVT cells have the lowest leakage and the highest delay, LVT cells have the highest leakage and
the lowest delay, and RVT cells lie in between. The area of all the flavours of a cell is always
the same. Only the threshold voltage varies, and hence power and delay.
The design is synthesized with RVT and HVT cells, and LVT cells are used during optimization
to fix critical timing paths.
5. Dynamic voltage and frequency scaling (DVFS)
This method is used to vary the voltage and frequency based on requirement. The voltage
and/or frequency of the design can be scaled as per the performance requirement.
An advanced method, AVFS, has been introduced in which feedback is provided to the controller
to decide the voltage and/or frequency, but it is very complex.
Example: Consider the following design.
This design consists of a default domain with three different voltage domains: APD1P2V, SPD1P0V
and APD0P8V.
• APD1P2V – Always on power domain with 1.2V supply
• SPD1P0V – Switching power domain with 1.0V supply
• APD0P8V – Always on power domain with 0.8V supply
• LS_LH – Level shifter low to high
• LS_HL – Level shifter high to low
• ISO – Isolation cell
• RTF – Retention flop
• PMU – Power management unit
• AON_BUF – Always on buffer
Various commands are provided to specify the UPF completely, and most can be understood from
the command name itself. A few of them are explained here in order to write the UPF for the
above example.
upf_version : UPF has been revised in stages, so it exists in different versions. It is
therefore necessary to state the UPF version in use so that the UPF commands are interpreted
correctly.
upf_version [string]
The version can be 1.0, 2.0 etc.
Power Domain (PD) : A set of modules using the same voltage belongs to one power domain. The
command “create_power_domain” is used to define a power domain and its characteristics.
UPF for the above Power Intent:
#---------- Create Power Domains ----------#
create_power_domain TOP -include_scope
create_power_domain APD1P2V -elements { TOP/mod1 }
create_power_domain SPD1P0V -elements { TOP/mod2 }
create_power_domain APD0P8V -elements { TOP/mod3 }
#---------- Supply Ports & Net Connections ----------#
create_supply_port VDD1P2
create_supply_net VDD1P2 -domain TOP
create_supply_net VDD1P2 -domain APD1P2V -reuse
connect_supply_net VDD1P2 -ports VDD1P2
create_supply_port VDD1P0
create_supply_net VDD1P0 -domain TOP
create_supply_net VDD1P0 -domain SPD1P0V -reuse
create_supply_net VDD1P0_SW -domain SPD1P0V #switching net
connect_supply_net VDD1P0 -ports VDD1P0
create_supply_port VDD0P8
create_supply_net VDD0P8 -domain TOP
create_supply_net VDD0P8 -domain APD0P8V -reuse
connect_supply_net VDD0P8 -ports VDD0P8
create_supply_port VSS
create_supply_net VSS -domain TOP
create_supply_net VSS -domain APD1P2V -reuse
create_supply_net VSS -domain SPD1P0V -reuse
create_supply_net VSS -domain APD0P8V -reuse
connect_supply_net VSS -ports VSS
#---------- Establish Connection ----------#
set_domain_supply_net TOP -primary_power_net VDD1P0 -primary_ground_net VSS
set_domain_supply_net APD1P2V -primary_power_net VDD1P2 -primary_ground_net VSS
# The switched domain is powered from the switched net (output of the power switch)
set_domain_supply_net SPD1P0V -primary_power_net VDD1P0_SW -primary_ground_net VSS
set_domain_supply_net APD0P8V -primary_power_net VDD0P8 -primary_ground_net VSS
#---------- Shut-down Logic for Receiver ----------#
# Each port option takes {port_name net_name}
create_power_switch POWER_SWITCH -domain SPD1P0V \
  -input_supply_port {VDD1P0 VDD1P0} \
  -output_supply_port {VDD1P0_SW VDD1P0_SW} \
  -control_port {ps_en PMU/ps_en} \
  -on_state {ON VDD1P0 {!ps_en}}
#---------- Isolation Cell Setting ----------#
set_isolation iso_out -domain SPD1P0V \
  -applies_to outputs \
  -isolation_power_net VDD1P0 -isolation_ground_net VSS \
  -clamp_value 1 \
  -isolation_signal PMU/iso_en \
  -location default
#---------- Retention Logic for SPD ----------#
set_retention RTF -domain SPD1P0V \
  -retention_power_net VDD1P0 \
  -retention_ground_net VSS \
  -save_signal {PMU/rtf_en high} \
  -restore_signal {PMU/rtf_en low}
#---------- Level Shifter for multi-VDD Domain ----------#
set_level_shifter LS_0P8_1P0 -domain SPD1P0V \
-applies_to inputs \
-location self \
-source APD0P8V.primary \
-input_supply_set APD0P8V.primary -output_supply_set SPD1P0V.primary
set_level_shifter LS_1P0_1P2 -domain APD1P2V \
-applies_to inputs \
-location self \
-source SPD1P0V.primary \
-input_supply_set SPD1P0V.primary -output_supply_set APD1P2V.primary
set_level_shifter LS_1P2_0P8 -domain APD0P8V \
-applies_to inputs \
-location self \
-source APD1P2V.primary \
-input_supply_set APD1P2V.primary -output_supply_set APD0P8V.primary
set_level_shifter LS_0P8_1P2 -domain APD1P2V \
-applies_to inputs \
-location self \
-source APD0P8V.primary \
-input_supply_set APD0P8V.primary -output_supply_set APD1P2V.primary
#---------- Define Always ON Cell ----------#
define_always_on_cell -cells AON_BUF \
  -power_switchable VDD1P0_SW -ground_switchable VSS \
  -power VDD1P0 -ground VSS
#---------- Create Power State Table ----------#
add_power_state TOP.primary \
  -state ON { -supply_expr {power == `{FULL_ON, 1.0} && ground == `{FULL_ON, 0.0}} \
              -simstate NORMAL }
add_power_state APD1P2V.primary \
  -state ON { -supply_expr {power == `{FULL_ON, 1.2} && ground == `{FULL_ON, 0.0}} \
              -simstate NORMAL }
add_power_state SPD1P0V.primary \
  -state ON { -supply_expr {power == `{FULL_ON, 1.0} && ground == `{FULL_ON, 0.0}} \
              -simstate NORMAL } \
  -state OFF { -supply_expr {power == `{OFF} && ground == `{FULL_ON, 0.0}} \
               -simstate CORRUPT }
add_power_state APD0P8V.primary \
  -state ON { -supply_expr {power == `{FULL_ON, 0.8} && ground == `{FULL_ON, 0.0}} \
              -simstate NORMAL }
Synthesis
Synthesis is the process of converting RTL (synthesizable Verilog code) to a technology specific
gate level netlist (which includes nets, sequential and combinational cells and their
connectivity).
Goals of Synthesis
1. To get a gate level netlist
2. Inserting clock gates
3. Logic optimization
4. Inserting DFT logic
5. Logic equivalence between RTL and netlist should be maintained
Input files required
1. Tech related:
   • .tf: technology related information.
   • .lib: timing info of standard cells & macros.
2. Design related:
   • .v: RTL code.
   • SDC: timing constraints.
   • UPF: power intent of the design.
   • Scan config: scan related info like scan chain length, scan IO, and which flops are
     to be considered in the scan chains.
3. For physical aware:
   • RC co-efficient file (tluplus).
   • LEF/FRAM: abstract view of the cell.
   • Floorplan DEF: locations of IO ports and macros.
Synthesis steps
Fig1: Synthesis Flow
1. Analyze
• Checks the syntax of the RTL and generates intermediate files.
2. Elaborate
• Brings all lower level blocks into the synthesis tool.
• All the code and arithmetic operators are converted into GTECH and DW (DesignWare)
components. These are technology independent libraries.
   - GTECH: contains basic logic gates & flops.
   - DesignWare: contains complex cells like FIFOs and counters.
• Elaborate performs the following tasks:
   - Analyses the design hierarchy.
   - Removes empty switches and dead branches.
   - Executes initial commands.
   - Detects asynchronous resets.
   - Converts decision trees to muxes.
   - Converts synchronous logic to D-latches/DFFs.
• FSM pass
   - Detects FSM logic and extracts the number of input bits, output bits and state bits.
   - Converts FSM logic to basic logic.
• Memory pass
   - Merges DFFs into memory write (memwr) and memory read (memrd) cells
   - Consolidates memwr/memrd cells
   - Generates memory (mem) cells
   - Maps mem cells to basic logic
3. Import constraints and UPF
Once the design is elaborated into technology independent cells, timing constraints are
imported from the SDC file.
If the design has multiple power domains, the power domains are created and isolation cells,
level shifters, power switches and retention flops are inserted according to the UPF.
4. Clock gating
The clock has high switching activity, so it consumes a lot of dynamic power. One of the
techniques to lower dynamic power is clock gating. In load enabled flops, the output of the
flops switches only when the enable is on, but the clock switches continuously, increasing the
dynamic power consumption.
Dynamic power can be reduced by converting load enable circuits to clock gating circuits. A
simple clock gating circuit consists of an AND gate in the clock path with the enable as one
input. But when the enable changes while the clock is high, a glitch is produced on the gated
clock.
Fig2: Load enabled register bank
Fig3: Clock gated register bank
Fig4: Waveform for clock gate
To remove the glitches caused by the AND gate, an integrated clock gate (ICG) is used. It has a
negative level-sensitive latch and an AND gate.
Fig5: Integrated clock gated register bank
Fig6: Waveform for ICG
Clock gating makes the design more complex: timing closure, including closure of the clock
gating checks, becomes harder, and clock gating adds more gates to the design. Hence min bit
width (the minimum register bit width to be clock gated) should be chosen wisely, because
otherwise the overall dynamic power consumption may increase.
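The difference between the plain AND gate and the ICG can be seen in a small sampled-waveform simulation. This Python sketch is an idealised model (zero gate delay, signals sampled at fixed time steps) and all function names are hypothetical.

```python
def and_gate_clock(clk, en):
    """Plain AND clock gate: gated clock = clk AND en."""
    return [c & e for c, e in zip(clk, en)]


def icg_clock(clk, en):
    """ICG: a latch that is transparent while clk is low, followed by an AND gate."""
    latched = 0
    out = []
    for c, e in zip(clk, en):
        if c == 0:          # latch follows the enable only during the low phase
            latched = e
        out.append(c & latched)
    return out


def high_pulse_widths(wave):
    """Lengths (in samples) of each run of 1s in the waveform."""
    widths, run = [], 0
    for v in wave:
        if v:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


clk = [0, 0, 0, 0, 1, 1, 1, 1] * 3   # three cycles, 4 samples low then 4 samples high
en = [0] * 6 + [1] * 18              # enable rises in the middle of the first high phase

print(high_pulse_widths(and_gate_clock(clk, en)))   # first pulse is truncated (glitch)
print(high_pulse_widths(icg_clock(clk, en)))        # only full-width pulses
```

The latch holds the enable steady during the high phase of the clock, so a late-arriving enable only takes effect from the next full clock pulse.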
5. Compile
• Performs Boolean optimization.
• Maps all the cells to technology libraries.
• Performs logic and design optimization.
6. Optimization
• Logic optimization
   - Constant folding
   - Detect identical cells
   - Optimize muxes (dead branches in mux)
   - Consolidate muxes and reduce inputs (many to single)
   - Remove DFFs with constant value
   - Reduce word size of the cells
   - Remove unused cells and wires
• Design optimization
   - Reduce TNS and WNS
   - Power optimization
   - Area optimization
   - Meet the timing DRVs
   - Incremental clock gating
7. DFT (Design for Testing) insertion
• DFT circuits are used for testing each and every node in the design.
• The more nodes that can be tested with some targeted pattern, the higher the coverage.
• To get more coverage the design needs to be more controllable and observable.
• For the design to be more controllable we need more control points (a mux through which an
alternate path is provided to propagate the pattern).
• For the design to be more observable we need more observe points (a scannable flop that
observes the value at that node).
• Scan mode is used to test manufactured devices for stuck-at faults and delay faults.
• Scan mode is implemented using scan chains
   - Scan chains are part of scan based designs and propagate the test data.
   - With scan chains, the design becomes more controllable and observable.
   - Each scan chain takes the pattern in through its scan input and sends the pattern out
     through its scan output.
   - A scan chain consists of scan flops in which the output of each scan flop is directly
     connected to the scan input of the next flop.
• Stages of scan mode
   - The pattern is applied at the scan input port.
   - Scan shift: scan enable is set to 1. The pattern enters through the scan input and is
     shifted through the scan flops until all the flops are loaded with the test pattern.
   - Scan capture: scan enable is set to 0. In one clock cycle the loaded values in the flops
     propagate through the combinational circuit and reach the D pin of the next flop.
   - Scan enable is set to 1 again and the pattern is shifted out through the scan output port.
• The scan chain length and the number of scan chains have to be chosen properly: a longer
scan chain increases the pattern propagation time, and more scan chains increase the number
of scan IO ports.
Fig7: Scan chain
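The shift, capture and shift-out stages can be mimicked with a list-based model of one scan chain. This Python sketch is a simplified illustration: the combinational logic is passed in as a function, and the inverter logic used in the example is an assumption chosen only to make the result easy to follow.

```python
def scan_test(chain_length, pattern, comb_logic):
    """Shift a pattern in, capture one functional cycle, and shift the response out."""
    if len(pattern) != chain_length:
        raise ValueError("pattern must have one bit per scan flop")
    flops = [0] * chain_length

    # Scan shift (scan enable = 1): one bit enters at the scan input per clock.
    for bit in pattern:
        flops = [bit] + flops[:-1]

    # Scan capture (scan enable = 0): flops load the combinational logic outputs.
    flops = comb_logic(flops)

    # Scan shift out (scan enable = 1): the last flop drives the scan output.
    response = []
    for _ in range(chain_length):
        response.append(flops[-1])
        flops = [0] + flops[:-1]
    return response


# Hypothetical logic under test: every flop captures the inverse of its own state.
def invert_all(state):
    return [1 - b for b in state]


print(scan_test(4, [1, 0, 1, 1], invert_all))
```

Loading and unloading each take one clock per flop in the chain, which is why a longer chain increases the pattern propagation time.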
8. Compile incremental
• Technology mapping of the DFT circuit
• Optimization of the design
9. Outputs of Synthesis
• Netlist
• SDC
• UPF
• ScanDEF: information on scan flops and their connectivity in a scan chain
Checklist
• Check if the RTL and netlist are logically equivalent (LEC/FM).
• Check if SDC and UPF are generated after synthesis and also check their completeness.
• Check if there are any assign statements.
• Checks related to timing
   - Combinational loops
   - Un-clocked registers
   - Unconstrained IOs
   - IO delay missing
   - Un-expandable clocks
   - Master slave separation
   - Multiple clocks
• Checks related to design
   - Floating pins
   - Multi-driven inputs
   - Un-driven inputs
   - Un-driven outputs
   - Normal cells in clock path
   - Pin direction mismatch
   - Don't use cells
PD Flow I – Floorplan
PHYSICAL DESIGN – I (Import Design, Floorplan, Placement)
Physical design is the process of transforming a netlist into a manufacturable layout [GDS].
The physical design process is often referred to as PnR (Place and Route) / APR (Automatic Place
& Route). The main steps in physical design are placement of all logical cells, clock tree
synthesis & routing. During this process timing, power, design & technology constraints have to
be met. Further, the design might need to be optimized w.r.t. area, power and performance.
The general physical design flow is shown below.
1. IMPORT DESIGN / NETLIST IN
Import design is the first step in physical design. In this stage all required inputs &
references are read into the tool, and basic checks (design and technology consistency) are
done.
Inputs required
1. Gate level netlist
2. Logical (Timing) & Physical views of standard cells & all other IPs used in the design
3. Timing constraints (SDC)
4. Power Intent (UPF / CPF)
5. FP DEF & Scan DEF
6. Technology file
7. RC Co-efficient files
How to qualify Import Design?
1. Check errors & warnings while reading the netlist. Understand all warnings
2. Check for uniquification & empty modules
3. Check errors & warnings while reading timing constraints. Understand all warnings
4. Check errors & warnings while reading UPF/CPF. Understand all warnings
5. Timing QoR (minimal violations with fixable WNS & TNS)
6. Check MV design (equivalent to LP checks). Fix all errors & understand all warnings
7. Check for assign & tri statements (usually checked & fixed after synthesis)
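The WNS and TNS figures used in the timing QoR check can be computed from a list of endpoint slacks. The Python sketch below uses made-up slack values in nanoseconds; the function name `wns_tns` is hypothetical and reporting 0.0 when nothing violates is an assumption about the convention.

```python
def wns_tns(slacks):
    """Return (WNS, TNS): worst negative slack and the sum of all negative slacks."""
    violating = [s for s in slacks if s < 0]
    wns = min(violating, default=0.0)
    tns = sum(violating)
    return wns, tns


# Assumed endpoint slacks in ns: two endpoints violate timing.
endpoint_slacks = [0.12, -0.05, -0.30, 0.02]
print(wns_tns(endpoint_slacks))
```

WNS shows how bad the single worst path is, while TNS indicates how widespread the violations are, so both are needed to judge whether the violations are fixable.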
Timing analysis after Import Design
It is always good practice to do a quick timing analysis after import design. Even though post
synthesis timing analysis is done in a timing tool (PT, Tempus/ETS), it is better to also check
the post synthesis timing QoR in the PnR tool (ICC, Innovus, Olympus) before the actual
implementation starts.
Why is it required?
ICC/Innovus optimizes the critical timing paths (violating paths) that it sees. There is a
chance that the PnR tool shows a completely different timing QoR (huge violations) compared to
the post synthesis QoR seen in PT/Tempus. This can be caused by a correlation issue or a
constraints issue. If we correlate the import design timing QoR with the post synthesis timing
QoR, we can avoid unnecessary optimization, and timing & design closure become easier.
2. FLOORPLAN
Floorplan is one of the critical & important steps in physical design. The quality of your chip
/ design implementation depends on how good the floorplan is. A good floorplan can make the
implementation process (place, CTS, route & timing closure) a cake walk. On similar lines, a bad
floorplan can create all kinds of issues in the design (congestion, timing, noise, IR, routing
issues). A bad floorplan will blow up the area and power, affect the reliability and life of the
IC, and can also increase the overall IC cost (more effort for closure, more LVTs/ULVTs).
Before starting the floorplan, it is better to have a basic understanding of the design, the
data flow of the design, and the integration guidelines of any special analog hard IPs in the
design. For block/partition level designs, understanding the placement & IO interactions of the
block in the full chip will help in coming up with a good floorplan.
What is required to come up with a good floorplan?
1. Basic design understanding
2. Data flow diagram (DFA / Analyze logic connectivity in Synopsys ICC)
3. Integration guidelines
4. IO / Pin placement requirements
5. Special requirements from Full Chip floorplan
6. MV / LP requirements. Understanding of PDs & VAs
Different types of partitions / blocks
1. Memory intensive digital cores, graphic cores
2. Partitions / Blocks with analog Hard IPs
3. DDR & other High Speed Interface partitions / blocks / sub-systems
4. Channel partitions
Partitions with different critical tasks
1. Timing critical
2. Routing critical / Congestion
3. Blocks with complex Clock structure
Types of floorplan techniques used in Full Chip plan
1. Abutted (All inter block pin connections are done through FTs)
2. Non abutted (Channel based. All inter block pin connections are routed in channels)
3. Mix of both – partially abutted with some channels
FLOORPLAN STEPS
1. Size & shape of the block (Usually provided by FC floorplan)
2. Voltage area creation (Power domains)
3. IO placement
4. Creating standard cell rows
5. Macro-placement
6. Adding routing & placement blockages (as required)
7. Adding power switches (Daisy chain)
8. Creating Power Mesh
9. Adding physical cells (Well taps, End Caps etc)
10. Placing & qualifying pushdown cells
11. Creating bounds / plan groups / density screens
Detailed discussion
1. Shape & size of the block / partition
In most cases, the block size & shape are decided by the FC floorplan. A rectangular/square
shape is best in terms of floorplan & further design closure, but in many cases the floorplan
is rectilinear with many notches. It is always good practice to discuss with the FC floorplan
team any scope to improve the block/partition level floorplan.
2. Voltage area creation
In multi-voltage & multi power domain designs, voltage areas are required to guide the tool to
understand the different domains.
There are two methods to create voltage areas:
1. Abutted voltage area (cells are not allowed to be placed in the default voltage area)
   • As there is no default domain area, voltage area feed-throughs (VA-FT) are required
     to cross over different voltage areas.
2. Non-abutted voltage area (cells are allowed to be placed in the default voltage area)
3. IO / Pin placement
IOs / pins are placed at the boundary of the block. Usually pin placement information is pushed
down from the FC floorplan, but these locations can be changed based on block critical
requirements. Any change in pin location has to be discussed with the FC floorplan team. Timing
critical interfaces need special attention (for example, the next 2-3 levels of logic from the
IOs are pre-placed near the IOs). Source synchronous interfaces require delay balancing that
takes OCV into consideration (this requires manual placement & scripting).
4. Row creation
Rows are created in the design using the cell site (unit / basic). Rows aid in systematic
placement of standard cells, and standard cell power routes are created based on the rows.
Rows can be cut wherever cell placement is not allowed, or a hard placement blockage can be
used instead.
5. Macro placement
Step 1 – Understand Pins & Orientation requirements of Macros
Step 2 – Follow data flow / hierarchy to place the Macros. Make use of reference floorplan
if available

Step 3 – All the pins of the Macros should point towards the core logic
Step 4 – Channels b/w macros should be big enough to accommodate all routing reqs &
should get a minimum of one pair VDD & VSS power grids in the channel
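As a rough sanity check on Step 4, the channel width can be estimated from the number of nets that must run through the channel plus the width of one VDD/VSS stripe pair. The sketch below is only a back-of-the-envelope estimate: the function name, track pitch, layer count, stripe dimensions and the usable-track fraction are illustrative assumptions, not values from any PDK or tool.

```python
import math

def min_channel_width(num_nets, track_pitch_um, num_layers,
                      pg_stripe_width_um, pg_spacing_um, utilization=0.7):
    """Rough minimum channel width (um) between two macros.

    All parameters are hypothetical inputs for illustration:
    num_nets           - signal nets that must run along the channel
    track_pitch_um     - routing track pitch of the layers used
    num_layers         - routing layers running parallel to the channel
    pg_stripe_width_um - width of one power stripe (VDD or VSS)
    pg_spacing_um      - spacing kept on each side of a power stripe
    utilization        - fraction of tracks assumed usable for signals
    """
    tracks_per_layer = math.ceil(num_nets / (num_layers * utilization))
    signal_width = tracks_per_layer * track_pitch_um
    # One VDD stripe + one VSS stripe, each with spacing on both sides.
    pg_width = 2 * (pg_stripe_width_um + 2 * pg_spacing_um)
    return signal_width + pg_width

width = min_channel_width(num_nets=120, track_pitch_um=0.2, num_layers=2,
                          pg_stripe_width_um=1.0, pg_spacing_um=0.2)
print(f"Estimated minimum channel width: {width:.2f} um")
```

If the available channel is narrower than such an estimate, the macro spacing probably needs another look before placement starts.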
Automatic Floorplan / Macro-placement
Most PnR tools provide an automatic floorplan option, which creates its own macro placement
based on the effort level & other options. But these options are not mature enough to give an
optimum floorplan for all kinds of designs. The option is handy when the design has 100s of
macros, but the generated floorplan needs a lot of modification for further optimization.
How to qualify Macro – Placement
1. All macros should be placed at the boundary
2. Check the orientation & pin directions of all macros
3. Spacing b/w macros should be enough for routing & power grid
4. Macros should not block partition level pins
5. [Iterations] Less congestion & good timing QoR – These cannot be achieved in one
shot, but need few iterations [Thorough & deep analyses are the key things while
iterating]
6. Adding placement & routing blockages
Buffer only blockages are added in channels b/w macros. Partial placement blockages can be
added b/w the channels blocking sequential cells (whose placement in channels can degrade
CTS QoR). Partial blockages are added in congestion prone areas/notches/corners

7. Adding power switches


Power switches are required to gate the power supply of a gated domain when it is not required.
Power switches are MT-CMOS (multi-threshold CMOS) cells built with high threshold voltage
transistors, so they leak very little when the device is OFF & present a low resistance when it is ON.
Power switches are inserted in the power mesh & the supply to all gated domain cells is delivered
through them. Hence a single switch / a few switches are not enough. A strong network of power
switches connected in daisy chain fashion is inserted in the design.

8. Adding special cells (Well Taps, EndCaps, Spare Cells, Metal ECO-able cells etc)
Well connection – Almost all standard cell libraries are tap-less (substrate connections are not
done at cell level). So well-tap cells are added at partition/chip level to tie the wells to
VDD/VSS. Tap-to-gate spacing rules have to be met while adding the well-tap array.
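To illustrate how a tap spacing rule turns into a tap array, the sketch below spreads taps evenly along one row so that no point of the row is farther than a given distance from the nearest tap. The rule value and the function name are hypothetical; real limits come from the foundry design rules, and real flows also snap taps to the site grid and often stagger them between rows, which is not modelled here.

```python
import math

def tap_positions(row_length_um, max_tap_distance_um):
    """Return x-coordinates (um) of well taps along one row.

    Assumes, for illustration, that every point of the row must lie within
    max_tap_distance_um of a tap, i.e. each tap covers a window of
    2 * max_tap_distance_um.
    """
    if row_length_um <= 0 or max_tap_distance_um <= 0:
        raise ValueError("lengths must be positive")
    n_taps = math.ceil(row_length_um / (2 * max_tap_distance_um))
    step = row_length_um / n_taps
    # Put each tap in the middle of its own window.
    return [(i + 0.5) * step for i in range(n_taps)]

taps = tap_positions(row_length_um=100, max_tap_distance_um=15)
print(taps)
```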
Endcap Cells – These cells are inserted to take care of boundary DRC of Wells & Other layers.
End Cap Cells ensure proper terminations of rows, so that no DRC are created. This is a
physical-only cell.
How to qualify Floorplan?
1. Check PG connections (For macros & pre-placed cells only)
2. LP / MV checks on floorplan database
3. Check the power connections to all Macros, specially analog/special macros if any
4. All the macros should be placed at the boundary
5. There should not be any notches / thin channels. If unavoidable, proper blockages have
to be added
6. Remove all unnecessary placement blockages & routing blockages (which might be put
during floor-plan & pre-placing)
7. Check power connection to power switches
8. Check power mesh in different voltage areas
9. Check pin-layers & check layer directions (H-V-H)
PD Flow II – Placement & Optimization
Placement
In this stage, all the standard cells are placed in the design (size, shape & macro-placement is
done in floor-plan). Placement will be driven by different criteria like timing driven, congestion
driven, power optimization etc. Timing & Routing convergence depends a lot on quality of
placement. Different tasks in placement are listed below;
1. Pre-placement
2. Initial placement (Coarse placement)
3. Legalizations
4. Removing existing buffer trees
5. High Fan-out Net Synthesis (HFNS)
6. Iterations of timing/power optimizations [cell sizing, moving, net splitting, gate cloning,
buffer insertion, area recovery]
7. Area recovery
8. Scan-chain re-ordering
9. TIE cell insertions
Goals of placement
1. Timing, Power and Area optimizations
2. Routable design (minimal global & local congestion)
3. No/minimal cell density, pin density & congestion hot-spots
4. Minimal timing DRCs
Before starting placement optimization, it’s always good practice to do some analyses &
checks on the design & tool settings. This helps the design converge & reduces
iterations.
Things to be checked before placement
1. Check for any missing / extra placement & routing blockages
2. Don’t use cell list & whether it is properly applied in the tool
3. Don’t touch on cells & nets (make sure that, these are applied)
4. It is better to limit the local density (otherwise local congestion can create issues in
routing / ECO stages)
5. Understand all optimization options & placement switches set in the tool
6. There should not be any high WNS timing violations
7. Make sure that clock is set to ideal network
8. Take care of integration guidelines of any special IPs (These won’t be reported in any
of the checks). Have custom scripts to check these guidelines
9. Fix all the hard macros & pre-placed cells
10. Check the pin access
Pre-placement
1. Spare cell insertion / Metal ECO-able cells
2. Magnet placement (IOs / any other interface)
3. Custom / manual placement of special cells (very specific to design)
4. Insertion of De-Caps (Not everyone follows this)
5. Antenna diodes & buffers on block level ports
HFNS
All high fan-out nets will be synthesized (buffer tree) except clock nets & nets with don’t touch
attribute. Scan-enable and reset are few examples of high fan-out nets. HFNS honors max fan-
out setting.
Different Timing optimization techniques
Timing convergence is one of the key tasks in placement optimization. If timing QoR is bad, then
placement cannot be qualified. Bad timing QoR at placement stage creates difficulties in
timing convergence in further stages.
1. Assigning more weight to critical group path
2. Timing driven placement– high effort
3. Allowing LVT cells for optimizations (<5% of low / ultra low VT cells)
In most of the designs only 15-25% of the paths will be timing critical. So giving more weight
to these critical paths during optimization will aid in optimizing critical path delays. This can
be achieved by creating group paths and assigning more weight to the critical paths.
If the design is timing critical, then a timing-driven placement strategy has to be chosen with high
optimization effort (trade-off with runtime). But timing-driven placement in some designs
can create local congestion hot-spots & also increase global congestion. Cell padding,
density screens, partial blockages and bounds can be used to reduce/fix these congestion issues.
Controlled usage of low-VT cells will help in optimizing timing critical paths. Most of the PnR
tools have the option to control VT usage.
Congestion reduction techniques
1. Cell padding
2. Use of density screens, placement blockages
3. Congestion driven placement (with high effort @ cost of runtime)
Congestion is one of the major challenges in PnR of high/medium utilization designs. Placement
is the first & key step where congestion analysis begins & it should be under control. Both global
& local congestion should be minimal with no local hotspots. A thorough analysis of the congestion
map, cell density map & pin density map will help in deciding the quality of placement.
Local congested hot-spots are very common in timing critical, high utilization designs. Cluster
of AOI/OAI (Boolean function cells) / any high pin density cells will cause local hot-spots.
Power Optimization
Nowadays most of the designs are targeted to achieve less power consumption. It’s because of
growing demand of hand-held battery operated devices (smart phones, tabs) & IOT. So we
should keep an eye on static & dynamic power dissipation and make effort to reduce power
dissipation.
Dynamic power:
Transition & Load capacitance are the two key parameters which can be controlled in
placement stage to get optimum dynamic power. Iteration can be performed to arrive at
optimum max transition & max capacitance. Most of the tools have option to optimize the
power.
Dynamic power dissipation is directly proportional to toggle rate (switching activity). So to get
maximum benefit power optimization should be done on nets with high toggle rate. ‘Low
power placement’ helps to identify the net/cells with high toggle rates & load capacitance (wire
length) is optimized (reduced) to reduce power dissipation.
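The relation between toggle rate, load capacitance and switching power can be made concrete with the usual first-order formula P = alpha * C * Vdd^2 * f. The sketch below ranks a few nets by switching power, which is the kind of ordering a low power placement would want to act on. The net names, capacitances, supply voltage and frequency are made-up numbers, and alpha is assumed to be the average number of charging (0 to 1) transitions per clock cycle.

```python
def switching_power_w(toggle_rate, load_cap_f, vdd_v, freq_hz):
    """Switching power of one net: P = alpha * C * Vdd^2 * f.

    toggle_rate (alpha) is assumed to be the average number of
    0->1 transitions per clock cycle.
    """
    return toggle_rate * load_cap_f * vdd_v ** 2 * freq_hz

# Hypothetical nets: (name, toggle rate, load capacitance in farads)
nets = [
    ("data_bus[0]", 0.20, 15e-15),
    ("scan_enable", 0.001, 80e-15),
    ("clk_gated", 1.00, 30e-15),
]
VDD = 0.8    # volts (assumed)
FREQ = 1e9   # hertz (assumed)

# Nets with the highest switching power are the best candidates for
# wire-length (load capacitance) reduction.
ranked = sorted(nets,
                key=lambda n: switching_power_w(n[1], n[2], VDD, FREQ),
                reverse=True)
for name, alpha, cap in ranked:
    power_uw = switching_power_w(alpha, cap, VDD, FREQ) * 1e6
    print(f"{name:12s} {power_uw:8.3f} uW")
```

Note how the heavily loaded but almost static scan_enable net ends up last: a large capacitance matters little when the net rarely toggles.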
Leakage power:
High VT & Regular VT cells have less leakage power compared to low & ultra low VT
cells. So it’s a good idea to block / allow only partial usage of low & ultra low VT cells.
Scan chain Re-Ordering
The DFT tool flow makes a list of all the scan-able flops in the design, sorts them based on
their hierarchy and performs scan stitching (clock domains & maximum chain length constraints
are considered). The scan chain at this stage is not layout friendly.
In the APR tool, scan chains are reordered on the basis of the placement of flops & Q-SI routing.
This is scan-chain reordering. Scan-chain reordering helps to:
 Reduce congestion, Total wire-length
 Require fewer repeaters in Q-SI path
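The effect of reordering on Q-SI wire length can be sketched with a simple greedy nearest-neighbour pass over the placed flops. This is only an illustration, not the algorithm of any particular APR tool; the flop names and coordinates are invented, and real reordering also has to respect clock domains, chain length limits and other DFT constraints.

```python
def chain_length(order, xy):
    """Total Manhattan length of the Q-to-SI connections for a flop order."""
    return sum(abs(xy[a][0] - xy[b][0]) + abs(xy[a][1] - xy[b][1])
               for a, b in zip(order, order[1:]))

def reorder_nearest_neighbour(order, xy):
    """Keep the first flop, then always hop to the closest unvisited flop."""
    remaining = list(order[1:])
    new_order = [order[0]]
    while remaining:
        cx, cy = xy[new_order[-1]]
        nxt = min(remaining,
                  key=lambda f: abs(xy[f][0] - cx) + abs(xy[f][1] - cy))
        remaining.remove(nxt)
        new_order.append(nxt)
    return new_order

# Hypothetical flop placement (um). The hierarchy-based order zig-zags
# across the block because it ignores where the flops were placed.
placement = {"f0": (0, 0), "f1": (90, 5), "f2": (10, 0),
             "f3": (80, 5), "f4": (20, 0), "f5": (70, 5)}
hier_order = ["f0", "f1", "f2", "f3", "f4", "f5"]

placed_order = reorder_nearest_neighbour(hier_order, placement)
print("hierarchy order :", chain_length(hier_order, placement), "um")
print("reordered       :", chain_length(placed_order, placement), "um")
```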
Below diagram shows pre-layout scan-chain stitched based on the hierarchy.

If scan chain reordering is not done, congestion & net/wire length will increase. Below diagram
shows details:

Same flop placement with scan-chain reordered has better congestion & wire / net lengths are
reduced. Refer below diagram:
What if the design has different power domains?
Placement flow is almost same. But in case of Abutted voltage area designs, an extra stage
“Voltage Area Feed-through” is required, before placement stage.
Following tasks are done in VA-FT stage:
1. Enabling VA-FT creation in tool flow
2. Quick placement of the design (Requirement of VA-FT will be known only after
placement of all standard cells)
3. Global route (To identify where all VA-FTs are required)
4. VA-FT creation
5. Disable VA-FT
6. Continue with place & optimizations

An example of FT port creation & FT buffer addition through different voltage areas (power
domains) is shown in below diagram;
How to qualify placement
1. Logical equivalence check & low power checks
2. Check legalization
3. Check PG connections of all the cells
4. Check congestion, place density & pin density maps. All these should be under control
5. Timing QoR / Convergence. There should not be any high WNS violations & TNS,
NVP must be under control
6. Minimal max tran & max cap violations
7. Check whether all don’t touch cells & nets are preserved
8. Check for don’t use cells (Should be Zero/ same as post Syn)
Clock Tree Synthesis- part 1
Clock Tree Synthesis (CTS) is one of the most important stages in PnR. CTS QoR decides
timing convergence & power. In most of the ICs clock consumes 30-40 % of total power. So
efficient clock architecture, clock gating & clock tree implementation helps to reduce power.
Sanity checks need to be done before CTS
 Check legality.
 Check power stripes, standard cell rails & also verify PG connections.
 Timing QoR (setup should be under control).
 Timing DRVs.
 High Fanout nets (like scan enable / any static signal).
 Congestion (running CTS on congested design / design with congestion hotspots can
create more congestion & other issues (noise / IR)).
 Remove don’t_use attribute on clock buffers & inverters.
 Check whether all pre-existing cells in clock path are balanced cells (CK* cells).
 Check & qualify don’t_touch, don’t size attributes on clock components.
Preparations
 Understand the clock structure of the design & its balancing requirements. This
will help in coming up with proper exceptions to build an optimum clock tree.
 Creating non-default rules (check whether shielding is required).
 Setting clock transition, capacitance & fan-out.
 Decide on which cells to be used for CTS (clock buffer / clock inverter).
 Handle clock dividers & other clock elements properly.
 Come up with exceptions.
 Understand latency (from Full chip point of view) & skew targets.
 Take care of special balancing requirements.
 Understand inter-clock balancing requirements.
Difference between High Fan-out Net Synthesis (HFNS) & Clock Tree Synthesis:
 CTS uses clock buffers and clock inverters with equal rise and fall times, whereas HFNS
uses buffers and inverters with relaxed rise and fall times.
 HFNS is used mostly for reset, scan enable and other static signals having high fan-outs.
There is no stringent requirement of balancing & power reduction.
 Clock tree power is given special attention as the clock is a constantly switching signal. HFNS
is mostly performed for static signals and hence not much attention to power is needed.
 NDR rules are used for clock tree routing.
Why buffers/inverters are inserted?
 Balance the loads.
 Meet the DRC’s (Max Tran/Cap etc.).
 Minimize the skew.
What is the difference between clock buffer and normal buffer?
 Clock buffers have equal rise and fall times, therefore pulse width violations are
avoided.
 In clock buffers Beta ratio is adjusted such that rise & fall time are matched. This may
increase size of clock buffer compared to normal buffer.
 Normal buffers may not have equal rise and fall time.
 Clock buffers are usually designed such that an input signal with 50% duty cycle
produces an output with 50% duty cycle.
CTS Goals
 Meet the clock tree DRC.
 Max. Transition.
 Max. Capacitance.
 Max. Fanout.
 Meet the clock tree targets.
 Minimal skew.
 Minimum insertion delay.
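The two clock tree targets can be stated in terms of clock arrival times at the sinks: insertion delay is the longest source-to-sink delay, and global skew is the difference between the longest and the shortest. A minimal sketch follows; the sink names and latency values are invented for illustration.

```python
def clock_tree_metrics(sink_latency_ns):
    """Return (insertion_delay, global_skew) for one clock, in ns."""
    longest = max(sink_latency_ns.values())
    shortest = min(sink_latency_ns.values())
    return longest, longest - shortest

# Hypothetical clock arrival times (ns) at the sink pins of one clock.
latencies = {"ff_a/CK": 0.82, "ff_b/CK": 0.79, "ff_c/CK": 0.85, "ram0/CLK": 0.80}

insertion_delay, global_skew = clock_tree_metrics(latencies)
print(f"insertion delay = {insertion_delay:.2f} ns, "
      f"global skew = {global_skew * 1000:.0f} ps")
```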
Clock Tree Reference
By default, each clock tree reference list contains all the clock buffers and clock inverters in
the logic library. The clock tree reference list is used for:
 Clock tree synthesis.
 Boundary cell insertions.
 Sizing.
 Delay insertion.
Boundary cell insertions
 When you are working on a block-level design, you might want to preserve the
boundary conditions of the block’s clock ports (the boundary clock pins).
 A boundary cell is a fixed buffer that is inserted immediately after the boundary clock
pins to preserve the boundary conditions of the clock pin.
 When boundary cell insertion is enabled, buffer is inserted from the clock tree reference
list immediately after the boundary clock pins. For multi-voltage designs, buffers are
inserted at the boundary in the default voltage area.
 The boundary cells are fixed for clock tree synthesis after insertion; it can’t be moved
or sized. In addition, no cells are inserted between a clock pin and its boundary cell.

Fig1: Boundary cell


Delay Insertion
 If a large delay is needed, instead of adding many buffers we can add a delay cell with a
particular delay value. The advantage is reduced area and power. But delay cells have high
variation, so their usage in the clock tree is not recommended.
Clock Tree Design Rule Constraints
 Max. Transition.
 The Transition of the clock should not be too tight or too relaxed.
 If it is too tight then we need more number of buffers.
 If it is too relaxed then dynamic power is more.
 Max. Capacitance.
 Max. Fanout.
Clock Tree Exceptions
 Non- Stop Pin
 Exclude Pin
 Float Pin
 Stop Pin
 Don’t Touch Subtree
 Don’t Buffer Nets
 Don’t Size Cells
Non- Stop Pin:
 Nonstop pins trace through the endpoints that are normally considered as endpoints of
the clock tree.
 Example :
 The clock pin of sequential cells driving generated clock are implicit non-stop
pins.
 Clock pin of ICG cells.

Fig2: Non Stop pin


Exclude pin:
 Exclude pin are clock tree endpoints that are excluded from clock tree timing
calculation and optimization.
 The tool considers exclude pins only in calculation and optimizations for design rule
constraints.
 During CTS, the tool isolates exclude pins from the clock tree by inserting a guide
buffer before the pin.
 Examples:
 Implicit exclude pin-
 Non clock input pin of sequential cell.
 Multiplexer select pin.
 Three-state enable pin.
 Output port.
 Incorrectly defined clock pin [if the pin doesn’t have trigger edge info.].
 Cascaded clock.
Fig3: Exclude pin
 In the above figure, beyond the exclude pin the tool never performs skew or insertion
delay optimization but does perform design rule fixing.
Float Pin:
 Float pins are clock pins that have special insertion delay requirements and balancing
is done according to the delay.[Macro modelling].

Fig4: Float pin


Stop Pin:
 Stop pins are the endpoints of clock tree that are used for delay balancing.
 During CTS, the tool uses stop pins in calculation & optimization for both DRC and clock tree
timing.
 Example:
 Clock sink are implicit stop pins.
Fig5: Stop pin
The optimization is done only up to the stop pin as shown in the above figure.
Don’t Touch Sub-tree:
 If we want to preserve a portion of an existing clock tree, we put don’t touch exception
on the sub-tree.

Fig6: Don’t touch subtree


 CLK1 is the pre-existing clock and path 1 is optimized with respect to CLK1.
 CLK2 is the new generated clock. Don’t touch sub-tree attribute is set w.r.t C1.
 Example:
 If path1 is 300ps and path2 is 200ps, delay is added in path2 during balancing.
 If path1 is 200ps and path2 is 300ps, delay can’t be added on path1 during balancing
because the don’t touch attribute is set on path1, and we get a violation.
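The two cases in the example can be reproduced with a toy balancing routine that pads every path up to the longest one but refuses to modify a don't touch path. The function and its return format are hypothetical and exist only to illustrate the reasoning; an actual CTS engine is far more involved.

```python
def balance(paths_ps, dont_touch):
    """Pad paths up to the longest one, except those marked don't touch.

    Returns (added, violations):
      added      - {path: delay in ps inserted to balance it}
      violations - {path: remaining shortfall in ps that cannot be fixed}
    """
    target = max(paths_ps.values())
    added, violations = {}, {}
    for name, delay in paths_ps.items():
        shortfall = target - delay
        if shortfall == 0:
            continue
        if name in dont_touch:
            violations[name] = shortfall   # skew remains: path is frozen
        else:
            added[name] = shortfall        # delay cells/buffers go here
    return added, violations

# Case 1: the don't touch path is the longer one, so path2 gets padded.
print(balance({"path1": 300, "path2": 200}, dont_touch={"path1"}))
# Case 2: the don't touch path is the shorter one, so skew cannot be fixed.
print(balance({"path1": 200, "path2": 300}, dont_touch={"path1"}))
```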
Don’t Buffer Net:
 It is used in order to improve the results, by preventing the tool from buffering certain
nets.
Note: Don’t buffer nets have higher priority than DRC. CTS does not add buffers on such nets.
 Example:
 If the path is a false path, then no need of balancing the path. So set don’t buffer
net attribute.
Don’t Size Cell:
 To prevent sizing of cells on the clock path during CTS and optimization, we must
identify the cell as don’t size cells.
Specifying Size-Only Cells:
 During CTS & optimization, size-only cells can only be sized, not moved or split.
 If a size-only cell overlaps with an adjacent cell after sizing, it
might be moved during the legalization step.
Implementing Clock Tree:
For implementing the clock tree, use clock-opt, which performs CTS & incremental
physical optimization.
 Synthesize the clock tree:
 Before implementing the clock tree, the tool upsizes & possibly moves the
existing clock gates, which improves the quality of result (QoR) and reduces the
number of clock tree levels.
 Optimize the Clock Tree: Is done by following steps
 Buffer relocation.
 Buffer sizing.
 Gate relocation.
 Gate sizing.
 Improve skew.
 Delay insertion.
 Perform inter-clock delay balancing
 Balancing has to be done between two flops driven by two different clocks.
 Clock groups between which balancing have to be performed need to be
specified.
 Perform detail routing of clock nets [NDR rule].
 Apply non default routing (NDR) rules for clock nets.
 Double width.
 Double spacing.
 Shielding
 By default, the tool applies the clock routing rule at the sink pins as well. It is better to
use normal routing rules at the sink pins, because this reduces congestion and
makes tapping the clock easier.
Fig7: Non Default Routing
 Perform RC extraction of the clock nets and compute accurate clock arrival time.
 Adjust the I/O timings.
 After implementing the clock tree, the tool can update the input and output
delays to reflect the actual clock arrival time.
 Perform power optimization.
 Use a large/Max clock gating fanout during insertion of the ICG cells.
 Merge ICG cells that have the same enable signal.
 Perform power-aware placement of ICG and registers.
 Check and fix any congestion hotspots.
 Optimize the scan chain.
 Fix the placement of the clock tree buffers and inverters.
 Perform placement and timing opt.
 Check for major hold time violation.
Post CTS Optimization
During Clock tree synthesis, buffers or inverters are added in the clock nets to achieve
minimum Insertion delay and Skew, while meeting the clock DRV’s. Various optimizations
are performed during CTS such as CCDO (Concurrent Clock and Data Optimization) and CTO
(Clock Tree Optimization) . Once the CTS optimizations are done, the clock tree is fixed and
routed. Further optimizations cannot be done on the clock tree except buffer sizing or gate
sizing. Hence, post CTS, only data path can be optimized. The various post-CTS optimizations
include : meeting DRV’s, Setup & Hold, Area & Power optimization, Congestion reduction.
DRV’s : Design Rule Violations
1. Max Tran
2. Max Cap
3. Max Fanout
Causes :
 HVT cells give slower transition : The HVT cells have larger threshold voltages
compared to LVTs and RVTs. Hence, they take more time to turn ON resulting in larger
transition time.
 Weak Driver : The driver won’t be able to drive the load resulting in bad transition of
the driven cell. Thus the delay increases.
 Load is more : The driving cell cannot drive more load than what it is characterized for.
This is set in .lib using the max cap value. If the load that a cell sees increases beyond its
maximum capacitance value, then it causes bad transition and hence increases delay.
 Net length is large : The larger the net length, the larger the resistance and the worse the
transition. This results in max tran violations. The RC value of a long net also increases the
load seen by a cell, causing max cap violations as well.
 Fanout is too large : If the fanout increases beyond the limit that the driver
cell is characterized for, it causes max fanout violations. The increased load results in
max cap violations, which indirectly cause max tran violations as well.
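Why long nets hurt so much can be seen from a first-order Elmore delay estimate: wire resistance and wire capacitance both grow with length, so the wire's own RC term grows with the square of the length. The sketch below uses made-up per-micron parasitics and a made-up driver resistance; it is an approximation for intuition, not a substitute for extraction and STA.

```python
def net_delay_ps(length_um, r_drv_ohm, c_load_ff, r_per_um=2.0, c_per_um=0.2):
    """Elmore delay (ps) of a driver + uniform wire + lumped load.

    r_per_um (ohm/um) and c_per_um (fF/um) are hypothetical wire parasitics.
    1 ohm * 1 fF = 1e-15 s = 1e-3 ps.
    """
    r_wire = r_per_um * length_um
    c_wire = c_per_um * length_um
    delay_ohm_ff = (r_drv_ohm * (c_wire + c_load_ff)       # driver charges wire + load
                    + r_wire * (c_wire / 2 + c_load_ff))   # distributed wire RC
    return delay_ohm_ff * 1e-3

for length in (100, 200, 400, 800):
    delay = net_delay_ps(length, r_drv_ohm=500, c_load_ff=5)
    print(f"{length:4d} um -> {delay:7.1f} ps")
```

With these numbers, an 8x longer net is more than 8x slower, which is why splitting a long net with a buffer pays off.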
Fixes :
1. Max Tran :
 Replace HVT cells with LVT cells.
 Upsize the driver.
 Reduce the net length by adding buffers. Longer the nets, larger the resistance.
Putting a buffer at the middle of a long net splits the resistance into half.
 Reduce the load by reducing fanout and downsizing the driven cell.
2. Max Cap :
 Upsize the driver.
 Split long nets by buffering.
 Reduce the load by reducing the fanout (by load splitting) or by downsizing the
driven cell.
3. Max Fanout :
 Reduce the fanout by load splitting, using buffering or cloning.

 Fig. (a) shows a buffer driving four other cells. In fig. (b), the load is split using
cloning. The first buffer is cloned and each buffer now drives half of the load.
In fig. (c), the load is split using buffering. Two new buffers are added at the
output of buffer A. Now buffer A drives C1 and C2, and each of them
drives half of the load.
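The two load-splitting styles described above can be sketched as simple connectivity transformations. The functions below are illustrative only: the names are hypothetical, sinks are grouped in list order rather than by location, and only one level of splitting is handled.

```python
def _groups(sinks, max_fanout):
    """Chop the sink list into groups of at most max_fanout sinks."""
    return [sinks[i:i + max_fanout] for i in range(0, len(sinks), max_fanout)]

def split_by_cloning(driver, sinks, max_fanout):
    """Duplicate the driver; each copy drives one group of sinks."""
    result = {}
    for i, group in enumerate(_groups(sinks, max_fanout)):
        name = driver if i == 0 else f"{driver}_clone{i}"
        result[name] = group
    return result

def split_by_buffering(driver, sinks, max_fanout):
    """Keep the driver; add one new buffer (C1, C2, ...) per group of sinks."""
    result = {}
    for i, group in enumerate(_groups(sinks, max_fanout), start=1):
        result[f"C{i}"] = group
    result[driver] = list(result)   # the driver now sees only the new buffers
    return result

loads = ["s1", "s2", "s3", "s4"]
print(split_by_cloning("A", loads, max_fanout=2))
print(split_by_buffering("A", loads, max_fanout=2))
```

Cloning duplicates the driver and adds no extra stage between the driver and the loads. Buffering leaves the original driver untouched at the cost of one more stage in the path.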
SETUP :
Reasons for Setup Violations:
 Tcomb :
 Tcomb delay is high.
 High RC metal might be used in Tcomb for routing which increases the net
delay.
 More HVT cells in data path. Lower drive strength cells in data path.
 Tsetup of capture flop is more.
 More negative skew : Launch clock is late and capture clock is early.
 Crosstalk delay : Signals switching in opposite directions, resulting in more delay.
Fixes :
 Vt swapping : Replace HVT cells with LVT/ULVT cells.
 Upsize drivers in data path.
 For long nets, if adding a buffer can reduce RC, improve transition and hence improve
timing, then add buffers.
 Reduce fanout.
 Layer optimization in data path : Use higher metals with lower RC Values to route in
data path. This is preferred only if the timing path is critical.
 Fix cross talk using NDR Rules during routing stage.
HOLD :
Reasons for Hold Violations:
 Tcomb delay is less due to :
 More LVTs and ULVTs in data path.
 High drive strength drivers in datapath.
 Thold of the capture flop is more.
 More positive skew.
 Cross Talk : Signals switching in same direction makes the data arrive early.
Fixes :
 Vt swapping : Replace LVT/ULVT cells with HVT cells.
 Add buffers in data path to increase data path delay.
 Downsize drivers in data path.
 Layer optimization in data path : Use lower metals with higher RC Values to route in
data path.
 Fix cross talk using NDR Rules during routing stage.
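The setup and hold causes listed above all show up as terms in the two slack equations. A minimal sketch follows, assuming a single-cycle path between two flops on the same clock and ideal conditions (no OCV derates, uncertainty or crosstalk); all numbers are invented and given in ps.

```python
def setup_slack(period, t_launch, t_capture, t_clk2q, t_comb, t_setup):
    """Setup: data must arrive Tsetup before the next capture edge."""
    arrival = t_launch + t_clk2q + t_comb
    required = period + t_capture - t_setup
    return required - arrival

def hold_slack(t_launch, t_capture, t_clk2q, t_comb, t_hold):
    """Hold: data must stay stable until Thold after the same capture edge."""
    arrival = t_launch + t_clk2q + t_comb
    required = t_capture + t_hold
    return arrival - required

# Long data path, balanced clocks: setup is just met.
print(setup_slack(period=1000, t_launch=0, t_capture=0,
                  t_clk2q=100, t_comb=850, t_setup=50))
# Same path with negative skew (launch clock 80 ps late): setup fails.
print(setup_slack(period=1000, t_launch=80, t_capture=0,
                  t_clk2q=100, t_comb=850, t_setup=50))
# Short data path, balanced clocks: hold is met.
print(hold_slack(t_launch=0, t_capture=0, t_clk2q=100, t_comb=20, t_hold=30))
# Same path with positive skew (capture clock 120 ps late): hold fails.
print(hold_slack(t_launch=0, t_capture=120, t_clk2q=100, t_comb=20, t_hold=30))
```

Every fix in the lists maps to one term: Vt swapping, sizing and buffering change t_comb, while skew changes t_launch and t_capture. Note that the clock period does not appear in the hold equation at all.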
AREA AND POWER OPTIMIZATION:
Need for area and power optimization:
 Clk cells are larger than normal cells. Hence, they take more area and consume more
power.
 LVTs are used in clock path as they have less on chip variations and less short circuit
power. But they have more subthreshold leakage power.
 Clock is a high switching net. Hence, it has more switching power.
Fixes :
 Area Optimization :
 Downsize Clock buffers if a smaller sized clock buffer can drive the same load.
 Power Optimization :
 Downsize Clock buffers if a smaller sized clock buffer can drive the same load.
 Replace LVTs/ULVTs with HVTs in the datapath wherever timing slack allows, to reduce leakage power.
CONGESTION :
Causes :
The addition of extra buffers during CTS to achieve minimum skew and minimum insertion
delay can cause congestion.
Fixes :
 Post CTS, we can’t move any clock cells. So, for a well optimized design post CTS,
we have to do a proper congestion driven placement in the initial stages itself, keeping in mind
the utilization post CTS.
 Cell padding : In congestion prone area, cell padding should be applied for standard
cells.
Routing
Routing is the stage after Clock Tree Synthesis and optimization where-
 Exact paths for the interconnection of standard cells and macros and I/O pins are
determined.
 Electrical connections using metals and vias are created in the layout, defined by the
logical connections present in the netlist.
After CTS, we have information of all the placed cells, blockages, clock tree buffers/inverters
and I/O pins. The tool relies on this information to electrically complete all connections defined
in the netlist such that-
 There are minimal DRC violations while routing.
 The design is 100% routed with minimal LVS violations.
 There are minimal SI related violations.
 There must be no or minimal congestion hot spots.
 The Timing DRCs are met.
 The Timing QoR is good.
The different tasks that are performed in the routing stage are as follows-
 Global Routing (also performed during placement stage).
 Track assignment.
 Detailed Routing.
 Search and Repair.
Goals of Routing
 Minimize the total interconnect/wire length.
 Maximize the probability that the tool can complete routing.
 Minimize the critical path delay.
 Minimize the number of layer changes that the connections have to make (minimizing
the number of vias).
 Complete the connections without increasing the total area of the block.
 Meeting the Timing DRCs and obtaining a good Timing QoR.
 Minimizing the congestion hotspots.
 SI driven: reduction in cross-talk noise and delta delays.
Routing Prerequisites
 All the design rules required during the routing stage must be defined in the technology
file.
 The design must be placed and optimised. CTS and optimization should be complete.
 The PG nets must be pre-routed and physically connected to all macros and standard
cells.
 The timing DRC violations and the timing QoR, estimated after CTS must be
acceptable.
 The measured congestion should be tolerable.
 There should not be any ideal nets.
 High fanout nets should not be greater than the specified limit.
 Check for any optimization that needs to be done to fix any errors.
 Checking routability.
 After the placement and clock tree synthesis stage we must check if the design is ready
for routing. The checks performed are as follows-
 Check if the ports of the standard cells are blocked i.e. the physical pins are not
accessible.
 Checks for overlapping cells in the design. Overlapping causes pins to short and
cause metal DRC violations.
 Check for pins underneath PG routes (they may be inaccessible and cause
violations on metals) .
 Check if the ports of the top-level or macro cell are blocked and physically
inaccessible.
 Check for pins that are outside the design boundary (Out-of-Boundary pins).
 Check for blocked PG ports.
 Check if there are frozen nets blocking ports.
 Check for blocked unconnected pins.
 Check if all pins in the design are on the routing tracks.
Routing Constraints
Setting routing constraints guides the tool during routing. The constraints to be set are
as follows-
 Set constraints on the number of layers to be used during routing.
 Setting the maximum length for the routing wires.
 Set stringent guidelines for minimum width and minimum spacing.
 Set preferred routing directions to specific metal layers during routing.
 Constraining off-grid routing.
 Blocking routing in specific regions.
 Setting limits on routing to specific regions.
 Setting precedence to routing regions.
 Constraining the routing density.
 Constraining the pin connections.
 Restricting the degree of rerouting.
Global Routing
Global routing is a coarse-grain assignment of routes. It first partitions the routing region
into tiles/rectangles called global routing cells (gcells) and decides tile-to-tile paths for all nets
while attempting to optimize some given objective function (e.g., total wire length and circuit
timing), but it doesn’t make actual connections or assign nets to specific paths within the routing
regions. By default, the width of a gcell is the same as the height of a standard cell, and gcells
are aligned with the standard cell rows.
Blockages, pins, and routing tracks inside the gcell dictate the routing capacity of every gcell.
Then all nets assigned to the gcell are noted, the demand for wire tracks in each gcell
is calculated, and overflows are reported.
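The capacity/demand bookkeeping can be sketched in a few lines: overflow for a gcell is simply the track demand minus the track capacity, wherever that is positive. The 2x2 grid and all numbers below are hypothetical; a real global router tracks this per layer and per direction.

```python
def gcell_overflow(capacity, demand):
    """Return {gcell: overflow} for gcells whose demand exceeds capacity."""
    return {g: demand[g] - capacity.get(g, 0)
            for g in demand if demand[g] > capacity.get(g, 0)}

# Hypothetical 2x2 gcell grid keyed by (column, row).
# Gcell (0, 1) has reduced capacity, e.g. because of a routing blockage.
capacity = {(0, 0): 10, (1, 0): 10, (0, 1): 6, (1, 1): 10}
demand = {(0, 0): 8, (1, 0): 12, (0, 1): 9, (1, 1): 10}

overflow = gcell_overflow(capacity, demand)
print("overflowed gcells:", overflow)
print("total overflow   :", sum(overflow.values()))
```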
Global routing is done in two stages namely-
 The initial routing stage, wherein the unconnected nets are routed and overflow for each
gcell is calculated.
 Rerouting stages, where the congestion around gcells with net overflows is reduced
by ripping up and rerouting the nets.
After the initial routing stage and each rerouting stage, design statistics and congestion data are
reported. A summary of wire length and via count is reported at the end of the global routing stage.
There are three types of Global Routing namely-
 Timing-Driven Global Routing- The net delays are calculated before global routing.
 Cross-Talk Driven Global Routing- Avoids the creation of long tile-to-tile paths that
run parallel on adjacent tracks.
 Incremental Global Routing- Performed using existing global route information.
Track Assignment
Track assignment is a stage wherein routing tracks are assigned to each global route. The
tasks that are performed during this stage are as follows-
 Assigning tracks in horizontal and vertical partitions.
 Rerouting all overlapped wires.
Track assignment replaces all global routes with actual metal shapes. Although all nets are
routed (not very carefully), there will be many DRC, SI and timing related violations, especially
in regions where the routing connects to the pins. These violations are fixed in the succeeding
stages.
Detail Routing
The detailed router uses the routing plan laid out during Global Routing and Track
Assignment and lays actual metal to connect pins with nets and other pins in the
design. The violations that were created during the Track Assignment stage are fixed through
multiple iterations in this stage.
The main goal of detailed routing is to complete all of the required interconnect without leaving
shorts or spacing violations. Detailed routing starts with the router dividing the block into
specific areas called switch boxes (Sboxes), which are generally expressed in terms of gcells.
These boxes align with the gcell boundaries. For example, a 3x3 Sbox is a box which encompasses
9 gcells.
Search And Repair
The search-and-repair stage is performed during detailed routing after the first iteration. In
search-and-repair, shorts and spacing violations are located and the affected areas are rerouted to
fix all possible violations.
Routing optimization and Chip Finishing
ROUTING OPTIMIZATION
 Routing optimization is a step performed after detailed routing in the flow.
 Inaccurate modeling of the routing topology may cause timing, signal integrity and
logical design constraint related violations.
 Fixing one violation may then create others, and such cases can cascade, making it
very difficult to close timing with no timing DRCs.
 Hence it is necessary to fix and optimize the routing topology.
 Routing optimization involves-
 Fixing timing violations.
 Fixing LVS (opens & shorts).
 Fixing DRCs.
 Fixing Timing DRCs (Meet max transition, max capacitance and max fanout).
 Finding & Fixing Antenna violations (using jumpers and antenna diodes).
 Area and Leakage power recovery.
 Fixing SI related issues.
 Redundant via insertion.
This post will concentrate on finding and fixing antenna violations and on redundant via
insertion. The other topics will be covered in subsequent posts.
Antenna Violations
 During IC fabrication, the wafer undergoes various processing steps, such as
metallization (laying of metal wires) and etching (removal of unwanted material). During the
metallization steps, some of the nets connected to gate terminals can be floating because the
upper metal layers have not been fabricated yet. In the plasma etching process (widely used
in recent fabrication processes), unwanted electrostatic charge can accumulate
on these floating nets, which act as antennas.
 Typically, a net in an IC is driven from the source or drain of a device and connects
to a receiver gate terminal, which sits over a gate oxide. Gate oxides are thin at advanced
technology nodes, so they are susceptible to electrostatic discharge and can rupture
when the potential exceeds the breakdown voltage.
 When these charges flow through a device they can rupture its gate, thereby
leading to a total chip failure. This phenomenon of an electrostatic charge being
discharged into the device is known as the “antenna effect”.
 Every fab has its own set of rules (depending on the technology in which the ICs are
being fabricated) to avoid antenna violations during IC design.
In order to prevent antenna problems, tools verify that for each input pin the metal antenna area
divided by the gate area is less than the maximum antenna ratio given by the foundry:
(Antenna area) / (Gate area) < (Max antenna ratio)
Gate area: The area of the transistor gate, which is the intersection of the diffusion and the
polysilicon layers.
Antenna area: Total area of metal connected to gate terminal.
Max Antenna Ratio: Maximum allowable ratio of Antenna area to Gate area.
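The check above can be illustrated with a small Python sketch. The function names and the numeric values are hypothetical and are not taken from any foundry rule deck; real tools apply layer-specific and cumulative variants of this rule.

```python
def antenna_ratio(metal_area_um2, gate_area_um2):
    """Return antenna (metal) area divided by gate area."""
    if gate_area_um2 <= 0:
        raise ValueError("gate area must be positive")
    return metal_area_um2 / gate_area_um2


def violates_antenna_rule(metal_area_um2, gate_area_um2, max_ratio):
    """True when the ratio is not strictly below the allowed maximum."""
    return antenna_ratio(metal_area_um2, gate_area_um2) >= max_ratio


# Hypothetical limit and pin data, for illustration only.
MAX_RATIO = 400.0
short_net = violates_antenna_rule(metal_area_um2=50.0, gate_area_um2=0.25, max_ratio=MAX_RATIO)
long_net = violates_antenna_rule(metal_area_um2=200.0, gate_area_um2=0.25, max_ratio=MAX_RATIO)
print("short net violates:", short_net)
print("long net violates:", long_net)
```

With these assumed numbers the short net has a ratio of 200 and passes, while the long net has a ratio of 800 and is flagged.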
Fixing antenna violations
There are many techniques to fix antenna violations. The widely used techniques are described
in this post.
Antenna Diode:
 Reverse biased diodes (Zener diodes) inserted close to the gate terminal to provide a
discharge path for the electrostatic charges during plasma etching.
 This reverse biased diode will not affect the operation of the circuit as it will conduct
only if the potential reaches its breakdown potential, thus protecting the respective gate
from damage.
 The general practice is to use an n-type diode, as p-type diodes require extra n-well
biasing.
Fig1: Antenna Diode.
Limitations of using antenna diodes:
 Wastage of core area – If the number of antenna violations is large, the overuse of antenna
diodes eats up the core area meant for standard cells.
 These diodes consume extra placement and routing resources, which makes the silicon
more costly to fabricate.
 Potential forward biasing of the diode – Antenna diodes are usually placed in a back
to back fashion where there is an antenna violation. If they are not correctly reverse
biased, one of the diodes may become forward biased.
This is usually seen in low power designs when one of the power supplies is switched
off. One such case is shown in the example figure below. Here D2 is always reverse
biased. But in the case of D1, if the supply at its cathode is turned off, it becomes forward
biased due to the reverse biasing of D2. Hence, care must be taken that
the anode and cathode are at the same potential, or that the anode is at ground
potential.
Fig2: Back to Back connected Antenna diode.
 Leakage Power – Antenna diodes contribute to the total leakage power which may
affect the working of the design. Hence overuse of these diodes must be avoided.
Use of Jumpers (metal Hopping):
Fig3: Jumpers (metal hopping)
 One of the most widely used ways to fix antenna violations during routing optimization.
 By definition, antenna violations occur where the nets connected to gate
terminals are long, which can lead to a large amount of static charge accumulating
and damaging the gate oxide.
 This can be fixed with jumpers, i.e. by hopping to a higher metal layer
so that the length of lower-layer metal directly connected to the gate terminal stays
below the maximum length. The rest of the lower-layer wire is left floating during
fabrication, so its charge does not flow into the gate terminal.
 During fabrication only nets that are in the lower metal layer are fabricated first, thus
leaving a gap (floating) where the jumping to higher metal layers occurs. This reduces
the area for static charge accumulation and hence keeps the gate relatively safe until the
upper metal layer is stacked and connected to the other part of the lower metal layer.
 Using jumpers nearer to the gate terminal makes the device relatively safer from
antenna effect.
 It is to be noted that jumping from a higher metal layer to lower metal layer cannot be
done as lower layers are already fabricated and this causes the flow of static charges
leading to the damage of devices connected.
 In the case of antenna violations on analog blocks, jumpers are to be added very close
to the analog pins.
Fig4: Jumper insertion for analog block.
 For high fanout nets such as reset and scan enable, with large number of sinks, the
location of jumpers must be selected such that fixing violation at a point adjacent to a
group of sinks via jumpers fixes multiple violations on the group, instead of
individually fixing them.
Fig5: Jumper insertion for High Fanout nets.
 To make use of jumpers at the point of violation make sure that there are no routing
blockages for the higher metal layer in that region.
 Note that metal jumpers are the most widely used method to fix antenna
violations during post-route optimization. Antenna diodes are used as a last resort,
where jumpers cannot be added because routing blockages are present or because
adding them would cause other violations such as SI issues or congestion on the higher
layers.
Redundant Via Insertion / Double Via Insertion.
 During IC fabrication, there may be partial or complete via failure due to various
reasons such as cut misalignment, electro migration and thermal stress induced voiding.
 A partial via failure increases the resistance of the signal nets and may increase the
delay causing difficulty in timing closure. Whereas a complete via failure will result in
broken nets.
 This increases yield loss, so it is critical to fix this issue.
 To fix yield loss due to via failure, a good practice is the insertion of redundant vias
adjacent to single vias for support.
 Redundant vias are basically extra single vias on minimal width nets.
 Note that redundant vias are added for support to increase the yield, unlike
multi-cut vias on wide nets, which are needed for functionality.
 Although redundant via insertion is a step in Design for Manufacturability (DFM) it is
more advantageous to insert redundant vias in the post-route optimization stage or
during the detailed routing stage. This is because during the DFM stage the layout is
almost complete and can be modified only slightly without causing other cascaded
violations. Hence this restricts the insertion of redundant vias at single via points,
resulting in an increasing chance of yield loss due to via failure.
 If the number of redundant vias inserted is not controlled, the yield and reliability of the
design may be adversely affected, because pattern distortion of these vias may become a
serious problem.
 The advantages of redundant vias are-
 Nets are less likely to break.
 Yield is improved.
 Decrease in the resistance of the vias.
 Avoids increase of net delay due to partial via failure.
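The effect of a redundant via on resistance and on the chance of a broken net can be sketched numerically. This is a simplified model that assumes identical vias in parallel and statistically independent via failures, neither of which is stated in the text; the function names and numbers are hypothetical.

```python
def parallel_via_resistance(r_single_ohm, n_vias):
    """Resistance of n identical vias connected in parallel."""
    if n_vias < 1:
        raise ValueError("need at least one via")
    return r_single_ohm / n_vias


def open_probability(p_single_fail, n_vias):
    """Probability that all n vias fail, assuming independent failures."""
    return p_single_fail ** n_vias


# Hypothetical values for illustration only.
R_VIA = 10.0      # ohms per via
P_FAIL = 1e-3     # failure probability of one via

single_r = parallel_via_resistance(R_VIA, 1)
double_r = parallel_via_resistance(R_VIA, 2)
single_open = open_probability(P_FAIL, 1)
double_open = open_probability(P_FAIL, 2)
print(single_r, double_r)
print(single_open, double_open)
```

Under these assumptions, doubling the via halves the connection resistance and squares the probability of a complete open.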
Fig6: Redundant via insertion.
Chip Finishing
Chip finish is a stage after post-route optimization, where filler cells and metal fills are added
to meet the DRC rules. The different steps in chip finish are briefly described below.
Inserting Filler Cells
 Filler cells are used for rail continuity and to fill up gaps between standard cells in the
rows, thereby reducing the DRC violations created by the base layers.
 Filler cells are physical-only cells designed in such a way that they contain only n-well,
p-well & power rails.
 Although the standard cells have implant layers in them, meeting the width and spacing
DRC rules may not be feasible without abutting filler cells with respective implant
layers.
 It is also possible to reduce the IR drop by inserting de-cap filler cells, but this comes
at a cost of higher leakage currents.
Inserting Metal Fills (Dummy Metal Fills)
 The metal fills (also called dummy metal fills) are small, floating metal nets, inserted
after post-route optimization in order to maintain uniformity in metal layer density, in
empty spaces in the design.
 These are added to meet the metal density DRC rules, which are mandated by most
manufacturing processes.
 The process of adding metal fills is as follows-
 To maintain uniformity for any metal layer, we have window based density
rules.
 For each window metal density will be specified.
 If the utilization of metal in a window is less than that given in DRC rule deck,
then we add dummy metals to overcome density DRC.
 The prerequisites for adding metal fills are as follows-
 Design must contain minimum timing violations.
 Design must meet the timing DRCs (Meet max transition, max capacitance and
max fanout).
 Design must have minimum or no DRC violation.
 Design must have no LVS violation (opens & shorts).
 There must be minimum SI and antenna related violations.
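The window based density rule described above can be sketched as follows. This is a minimal illustration that assumes axis-aligned, non-overlapping metal rectangles given as (x0, y0, x1, y1) and a single hypothetical minimum density; real rule decks also define maximum density, window stepping and per-layer values.

```python
def window_density(rects, window):
    """Fraction of the window covered by non-overlapping rectangles."""
    wx0, wy0, wx1, wy1 = window
    covered = 0.0
    for x0, y0, x1, y1 in rects:
        # Clip each rectangle to the window before adding its area.
        w = min(x1, wx1) - max(x0, wx0)
        h = min(y1, wy1) - max(y0, wy0)
        if w > 0 and h > 0:
            covered += w * h
    return covered / ((wx1 - wx0) * (wy1 - wy0))


def fill_area_needed(rects, window, min_density):
    """Dummy metal area required to bring the window up to min_density."""
    wx0, wy0, wx1, wy1 = window
    window_area = (wx1 - wx0) * (wy1 - wy0)
    deficit = min_density - window_density(rects, window)
    return max(0.0, deficit * window_area)


# Hypothetical 10 x 10 window with two signal wires on one metal layer.
window = (0, 0, 10, 10)
wires = [(0, 0, 10, 1), (0, 5, 10, 6)]
density = window_density(wires, window)
needed = fill_area_needed(wires, window, min_density=0.25)
print(density, needed)
```

Here the two wires cover 20% of the window, so roughly 5 area units of dummy metal would be needed to reach the assumed 25% minimum.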
SignOff checks
Design Rule Check (DRC)
Design Rule Check (DRC) determines whether the layout of a chip satisfies a series of
recommended parameters called design rules. Design rules are a set of parameters provided by
semiconductor manufacturers to designers in order to verify the correctness of a mask set.
They vary with the semiconductor manufacturing process. The rule set describes
restrictions on geometry and connectivity that ensure the design has sufficient margin to
absorb variability in the manufacturing process.
Design rule checks are physical checks of metal width, pitch and spacing
requirements for the different layers of a given manufacturing process. If the
components are physically connected without considering the DRC rules, the chip may
fail functionally, so all DRC violations have to be cleaned up.
After the physical connections are complete, the tool checks each and every polygon in the
design against the design rules and reports all violations. This whole process is called Design
Rule Check.
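As a rough illustration of what a width and spacing check does, the sketch below tests axis-aligned rectangles on a single layer. It is a toy model, not how a signoff tool works: the rule values are hypothetical, and shapes that touch or overlap are assumed to belong to the same polygon and are skipped by the spacing check.

```python
from itertools import combinations


def width(rect):
    """Smaller dimension of an (x0, y0, x1, y1) rectangle."""
    x0, y0, x1, y1 = rect
    return min(x1 - x0, y1 - y0)


def spacing(a, b):
    """Edge-to-edge distance between two rectangles (0 if touching or overlapping)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return (dx * dx + dy * dy) ** 0.5


def check_layer(rects, min_width, min_spacing):
    """Return a list of width and spacing violations for one layer."""
    violations = []
    for i, r in enumerate(rects):
        if width(r) < min_width:
            violations.append(("width", i))
    for (i, a), (j, b) in combinations(enumerate(rects), 2):
        s = spacing(a, b)
        if 0 < s < min_spacing:
            violations.append(("spacing", i, j))
    return violations


# Hypothetical shapes and rule values.
shapes = [(0, 0, 10, 2), (0, 3, 10, 5), (0, 9, 10, 10)]
result = check_layer(shapes, min_width=2, min_spacing=2)
print(result)
```

In this example the third shape is narrower than the assumed minimum width, and the first two shapes are closer together than the assumed minimum spacing.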
Typical DRC rules are:
1. Interior
2. Exterior
3. Enclosure
4. Extension
Interior:
Fig1: Distance of interior facing edge for a single layer.
Fig2: Distance of interior facing edges of two layers.
Exterior:
Fig3: Distance of exterior facing edges of two layers.
Enclosure:
Fig4: Distance between inside edge to outside edge.
Extension:
Fig4: Distance between inside edge to outside edge.
To understand the basic design rules, let’s take a CMOS inverter as an example:
Fig5: CMOS Inverter layout
Layout versus Schematic (LVS)
DRC only verifies that the given layout satisfies the design rules provided by the fabrication
unit. It does not ensure the functionality of the layout. This is why LVS was introduced.
This blog focuses on how LVS works and on the common issues faced in LVS.
How LVS works
Inputs needed to perform LVS are:
 .v – netlist of the design
 GDS – layout database of the design
 LVS rule deck
.v and GDS should be of same stage.
The LVS rule deck is a set of rules written in Standard Verification Rule Format (SVRF) or Tcl
Verification Format (TVF). It guides the tool in extracting the devices and the connectivity of
the IC. It contains the layer definitions used to identify the layers in the layout file and to
match them with the layer locations in the GDS. It also contains device structure definitions.
LVS check involves three steps:
1. Extraction: The tool takes the GDSII file containing all the layers and uses a polygon based
approach to identify components such as transistors, diodes, capacitors and resistors,
as well as the connectivity between devices in the layout, from their
layers of construction. All the device layers, device terminals, device sizes,
nets, vias and pin locations are defined and given a unique identification.
2. Reduction: All the defined information is extracted in the form of a netlist.
3. Comparison: The extracted layout netlist is then compared with the netlist of the same
stage using the LVS rule deck. In this stage the numbers of instances, nets and ports are
compared. All mismatches, such as shorts, opens and pin mismatches, are reported.
The tool also checks for topology and size mismatches.
Fig6: LVS Flow
Commonly faced LVS issues:
LVS check includes following comparisons:
 Number of devices in schematic and its layout
 Type of devices in schematic and its layout
 Number of nets in schematic and its layout
Typical errors which can occur during LVS checks are:
1. Shorts: Shorts are formed, if two or more wires which should not be connected together
are connected.
2. Opens: Opens are formed, if the wires or components which should be connected
together are left floating or partially connected.
3. Component mismatch: Component mismatch can happen, if components of different
types are used (e.g, LVT cells instead of HVT cells).
4. Missing components: A missing component error occurs if an expected component is left
out of the layout.
5. Parameter mismatch: Every component has its own properties, and the LVS tool is configured
to compare these properties within some tolerance. If the tolerance is not met, the tool
reports a parameter mismatch.
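The count and type comparisons listed above can be sketched in a few lines. This is only a toy illustration: the dictionary layout and device type names are hypothetical, and a real LVS tool compares the full connectivity graph rather than just counts.

```python
from collections import Counter


def compare_netlists(schematic, layout):
    """Compare device count, device types and net count of two netlists."""
    report = []
    if len(schematic["devices"]) != len(layout["devices"]):
        report.append("device count mismatch")
    # Compare how many devices of each type appear on both sides.
    sch_types = Counter(schematic["devices"].values())
    lay_types = Counter(layout["devices"].values())
    if sch_types != lay_types:
        report.append("device type mismatch")
    if len(schematic["nets"]) != len(layout["nets"]):
        report.append("net count mismatch")
    return report


# Hypothetical inverter: the layout uses an LVT device and has lost one net
# (for example because two nets were shorted together).
schematic = {
    "devices": {"M1": "nmos_hvt", "M2": "pmos_hvt"},
    "nets": {"in", "out", "vdd", "vss"},
}
layout = {
    "devices": {"M1": "nmos_lvt", "M2": "pmos_hvt"},
    "nets": {"in", "out", "vdd"},
}
errors = compare_netlists(schematic, layout)
print(errors)
```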
IR Drop Analysis:
IR drop can be defined as the voltage drop in the metal wires constituting the power grid before
the supply reaches the VDD pins of the cells. IR drop occurs when there are cells with high
current requirements or regions of high switching activity. The reduced supply voltage
slows down the cells, which can cause setup and hold violations. Hold violations cannot be
fixed once the chip is fabricated.
There are two types of IR drop analysis namely:
Static IR drop analysis:
 Calculates the average voltage drop of the entire design, assuming the current drawn is
constant.
 Since the average current is calculated, this analysis depends on the clock period. This
analysis is good for signoff checks in older technologies.
Dynamic IR drop analysis:
 Depends on switching activity of the logic.
 Is vector dependent.
 Less dependent on the clock period, as it depends on the instantaneous current.
 Analyses peak current demand and highly localized groups of cells.
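The difference between the two analyses can be illustrated with a single power rail fed from one end. This is a simplified sketch, assuming equal segment resistance between cell taps and hypothetical current values: average currents stand in for static analysis, and one instant with a current spike stands in for dynamic analysis.

```python
def rail_voltages(vdd, seg_res_ohm, tap_currents_a):
    """Voltage seen at each tap of a rail fed from one end (V = I * R per segment)."""
    voltages = []
    v = vdd
    remaining = sum(tap_currents_a)
    for i_tap in tap_currents_a:
        # Each segment carries the current of every tap downstream of it.
        v -= seg_res_ohm * remaining
        voltages.append(v)
        remaining -= i_tap
    return voltages


# Hypothetical rail: 1.0 V supply, 0.5 ohm per segment, three cells.
static_v = rail_voltages(1.0, 0.5, [0.01, 0.01, 0.01])   # average currents
dynamic_v = rail_voltages(1.0, 0.5, [0.01, 0.05, 0.01])  # peak at the middle cell
print("static drop at last cell:", 1.0 - static_v[-1])
print("dynamic drop at last cell:", 1.0 - dynamic_v[-1])
```

With these assumed numbers the last cell sees about 30 mV of drop with average currents and about 70 mV when the middle cell draws its peak current, which is the kind of localized effect static analysis misses.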
Methods to reduce IR drop:
 Robust power mesh– The initial power grid is built based on static IR analysis, because
switching activity becomes available late. If there is IR drop due to clustered cells,
adding a strap will make the power mesh more robust.
Fig7: Custom power rail added to make it robust
 De-cap– These are decoupling capacitors which are spread across the high switching
region to maintain the voltage.
 Spacing– If clock cells are clustered and cause IR drop, spreading them apart and placing
them near different power rails will reduce the IR drop. When shifting a cell to the next
power rail, make sure that rail is not already driving many cells, because
adding another cell may itself cause IR drop.
 Reducing load– Cells driving more load will be drawing more current. Hence reducing
load will reduce IR drop.
 Downsizing– Cells of smaller size will draw less current. But the transition of cells
should not become worse.
 The number of power switches can be increased to reduce IR drop
 It should be made sure that all the power pins of macros are properly connected to the
power rails.
Note:
 For accurate dynamic analysis, a VCD file (switching activity file) generated with SDF
(Standard Delay Format) back-annotation is preferable.
 Glitches produced by combinational circuits may act as instantaneous switching events.
Reducing them will decrease the pessimism of dynamic IR drop analysis.
 IR drop analysis is done at the RC worst corner (the corner with the highest rail resistance)
and at the FF process, high voltage, high temperature PVT corner, because more current is
drawn in this corner.
ELECTRO MIGRATION (EM):
Electro migration (EM) refers to the unwanted movement of materials in a semiconductor. If
the current density is high enough, there can be a momentum transfer from moving electrons
to the metal ions, making the ions drift in the direction of the electron flow. This results in
the gradual displacement of metal atoms, potentially causing open and short
circuits. Due to the high current density and metal resistance in recent technologies, EM has
become a dominant concern.
 EM leads to open circuits due to voids in wires or vias and leads to short circuits due to
extrusions or “hillocks” on wires. Either can cause a system failure that is hard to
diagnose.
 In older technology nodes EM was considered only on power wires and clock
wires. Now signal wires also need to be considered, due to the increased current density
in them.
 Fin-FETs have more current density than planar transistors, thus making EM worse,
especially in conjunction with narrow wires.
 Copper interconnects worsen EM because the copper molecule moves faster.
 In recent technologies the lower supply voltages help to reduce EM, but not
enough to offset all the other causes that amplify it.
 EM is worse at higher temperatures.
 EM fixing techniques such as widening wires, can increase area and cause timing
violations. EM fixing needs to be timing-driven.
Methods to fix EM
1. Widen the wire to reduce current density
2. Keep the wire length short
3. Reduce buffer size in clock line
Methods outside physical design (PD):
1. Reduce the frequency
2. Lower the supply voltage
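Why widening a wire helps can be seen from the current density, J = I / (width x thickness). The sketch below uses hypothetical numbers and a hypothetical limit expressed in mA per square micron; foundry EM rules are usually more detailed (per layer, per temperature, average versus peak current).

```python
def current_density(i_ma, width_um, thickness_um):
    """Current density in mA/um^2 for a rectangular wire cross-section."""
    return i_ma / (width_um * thickness_um)


def min_width_for_em(i_ma, thickness_um, j_max):
    """Smallest wire width (um) that keeps current density at or below j_max."""
    return i_ma / (j_max * thickness_um)


# Hypothetical wire and limit, for illustration only.
I_MA = 2.0        # current through the wire
THICKNESS = 0.2   # metal thickness in um
J_MAX = 50.0      # assumed EM limit in mA/um^2

j_narrow = current_density(I_MA, width_um=0.1, thickness_um=THICKNESS)
j_wide = current_density(I_MA, width_um=0.25, thickness_um=THICKNESS)
w_min = min_width_for_em(I_MA, THICKNESS, J_MAX)
print(j_narrow > J_MAX, j_wide > J_MAX, w_min)
```

Under these assumptions the 0.1 um wire exceeds the limit, the 0.25 um wire does not, and the minimum safe width is about 0.2 um.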
SCAN Tracing:
Files (Scan DEF and .V)
In scan tracing we check the connections between the flip flops in the scan chain; there should
not be any floating connections. Scan tracing is needed because scan is disabled during the
formality check (so formality does not check the scan chain), and scan tracing assures that
there is no issue with the scan chains.
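The idea of tracing a chain from scan-in to scan-out can be sketched as a walk over a connection map. The port names, flop names and the dictionary format are hypothetical; an actual flow would derive the connections from the netlist and the expected order from the scan DEF.

```python
def trace_scan_chain(next_hop, scan_in="SCAN_IN", scan_out="SCAN_OUT"):
    """Follow scan connections from scan_in and return the flops in order."""
    chain, seen = [], set()
    node = scan_in
    while True:
        nxt = next_hop.get(node)
        if nxt is None:
            raise ValueError(f"floating connection after {node}")
        if nxt == scan_out:
            return chain
        if nxt in seen:
            raise ValueError(f"loop detected at {nxt}")
        seen.add(nxt)
        chain.append(nxt)
        node = nxt


# Hypothetical chain: each key's scan output drives the value's scan input.
good = {"SCAN_IN": "ff1", "ff1": "ff2", "ff2": "ff3", "ff3": "SCAN_OUT"}
broken = {"SCAN_IN": "ff1", "ff1": "ff2", "ff3": "SCAN_OUT"}  # ff2 output floats

traced = trace_scan_chain(good)
print(traced)
try:
    trace_scan_chain(broken)
    broken_detected = False
except ValueError as err:
    broken_detected = True
    print("error:", err)
```

The traced order can then be compared with the order expected from the scan DEF.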
DFM:
Files (GDS and Rule deck file)
As technology scales down, the manufacturing process becomes more complex. DFM
(design for manufacturing) is the stage in which we modify the layout or add extra features
(like redundant via insertion, wire spreading). These techniques increase the yield and
reliability of the design.
Few DFM steps:
1. Redundant Via insertion.
2. Wire spreading.
3. Wire slotting.
4. Metal filling.
Formality Check:
Files (Reference netlist, Implemented Netlist, .V and .Lib)
The basic idea behind the FM check is to compare the implemented netlist with the reference
netlist (synthesis stage netlist / golden netlist). We check whether the logic output values of
both netlists are the same.
Example 1: If we run FM in scan mode (i.e. with scan ON), we will get formality
issues, because scan chain reordering changes the order of the flip flops with
respect to the scan DEF file. To overcome this issue, we disable the scan port (by setting
its value to “0”).
Example 2: Undriven port issue: In the golden netlist, floating pins are assigned binary values
such as “0” or “1”, but in the implemented netlist a floating pin is treated as “X”, which
leads to a mismatch. To resolve this issue, we set the pin in both the implemented and
reference netlists to either “0” or “1”.
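Example 1 can be illustrated with a tiny equivalence check. This sketch compares two hypothetical flop D-input functions by exhaustive enumeration, which only works for a handful of inputs; production equivalence checkers use formal engines rather than enumeration. The function and signal names are assumptions made for illustration.

```python
from itertools import product


def reference_d(a, b, si_x, si_y, scan_en):
    """Reference netlist: scan input comes from si_x."""
    return si_x if scan_en else (a & b)


def implemented_d(a, b, si_x, si_y, scan_en):
    """Implemented netlist: chain reordered (scan input is si_y), logic restructured."""
    return si_y if scan_en else 1 - ((1 - a) | (1 - b))


def equivalent(f, g, n_inputs, constraints=None):
    """True if f and g agree on every input pattern allowed by the constraints."""
    constraints = constraints or {}
    for bits in product((0, 1), repeat=n_inputs):
        if any(bits[i] != v for i, v in constraints.items()):
            continue
        if f(*bits) != g(*bits):
            return False
    return True


scan_free = equivalent(reference_d, implemented_d, 5)
scan_off = equivalent(reference_d, implemented_d, 5, constraints={4: 0})
print(scan_free, scan_off)
```

With scan enable left free the two functions differ because the scan inputs come from different flops; with scan enable constrained to 0 (input index 4) the functional logic matches.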
Power Analysis:
Files (SPEF, SAIF,.V, Lib, UPF and SDF)
In power analysis we calculate the power dissipation. There are two types of power
dissipation: (i) leakage power and (ii) dynamic power. Leakage power is static power, which is
dissipated during the off state or non-toggling state (when the input data is fixed) of a device.
For dynamic power the activity factor is required, which is provided by the SAIF (Switching
Activity Interchange Format) file. We also check for hot spots in the design; a hot spot is a
small region with high power dissipation.
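A first-order estimate of the two components can be written using the common switching power approximation P = activity x C x V^2 x f, with leakage taken as leakage current times supply voltage. This is a simplified sketch that ignores short-circuit (internal) power, and all numbers are hypothetical.

```python
def dynamic_power_w(activity, cap_f, vdd_v, freq_hz):
    """Switching power: activity factor * capacitance * Vdd^2 * frequency."""
    return activity * cap_f * vdd_v ** 2 * freq_hz


def leakage_power_w(i_leak_a, vdd_v):
    """Static power drawn even when nothing toggles."""
    return i_leak_a * vdd_v


# Hypothetical block: 1 nF switched capacitance, 1.0 V, 1 GHz, 10% activity.
p_dyn = dynamic_power_w(activity=0.1, cap_f=1e-9, vdd_v=1.0, freq_hz=1e9)
p_leak = leakage_power_w(i_leak_a=5e-3, vdd_v=1.0)
p_total = p_dyn + p_leak
print(p_dyn, p_leak, p_total)
```

The activity factor here plays the role of the toggle information that the SAIF file supplies to a real power analysis tool.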