Clock Gating Methodology for Power and CTS QoR
Agenda
- Objective
- Introduction to clock gating
- Clock gating methodology
  - Overview
  - RTL synthesis
  - Physical synthesis
  - Clock tree synthesis
  - Summary of recommendations
- Sample results
- Planned enhancements
- Summary
Objective
Describe the clock gating methodology to meet target
- Skew
- Insertion delay
- Power
Discuss recommendations during
- RTL synthesis using Design Compiler
- Physical synthesis using IC Compiler or Physical Compiler
- Clock tree synthesis using IC Compiler or Astro
What is Clock Gating?
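- Register banks disabled during some clock cycles
- Typical implementation uses multiplexers
- Clock gating cell replaces multiplexers
[Figure: multiplexer-based register enable (D, EN, CLK, Q) with high clock activity, versus a clock-gated register (EN, CLK, gclk) with low clock activity]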
Benefits of Clock Gating
Dynamic power savings
- With low toggle rate on clock pin, internal power of registers is reduced
- Gated by the enable signal, the clock network has less switching activity and consumes less switching power
Area savings
- Eliminating multiplexers saves area
Easy to implement
- No RTL code change is required
- Clock gating is automatically inserted by the tool
- Technology independent
Clock Gating Methodology Overview
[Flow diagram]
- Design Compiler: Input RTL -> Insert clock gating -> Compile
- IC Compiler or Physical Compiler: Merge clock gates -> Placement and placement optimization
- Astro: Replicate clock gates -> Clock tree synthesis -> Detail routing
Unified Flow in IC Compiler
- Merge clock gates -> Placement and placement optimization -> Replicate clock gates [BETA] -> Clock tree synthesis -> Detail routing
Tool versions: Design Compiler X-2005.09, IC Compiler v1.1, Physical Compiler X-2005.09, Astro X-2005.09
Clock Gating Methodology During RTL Synthesis
Starting from the input RTL:
1. Set the clock gating style: set_clock_gating_style
2. Read in Verilog: read_verilog
3. Define the clocks: create_clock
4. Insert clock gating: insert_clock_gating
5. Compile: compile
Specify Clock Gating Options
Use the set_clock_gating_style command
Maximum fanout
- This value is the maximum fanout of each clock gating element
- By default, the fanout is unlimited
Minimum bitwidth
- This is the minimum bitwidth of register banks that will be gated
- By default, the minimum bitwidth is 3
- No area or power benefit with register banks with bitwidth less than 3
Insert Clock Gating During RTL Synthesis
Use the insert_clock_gating command
- The -global option looks across hierarchical boundaries for the common enable
[Figure: Modules A and B under Top, with register data (d1, d2), enable logic (a, b, EN), clock gates (CG) and clk, shown for regular clock gating and for hierarchical clock gating. Hierarchical clock gating adds extra ports to the modules.]
Measure the Quality of Inserted Clock Gating: Report Power and Clock Gating
Use the report_power command
  Cell Internal Power  = 160.6544 mW   (61%)
  Net Switching Power  = 102.5581 mW   (39%)
                         -----------
  Total Dynamic Power  = 263.2125 mW  (100%)
  Cell Leakage Power   =   3.0961 mW
Use the report_clock_gating command
  Clock Gating Summary
  ------------------------------------------------------------
  | Number of Clock gating elements |  222               |
  | Number of Gated registers       |  167512 (99.92%)   |
  | Number of Ungated registers     |  137 (0.08%)       |
  | Total number of registers       |  167649            |
  ------------------------------------------------------------
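The percentages in the two reports above can be rechecked from the raw numbers. The following Python sketch is not part of the tool flow; it only repeats the arithmetic. The average fanout per clock gate is a derived figure that the reports themselves do not print.

```python
# Values copied from the report_power and report_clock_gating output above.
internal_mw = 160.6544
switching_mw = 102.5581
dynamic_mw = 263.2125

gates = 222
gated = 167512
ungated = 137
total = 167649

# Share of dynamic power, rounded to whole percent as in the report.
internal_pct = round(100 * internal_mw / dynamic_mw)
switching_pct = round(100 * switching_mw / dynamic_mw)

# Register coverage, rounded to two decimals as in the report.
gated_pct = round(100 * gated / total, 2)
ungated_pct = round(100 * ungated / total, 2)

# Derived metric (not printed by the tool): average registers per clock gate.
avg_fanout = round(gated / gates)

print(internal_pct, switching_pct)   # 61 39
print(gated_pct, ungated_pct)        # 99.92 0.08
print(avg_fanout)                    # 755
```

An average of roughly 755 registers per clock gate is the kind of value one would expect when the maximum fanout is left at its unlimited default.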
Clock Gating Considerations
- Clock gate styles
- Enable signal timing
  - Ensure that you meet the setup and hold time on the enable pin of clock gate
- Impact of clock gate fanout on
  - Power and enable pin timing
  - Clock tree structure
Clock Gate Styles
- Integrated, latch-based, clock gate (ICG) is recommended
- Discrete, latch-based or latch-free (simple AND or OR-AND gate) clock gates are also supported
  - Discrete clock gates are not recommended (details on next slide)
- Latch-based clock gates prevent a glitch on the enable from being propagated to the gated clock
[Figure: latch-based clock gate (EN, CLK, latch D, GCLK) with waveforms of CLK, EN and GCLK showing no glitches on the gated clock]
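The glitch-filtering behavior can be illustrated with a small sample-by-sample model. The Python sketch below is an illustration only: it assumes a latch that is transparent while CLK is low, followed by an AND gate, ignores all delays, and uses made-up waveforms.

```python
def simulate(clk, en):
    """Return (latch_free_gclk, latch_based_gclk) for sampled CLK and EN."""
    latched_en = 0
    gclk_and, gclk_latch = [], []
    for c, e in zip(clk, en):
        if c == 0:
            # The latch is transparent only while CLK is low.
            latched_en = e
        gclk_and.append(c & e)             # latch-free: plain AND of CLK and EN
        gclk_latch.append(c & latched_en)  # latch-based: AND of CLK and latched EN
    return gclk_and, gclk_latch

# Hypothetical waveforms: EN pulses high at sample 3, while CLK is already high,
# then rises properly at sample 8, while CLK is low.
clk = [0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1]
en  = [0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1]

gclk_and, gclk_latch = simulate(clk, en)
print(gclk_and)    # [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1]
print(gclk_latch)  # [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
```

In this model the latch-free gate passes a truncated pulse at sample 3, while the latch-based gate only produces the full clock pulse that follows the clean enable.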
Integrated Versus Discrete Clock Gating
Integrated clock gate (EN, CLK -> GCLK)
- No clock skew between latch and AND gate
- Timing analysis and CTS handle the clock gate automatically
- Setup and hold check modeled in library
- Easy to use in the flow
Discrete clock gate (EN, CLK -> GCLK)
- Ensure minimum skew between latch and AND gate
- Specify latch clock pin as a non-stop pin for CTS
- Specify the setup and hold time
- This adds complexity to the flow
Integrated clock gating is recommended
Enable Signal Timing
Setup time on the enable pin of clock gate
- Synthesis assumes that the clock signal arrives at all registers and clock gates at same time (within skew)
- Clock signal reaches the clock gating cell earlier than it reaches the registers
- Timing constraints on the enable signals need to be adjusted
Note: The closer the clock gating cell is to the registers, the less constrained the enable signal
[Figure: CLK driving a clock gate (CG) with enable EN, annotated with the clock arrival times at the clock gate and at the registers]
Impact of Clock Gate Fanout
Clock gate fanout is determined by
- The -max_fanout option of the set_clock_gating_style command in Design Compiler
- By default, the fanout is unlimited
Impact of clock gate fanout on
- Power and enable pin timing
- Clock tree structure
Impact of Clock Gate Fanout on Power and Timing
Large max fanout
- Fewer clock gating cells
- Better power reduction
- More constrained enable
Small max fanout
- Easier to meet enable pin timing
- Power might be affected
[Figure: one ICG driving all registers (large max fanout) versus several ICGs each driving a subset (small max fanout)]
Impact of Clock Gate Fanout on Clock Tree Structure
Large max fanout
- Unbalanced clock structure
- Depending on design skew requirement, may need processing for CTS QoR
Small max fanout
- More balanced clock structure
- Easier to meet CTS QoR
[Figure: clock trees for large and small max fanout, with each ICG annotated with its register fanout (values between 8 and 300)]
Impact of Clock Gate Fanout Summary
By default, max fanout is unlimited
- Results in best power savings and reasonable CTS QoR
If CTS QoR is a higher priority:
- Make your clock structure as balanced as possible
    set_clock_gating_style -minimum_bitwidth value \
        -max_fanout value
- Use similar values for -minimum_bitwidth and -max_fanout
  - Balance fanout of each clock gate
  - Eliminate small fanout
- Select the value based on your design
  - Experiments have shown that using a balanced fanout of 128 or 256 results in improved CTS QoR
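The effect of choosing similar values for the minimum bitwidth and the maximum fanout can be sketched with a simplified model. The Python code below is not the Design Compiler algorithm: it assumes that banks narrower than the minimum bitwidth stay ungated and that wider banks are split evenly into the fewest gates that respect the maximum fanout. The bank widths and the limits 24 and 32 are hypothetical values chosen for readability.

```python
import math

def gate_fanouts(bank_sizes, minimum_bitwidth=3, max_fanout=None):
    """Return (per-gate fanouts, ungated register count) under the simplified model."""
    fanouts, ungated = [], 0
    for size in bank_sizes:
        if size < minimum_bitwidth:
            ungated += size          # bank too small to be gated
            continue
        copies = 1 if max_fanout is None else math.ceil(size / max_fanout)
        base, extra = divmod(size, copies)
        # Spread the registers so gate fanouts differ by at most one.
        fanouts += [base + 1] * extra + [base] * (copies - extra)
    return fanouts, ungated

banks = [2, 8, 27, 60, 108, 300]   # hypothetical register bank widths

default_fanouts, default_ungated = gate_fanouts(banks)
balanced_fanouts, balanced_ungated = gate_fanouts(banks, minimum_bitwidth=24, max_fanout=32)

print(default_fanouts, default_ungated)     # [8, 27, 60, 108, 300] 2
print(min(balanced_fanouts), max(balanced_fanouts), balanced_ungated)  # 27 30 10
```

With the defaults the gate fanouts range from 8 to 300. With similar limits they fall between 27 and 30, at the cost of more clock gates and more ungated registers.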
Clock Gating Usage During Placement Optimization
Large or unlimited fanout
- By default, no group bounds are created for the clock gate and its fanout during placement
  - Avoid congestion around the clock gate
  - You will get better overall timing QoR
- Placement of the registers is based on timing
  - Not constrained by location of clock gate
Small fanout
- To keep the clock gate and its register fanout together during placement, use
    set physopt_disable_auto_bound_for_gated_clock false
- Helps meet timing of the enable pin
Optimizing the Clock Structure in a Gate-Level Design
Consider the following scenarios:
- Clock gate insertion done during RTL synthesis with small fanout
- Gate-level netlist with clock gates from a third party and with small clock gate fanout
To improve power, you can
- Optimize or minimize the clock gates in your design
- Run merge_clock_gates on your design
Merging Clock Gates
Merges clock gates that share a common enable
Flow, starting from a gate-level design:
1. Identify clock gates: identify_clock_gates (only required in a Verilog-based flow)
2. Merge clock gates: merge_clock_gates
3. Placement optimization
4. Clock tree synthesis
Prepare the Clock Structure for CTS
Complex clock gating presents a challenge for CTS. You can
- Insert always enabled clock gates
- Replicate clock gates
[Figure: clock tree with the register fanout of each ICG annotated. Always enabled clock gates are added to create a more balanced tree, and clock gates with large fanout are replicated.]
Creating More Balanced Clock Structures During RTL Synthesis
[Figure: ICGs enabled by EN1 and EN2, before and after an additional ICG with its enable tied active high is inserted]
To enable, use
    set power_cg_all_registers true
Also set the following variable
    set power_remove_redundant_clock_gates false
What is Replicate Clock Gates?
- Balances fanout by fixing DRC at the output of the ICG
- Adds buffers to drive registers that are not gated
- Same engine used for clustering in clock tree synthesis and clock gate replication
[Figure: clock tree before and after replication, with the register fanout of each ICG annotated]
What Does Replicate Clock Gates in Astro and IC Compiler do?
- Replicates clock gate with new instances using the same reference cell
- Balances the fanout of clock gates based on design rule constraints
- Considers the location of registers
- In Astro, marks the output net of the clock gate as synthesized
  - Astro CTS does not modify the net
  - IC Compiler CTS checks the net for a DRC violation, but does not modify the net if it is DRC clean
- Inserts buffers to drive registers that are not gated
- The number of clock gates increases
  - Clock gates are larger than clock buffers and consume more power
  - Impact on power and area
When to Replicate Clock Gates?
[Flowchart: Placed design -> Clock tree synthesis -> Meet target skew? If yes, continue to detail routing. If no, check for an unbalanced clock structure: if yes, replicate clock gates and rerun clock tree synthesis; if no, check other factors.]
Replicate clock gates only when needed
Prerequisites for Replicating Clock Gates in Astro
1. Ensure that you have logically equivalent cells (LEQs) in the reference library
   - This allows the sizing of ICGs
2. Set the DRC constraints
   - Use the astClockOptions command
3. To enable the insertion of buffers to drive registers that are not gated, use the following command:
     axSetIntParam "acts" "push down clock ports" 1
4. If you want to prevent the tool from using certain ICG cells
   - Define the design LEQs (see the appendix for details)
Prerequisites for Replicating Clock Gates in IC Compiler
1. Ensure that you have logically equivalent cells (LEQs) in the reference library
   - This allows the sizing of ICGs
2. Set the DRC constraints
   - Use the set_clock_tree_options command
3. To enable insertion of buffers to drive registers that are not gated, set the following variable:
     set cts_push_down_buffer true
4. If you want to prevent the tool from using certain ICG cells, set dont_use on the cells
Using astSplitClockNet in Astro
File contains either
- Instance names of the cells to be replicated
- Net names (all fanout on specified nets are processed)
    astSplitClockNet
    setFormField "Split Clock Net" "Clock Gated Cells File Name" "split.txt"
    formOK "Split Clock Net"
Using split_clock_net in IC Compiler
    split_clock_net -objects object_list -gate_sizing -gate_relocation
- The object_list is a list of instances or nets whose fanout is to be replicated
- The -gate_sizing and -gate_relocation options enable sizing or relocation of ICGs
Creating Balanced Clock Fanout at RTL Versus Replicate Clock Gates Before CTS
Balanced Clock Fanout at RTL
- When? Insert clock gating at RTL synthesis.
- Why? CTS QoR is a priority. Enable pin timing is a priority.
- Based on: Clock gate fanout
Replicate Clock Gates
- When? Replicate clock gates before CTS.
- Why? Selected maximum fanout at RTL synthesis for maximum power savings. Need to preprocess clock structure to meet target skew.
- Based on: DRC at output of clock gate (includes input capacitance of registers and net capacitance). Clustering based on placement location.
Recommendations for RTL Synthesis
- Select the maximum fanout based on your design priority
  - Large fanout gives you more power savings
  - Balanced fanout gives good CTS QoR
- Use integrated, latch-based clock gating cells
Recommendations for Physical Synthesis/CTS
Physical synthesis
- Use group bounds only when the maximum fanout is small
Clock tree synthesis
- Replicate clock gates only if necessary
- Use DRC constraints to control the number of replicated clock gates
Sample Results: Design 1
Design details
- 90nm, 160MHz clock, 181K instances, 37 macros
- Target skew 150ps
Flow highlights
- RTL synthesis (insert clock gating): no max fanout constraint (default: unlimited); insert always active clock gating cells
- Physical synthesis: no group bounds
- Clock tree synthesis: with replication of clock gates
Results
- Total power without clock gating: 48mW
- Final skew: 141ps
- Final power: 27mW
*See sample scripts in the appendix
Achieved target skew with replication of clock gates
Sample Results: Design 2
Design details
- 90nm, 85MHz clock, 39K instances, 1 macro
- Target skew 100ps
Flow highlights
- RTL synthesis (insert clock gating): no max fanout constraint (default: unlimited); insert always active clock gating cells
- Physical synthesis: no group bounds
- Clock tree synthesis: no replication of clock gates
Results
- Total power without clock gating: 21mW
- Final skew: 91ps
- Final power: 16mW
*See sample scripts in the appendix
Achieved target skew without replication of clock gates
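The two sample results can be compared with a few lines of arithmetic. The Python sketch below uses only the figures quoted for Design 1 and Design 2. Treating the difference between "total power without clock gating" and "final power" as the saving is an assumption, since the slides do not say whether the two numbers were measured at the same point in the flow.

```python
designs = {
    "Design 1": {"ungated_mw": 48, "final_mw": 27, "target_skew_ps": 150, "final_skew_ps": 141},
    "Design 2": {"ungated_mw": 21, "final_mw": 16, "target_skew_ps": 100, "final_skew_ps": 91},
}

def summarize(d):
    """Return (power reduction in percent, skew margin in ps)."""
    reduction = 100 * (d["ungated_mw"] - d["final_mw"]) / d["ungated_mw"]
    margin = d["target_skew_ps"] - d["final_skew_ps"]
    return round(reduction, 2), margin

summary = {name: summarize(d) for name, d in designs.items()}
for name, (reduction, margin) in summary.items():
    print(f"{name}: {reduction}% lower power, {margin} ps skew margin")
# Design 1: 43.75% lower power, 9 ps skew margin
# Design 2: 23.81% lower power, 9 ps skew margin
```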
Planned Enhancements for Clock Gating Methodology
Astro and IC Compiler
- Improved QoR with clock gating
  - Create a more balanced clock structure before doing CTS
  - Create a clock tree with equal levels of logic to each sink
IC Compiler only
- Use clock gate optimization to optimize the timing of the enable pin after CTS
Summary
- Understand the power and CTS requirements of your design
- Choose the clock gating methodology based on your design requirements
  - Use integrated clock gating
  - Process the clock structure based on your CTS and power requirements
  - Select the right fanout of clock gates during RTL synthesis
  - Use merge and replication of clock gates only if necessary
Appendix
- Sample scripts
- Summary of clock gating methodologies
- Overview of clock gating methodology using ASCII interchange format
- How to handle enable signal timing
- Equivalence checking in Formality
- Clock gating and design-for-test
- Details on replicate clock gates
- Additional considerations with discrete clock gating
Sample DC Script
#Set clock gating options, max_fanout default is unlimited
set_clock_gating_style -sequential_cell latch \
    -positive_edge_logic {integrated} \
    -control_point before \
    -control_signal scan_enable
#Create a more balanced clock tree by inserting always enabled ICGs
set power_cg_all_registers true
set power_remove_redundant_clock_gates false
read_db design.gtech.db
current_design top
link
source design.cstr.tcl
#Insert clock gating
insert_clock_gating
compile
#Generate a report on clock gating inserted
report_clock_gating
Sample IC Compiler Script
#Open the Milkyway design
open_mw_lib design_lib.mw
open_mw_cel top
current_design top
link
#Placement & placement optimization
place_opt
#Set clock tree options
set_clock_tree_options -clock_tree Clk \
    -max_capacitance 0.3 \
    -max_transition 0.3
#Replicate clock gates
split_clock_net -objects *latch* -gate_sizing -gate_relocation
#Clock tree synthesis and optimization
clock_opt
Sample Astro Script
#Open the Milkyway design
geOpenLib
setFormField "Open Library" "Library Name" "design.mw"
formOK "Open Library"
geOpenCell
setFormField "Open Cell" "Cell Name" "top"
formOK "Open Cell"
#Set clock tree options
astClockOptions
setFormField "Clock Common Options" "Maximum Transition Delay" "0.3"
setFormField "Clock Common Options" "Maximum Load Capacitance" "0.3"
formOK "Clock Common Options"
#Replicate clock gates
astSplitClockNet
setFormField "Duplicate Clock Gated Cells" "Clock Gated Cells File Name" "split.lst"
formOK "Duplicate Clock Gated Cells"
#Clock tree synthesis
astCTS
formOK "Clock Tree Synthesis"
Format of file for astSplitClockNet
- Line-separated list of instance or net names
- Allows wildcard .*
- Example:
    cg_latch_inst_1
    cg_latch_inst_2
    cg_latch_inst_3
Design LEQs in Astro
Define design LEQs
    astLoadDesignLEQ file_name
Example:
    cell1 cell2
    cell2 cell3
    cell4 cell5
- cell1, cell2, and cell3 are in the same class
- cell4 and cell5 are in the same class
Clear/dump design LEQs
    astClearDesignLEQ
    astDumpDesignLEQ
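The way pairwise entries in the design LEQ file combine into classes can be sketched with a small union-find. This Python code is an illustration, not Astro's implementation. It assumes each line of the file names two equivalent cells and that equivalence is transitive, which matches the classes stated for the example. The function name leq_classes is hypothetical.

```python
def leq_classes(pairs):
    """Group cell names into equivalence classes from pairwise declarations."""
    parent = {}

    def find(cell):
        parent.setdefault(cell, cell)
        while parent[cell] != cell:
            parent[cell] = parent[parent[cell]]  # path halving
            cell = parent[cell]
        return cell

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a              # merge the two classes

    classes = {}
    for cell in parent:
        classes.setdefault(find(cell), set()).add(cell)
    return sorted(sorted(members) for members in classes.values())

# The three lines of the example file, one pair per line.
pairs = [("cell1", "cell2"), ("cell2", "cell3"), ("cell4", "cell5")]
classes = leq_classes(pairs)
print(classes)  # [['cell1', 'cell2', 'cell3'], ['cell4', 'cell5']]
```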
Summary of Clock Gating Methodologies
Unlimited Clock Fanout at RTL
- When? Insert clock gating at RTL synthesis.
- Why? Power is a priority. CTS QoR, enable pin constraints more flexible.
- Based on: Clock gate fanout
Balanced Clock Fanout at RTL
- When? Insert clock gating at RTL synthesis.
- Why? CTS QoR is a priority. Enable pin timing is a priority.
- Based on: Clock gate fanout
Replicate Clock Gates
- When? Replicate clock gates before CTS.
- Why? Selected maximum fanout at RTL synthesis for maximum power savings. Need to preprocess clock structure to meet target skew.
- Based on: DRC at output of clock gate (includes input capacitance of registers and net capacitance). Clustering based on placement location.
Clock Gating Methodology Overview Using ASCII Interchange Format (Verilog)
[Flow diagram]
- Design Compiler: Input RTL -> Insert clock gating -> Compile
- IC Compiler or Physical Compiler: Identify clock gating cells -> Merge clock gates -> Placement and placement optimization
- Astro: Replicate clock gates (astSplitClockNet) -> Clock tree synthesis -> Detail routing -> Skew analysis
Unified flow in IC Compiler
- Identify clock gating cells -> Merge clock gates -> Placement and placement optimization -> Replicate clock gates [BETA] (split_clock_net) -> Clock tree synthesis -> Detail routing -> Skew analysis
How to Handle Enable Signal Timing
Estimate the delay of the clock tree after the clock gating cell before synthesis to avoid timing problems later
- It can be modeled through the clock gate setup check
    set_clock_gating_style -setup <ideal_setup + estimated clock tree delay>
    propagate_constraints -gate_clock
- It can also be modeled by specifying a clock latency for the clock and then a modified clock latency for all the clock gate clock pins
    # This is the delay seen at the input of any ungated register
    set_clock_latency 1.7 CLK
    # This is the delay seen at the input of the clock gates
    set_clock_latency 1.1 $ICGClkInputPins
    # This is the delay seen at the input of the gated registers
    set_clock_latency 1.7 $ICGClkOutputPins
[Figure: CLK driving a clock gate (CG) and the registers, annotated with the clock delay to each]
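The effect of the latency values above on the enable path can be worked through numerically. The Python sketch below assumes a single-cycle path from a register clocked at the 1.7 latency to the enable pin of a clock gate clocked at the 1.1 latency. The clock period and the clock gate setup time are hypothetical values chosen for illustration, and the function name is not a tool command.

```python
def enable_path_budget(period, launch_latency, icg_clock_latency, icg_setup):
    """Time available for the enable path on a single-cycle setup check."""
    return period + icg_clock_latency - launch_latency - icg_setup

REGISTER_LATENCY = 1.7    # from the set_clock_latency example above
ICG_CLOCK_LATENCY = 1.1   # from the set_clock_latency example above
PERIOD = 5.0              # hypothetical clock period
ICG_SETUP = 0.2           # hypothetical library setup time of the clock gate

# Ideal-clock view: the clock gate is assumed to see the same latency as the registers.
ideal = enable_path_budget(PERIOD, REGISTER_LATENCY, REGISTER_LATENCY, ICG_SETUP)
# With the earlier clock arrival at the clock gate taken into account.
actual = enable_path_budget(PERIOD, REGISTER_LATENCY, ICG_CLOCK_LATENCY, ICG_SETUP)

print(round(ideal, 3))           # 4.8
print(round(actual, 3))          # 4.2
print(round(ideal - actual, 3))  # 0.6
```

The 0.6 difference equals the clock tree delay after the clock gate (1.7 - 1.1), which is the amount that would be added to the ideal setup in the first modeling approach.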
Formal Verification
- The Synopsys formal verification tool, Formality, can perform equivalence checking when the design has inserted clock gating cells
- The following command instructs Formality to account for clock gating logic
    fm_shell> set verification_clock_gate_hold_mode any
Clock Gating and Test
- Controllability
- Observability
- Test signal connections
Potential Loss of Coverage
- Logic not observable
- Clock is not controllable
[Figure: register bank (flip-flops, data in, data out) clocked by ENCLK from a latch-based clock gate, with enable logic spread across levels of design hierarchy. Legend: not tested, partially tested, fully tested.]
Test Coverage With Scan Enable
- scan_enable 0 during capture
[Figure: the same clock gating structure with a control point, driven by scan_enable, added between the control logic and the clock gate latch. Legend: not tested, partially tested, fully tested.]
Test Coverage With Test Mode
- test_mode 1
[Figure: the same clock gating structure with a control point, driven by test_mode, added between the enable logic and the clock gate latch. Legend: not tested, partially tested, fully tested.]
Complete Observability
[Figure: an observe flop, controlled by testmode, collects enable signals EN1, EN2, EN3 and other observability nodes. The EN input of the clock gate latch is marked as the otherwise unobservable point.]
Test Signal Connections
[Figure: scan enable signals SE1, SE2 and SE3 connected to clock gates (CG1) driving flip-flops (FF)]
Example:
    hookup_testports -se_port SE3
Syntax:
    hookup_testports [-verbose] [-se_port port] [-tm_port port] [-se_pin pin] [-tm_pin pin]
Details on Replicate Clock Gates: Pictorial Description
- Insertion of buffer to drive ungated registers
- Replication of ICG
  - Before: load on ICG is 2pF
  - After: 8 ICGs, load on each ICG is 0.25pF (< max cap of 0.3pF)
- DRC fixed on the output of each instance
  - In Astro, net is marked as synthesized
  - In IC Compiler, net is not marked as synthesized
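The numbers in this picture can be checked against a simple lower bound. The Python sketch below assumes that the total load is unchanged by replication and divides evenly among the copies. Real net capacitance depends on placement and routing, so the tool may choose a different count.

```python
import math

def min_copies_for_cap(total_load_pf, max_cap_pf):
    """Fewest copies such that an even split keeps each copy at or below max_cap_pf."""
    return math.ceil(total_load_pf / max_cap_pf)

total_load_pf = 2.0   # load on the original ICG
max_cap_pf = 0.3      # maximum capacitance constraint
copies_shown = 8      # number of ICGs after replication in the picture

lower_bound = min_copies_for_cap(total_load_pf, max_cap_pf)
load_per_copy = total_load_pf / copies_shown

print(lower_bound)                 # 7
print(load_per_copy)               # 0.25
print(load_per_copy < max_cap_pf)  # True
```

Under these assumptions seven copies would already satisfy the constraint. The eight copies in the picture leave some margin below the 0.3pF limit.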
Details on Replicate Clock Gates: Inputs, Constraints and Behavior
Inputs
- Requires a list of nets or instances
- If a net is specified, all instances on the fanout of the net are processed
Constraints
- The replication of the specified instances is based on fixing DRC at the output of each instance
- The DRC constraints considered are maximum fanout, maximum capacitance and maximum transition
- The tool converts maximum fanout and maximum transition into equivalent capacitance values, and uses the tightest of the three capacitance values as the maximum capacitance constraint
Behavior
- The tool splits the specified instance as many times as is necessary to fix the DRC on the output of each clock gate
Details on Replicate Clock Gates: Example1
Consider the following scenario:
- Root clock net clk drives 1000 ungated registers
- Clock gate cg1, which drives 2000 registers
- Clock gate cg2, which drives 3000 registers
- You would like the clock gates driven by net clk to be balanced based on a maximum capacitance constraint of 0.35
Set the following DRC constraints:
    set_clock_tree_options -max_capacitance 0.35
    split_clock_net -objects clk
Solution
- clk: 1000 ungated registers
- cg1: ~80 ICGs driving 2000 registers
- cg2: ~120 ICGs driving 3000 registers
- Load on each ICG < 0.35pF
- Fanout of each ICG ~ 25
Details on Replicate Clock Gates: Example2
Consider the following scenario:
- Root clock net clk drives 1000 ungated registers
- Clock gate cg1, which drives 2000 registers
- Clock gate cg2, which drives 3000 registers
- You would like the clock gates driven by net clk to be balanced based on a maximum capacitance constraint of 0.35
- You would like to make the clock structure more balanced by inserting a buffer to drive the ungated registers
Set the following DRC constraints:
    set_clock_tree_options -max_capacitance 0.35
    set cts_push_down_buffer true
    split_clock_net -objects clk
Solution
- clk: 1000 ungated registers, driven through the inserted buffer
- cg1: ~80 ICGs driving 2000 registers
- cg2: ~120 ICGs driving 3000 registers
- Load on each ICG < 0.35pF
- Fanout of each ICG ~ 25
Details on Replicate Clock Gates: Example3
Consider the following scenario:
- Root clock net clk drives 1000 ungated registers
- Clock gate cg1, which drives 2000 registers
- Clock gate cg2, which drives 3000 registers
- You would like the clock gates driven by net clk to be balanced based on a maximum fanout constraint of ~1000
Set the following DRC constraints (specify a large maximum capacitance and maximum transition constraint, so that the tool chooses the maximum fanout constraint as the tightest constraint):
    set_clock_tree_options \
        -max_capacitance 10000 \
        -max_transition 10000 \
        -max_fanout 1000
    split_clock_net -objects clk
Solution
- clk: 1000 ungated registers
- cg1: 2 ICGs driving 2000 registers
- cg2: 3 ICGs driving 3000 registers
- Fanout of each ICG ~ 1000
Details on Replicate Clock Gates: Example4
Consider the following scenario:
- Root clock net clk drives 1000 ungated registers
- Clock gate cg1, which drives 200 registers
- Clock gate cg2, which drives 3000 registers
- Clock gate cg3, which drives 195 registers
- You would like the clock gates driven by net clk to be balanced based on a maximum fanout constraint of ~200
Replicate the clock gate cg2 such that the fanout of each replicated instance is ~200:
    set_clock_tree_options \
        -max_capacitance 10000 \
        -max_transition 10000 \
        -max_fanout 200
    split_clock_net -objects cg2
Solution
- clk: 1000 ungated registers (unchanged)
- cg1: 200 registers (unchanged)
- cg2: ~15 ICGs driving 3000 registers, fanout of each ICG ~ 200
- cg3: 195 registers (unchanged)
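The ICG counts quoted in the four examples follow from dividing the register count by the fanout of each copy. The Python sketch below reproduces them. It assumes an even split and, for Examples 1 and 2, takes the reported fanout of about 25 as given rather than deriving it from the 0.35 capacitance limit. The helper name copies_needed is hypothetical.

```python
import math

def copies_needed(register_count, max_fanout):
    """Fewest clock gate copies that keep every copy at or below max_fanout."""
    return math.ceil(register_count / max_fanout)

# Examples 1 and 2: capacitance-limited, reported fanout of about 25 per ICG.
example1 = {"cg1": copies_needed(2000, 25), "cg2": copies_needed(3000, 25)}

# Example 3: fanout-limited, maximum fanout of 1000.
example3 = {"cg1": copies_needed(2000, 1000), "cg2": copies_needed(3000, 1000)}

# Example 4: only cg2 is replicated, maximum fanout of 200.
example4_cg2 = copies_needed(3000, 200)

print(example1)      # {'cg1': 80, 'cg2': 120}
print(example3)      # {'cg1': 2, 'cg2': 3}
print(example4_cg2)  # 15
```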
Additional Consideration With Discrete Clock Gating Cells
Clock skew between latch and AND gate
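- Clock at B later than A
- Skew > latch delay
[Figure: discrete clock gate with the clock reaching point A and point B, and waveforms of CLK at A, EN, EN1, CLK at B and GCLK showing the skew delay and the resulting glitch on GCLK]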
Using Discrete Clock Gating Cells
In Design Compiler and Physical Compiler,
- Do not ungroup the clock gating hierarchy
- Enable group bounds to place the elements of the clock gate (latch and AND gate) close together
    set physopt_disable_auto_bound_for_gated_clock false
In Astro,
- Place the latch and AND gates close together
  - Specify a large netweight on the net
- Get the clock to go through the latch, that is, ignore the CLK pin of the latch as a sync pin
  - Use the astSetClockNonStop command
  - Refer to SolvNet article 003097