Clock Enable Timing Closure Methodology
Harish Dangat
Samsung Semiconductor
Agenda
Basics of Clock Gating
Fixing Clock Enable Timing in RTL-2-GDSII Flow
Results
Conclusion
Clock Gating Basics
Use an internal (or external) signal to disable the clock
This saves dynamic power
A must for low-power design
Creates new timing paths
Two Types of Clock Gating
Using AND gate
Using ICG Cell
The rest of this presentation covers ICG-based clock gating
Register to Register Path
Register to Register Path with Clock Gating
[Figure: register-to-register path with an ICG cell. Labels: CE path, D, EN, CE clk path, clock-gated clk path; delay annotations 1ns, 0.5ns, 1ns.]
What Is Different About the CE Path
Not noticed at Synthesis
Timing available is less than cycle time
ICG cells are not skew balanced with registers
Violations are seen only after Clock Tree Synthesis
Mostly affects timing critical blocks
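The points above can be illustrated with a small setup-slack calculation. This is a minimal sketch, not tool output: the function name `ce_setup_slack`, the 2ns cycle time, and the 1.7ns logic delay are assumptions chosen for illustration. The 1ns and 0.5ns clock latencies echo the annotations on the clock-gating figure earlier in the deck; assigning 1ns to the flops and 0.5ns to the ICG cell is also an assumption.

```python
def ce_setup_slack(cycle_time, launch_latency, capture_latency, data_delay, setup=0.0):
    """Setup slack (ns) for a path launched and captured by the same clock.

    required arrival = cycle_time + capture_latency - setup
    actual arrival   = launch_latency + data_delay
    """
    required = cycle_time + capture_latency - setup
    arrival = launch_latency + data_delay
    return required - arrival


# Hypothetical numbers (ns): a 2 ns cycle and 1.7 ns of logic delay.
CYCLE = 2.0
LOGIC_DELAY = 1.7

# Register-to-register path: both flops sit at the leaves of the clock tree (1 ns).
reg_slack = ce_setup_slack(CYCLE, launch_latency=1.0, capture_latency=1.0,
                           data_delay=LOGIC_DELAY)

# CE path after CTS: launched by a leaf flop (1 ns) and captured by an ICG cell
# that sits earlier in the tree (0.5 ns), so 0.5 ns of the cycle is lost.
ce_slack = ce_setup_slack(CYCLE, launch_latency=1.0, capture_latency=0.5,
                          data_delay=LOGIC_DELAY)

# At synthesis the clock is ideal (equal latency everywhere), so the same
# CE path looks as relaxed as the register-to-register path.
ce_slack_ideal = ce_setup_slack(CYCLE, launch_latency=0.0, capture_latency=0.0,
                                data_delay=LOGIC_DELAY)

print(f"reg-to-reg slack:      {reg_slack:+.2f} ns")
print(f"CE slack after CTS:    {ce_slack:+.2f} ns")
print(f"CE slack at synthesis: {ce_slack_ideal:+.2f} ns")
```

With these assumed numbers the CE path meets timing under an ideal clock and fails only once the ICG cell's earlier clock arrival is modeled, which is why the violation first appears after clock tree synthesis.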
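Effect of ICG Cells Location in Clock Tree
[Figure: ICG cell positions along a clock tree, with latency markers at 0ns (CLK), 0.25ns, 0.5ns, 0.75ns, and 1ns. Annotations: architectural gaters; potentially bad location for CE timing; acceptable location; good location.]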
Agenda
Basics of Clock Gating
Fixing Clock Enable Timing in RTL-2-GDSII Flow
Results
Conclusion
What to Do at RTL Level
CE signal should be generated in the same module
Generate CE signal from functionally related modules
Simplify the logic that generates CE signal
CE Timing at Synthesis Step
Reduce cycle time to ICG cells
set_clock_latency -(cycle_time/2) [get_pin all_clock_gating_registers/CK]
set_clock_latency [get_pin all_clock_gating_registers/ECK]
Set high setup time on ICG cells
set timing_scgc_override_library_setup_hold true
set_clock_gating_style setup 400ps clock_gate
Turn off bus sharing in Power Compiler
set_clock_gating_style no_sharing
CE Timing at Floorplan Step
When placing modules, pay attention to CE signal connectivity
If CE signals are input pins, place them close to the modules that receive them
[Figure: two floorplans showing CE signal connectivity, one labeled "CE timing problem" and one labeled "Good CE timing".]
CE Timing at Placement Step
Tightening available cycle time by changing ICG setup time
set timing_scgc_override_library_setup_hold true
set_clock_gating_style setup 400ps clock_gate
Tightening available cycle time by changing ICG clock latency
set_clock_latency -(cycle_time/2) [get_pin all_clock_gating_registers/CK]
set_clock_latency [get_pin all_clock_gating_registers/ECK]
CE Timing at Placement Step (cont.)
Create a path group and add extra weight
group_path -weight 5 -name CLOCK_ENABLE -to [get_cell */*GATE_LATCH]
Place ICG cells close to flops
set placer_disable_auto_bound_for_gated_clock false
How to Select Latency?
Apply global latency: easy, but not very efficient
Apply based on ICG depth and fanout: less depth, more latency; more fanout, more latency
Apply based on CTS results: more accurate
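The depth-and-fanout option can be sketched as a simple estimator. This is only an illustration of the two trends named on the slide (less depth gives more latency, more fanout gives more latency). The function `icg_latency_adjustment`, the per-level delay, the buffer fanout limit, the maximum tree depth, and the cap are all hypothetical; they are not values from the deck or from any tool.

```python
def icg_latency_adjustment(depth, fanout, max_depth=4, level_delay=0.1,
                           buffer_fanout=16, cap=None):
    """Estimate how much earlier (ns) the clock reaches an ICG cell than its flops.

    depth:  clock-tree levels between the clock root and the ICG cell
    fanout: number of flops driven by the ICG cell's gated clock
    All default parameters are assumed, illustrative values.
    """
    # An ICG high in the tree (small depth) leaves more levels below it.
    levels_from_depth = max(max_depth - depth, 0)

    # Buffer levels needed to drive the fanout, counted with integers only.
    levels_from_fanout = 0
    reach = 1
    while reach < fanout:
        reach *= buffer_fanout
        levels_from_fanout += 1

    adjustment = (levels_from_depth + levels_from_fanout) * level_delay
    return adjustment if cap is None else min(adjustment, cap)


# Hypothetical ICG cells; the cap of 0.5 ns stands in for cycle_time/2
# with an assumed 1 ns cycle.
for depth, fanout in [(1, 2000), (2, 200), (3, 20)]:
    adj = icg_latency_adjustment(depth, fanout, cap=0.5)
    print(f"depth={depth} fanout={fanout}: ICG clock latency of -{adj:.2f} ns")
```

Each estimate would then be applied to the corresponding ICG clock pin as a negative latency, in place of one global value. Numbers taken from an actual CTS run remain the more accurate source.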
CE Timing at Clock Tree Synthesis
Clone ICG Cells
set icg_cells { icg_cell_1 icg_cell_2 }
split_clock_net -objects [get_cells $icg_cells] \
-split_intermediate_level_clock_gates -gate_sizing
remove_ideal_network [all_fanout -flat -clock_tree]
remove_propagated_clock *
remove_clock_tree
ICG Cloning
CE Timing at Clock Tree Synthesis
Cloning based on fanout and slack
# Report ICG cells that qualify for cloning, based on fanout and CE slack
foreach_in_collection CELLS [get_cells * -hier -filter "ref_name =~ *ICG*"] {
  set names [get_object_name $CELLS]
  set ckPins [get_object_name [get_pins -of_object [get_cells $CELLS] \
    -filter "full_name =~ */CLK"]]
  set eckPins [get_object_name [get_pins -of_object [get_cells $CELLS] \
    -filter "full_name =~ */ENABLE_CLK"]]
  # Number of objects in the fanout of the gated-clock output pin
  set eckFanout [sizeof_collection [all_fanout -from [get_pins $eckPins] -flat]]
  # Setup slack at the enable pin
  set cgSlack [get_attribute [get_pins ${names}/ENABLE] max_slack]
  if {$cgSlack > -0.150 && $eckFanout > 100} {
    echo "${names}/E"
  }
}
remove_propagated_clock *
remove_clock_tree
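The selection rule used in the loop can also be prototyped outside the tool, for example on slack and fanout numbers exported from a timing report. This is a minimal sketch: the `IcgInfo` record, the `select_icgs` helper, and the sample data are hypothetical. Only the two thresholds (-0.150ns slack, fanout of 100) come from the slide.

```python
from dataclasses import dataclass


@dataclass
class IcgInfo:
    name: str        # hierarchical ICG cell name
    ce_slack: float  # setup slack at the enable pin, in ns
    fanout: int      # number of objects driven by the gated clock


def select_icgs(icgs, slack_limit=-0.150, fanout_limit=100):
    """Return names of ICG cells passing the same test as the Tcl loop:
    slack above slack_limit and fanout above fanout_limit."""
    return [i.name for i in icgs
            if i.ce_slack > slack_limit and i.fanout > fanout_limit]


# Hypothetical report data, for illustration only.
report = [
    IcgInfo("u_core/icg_a", ce_slack=-0.080, fanout=350),
    IcgInfo("u_core/icg_b", ce_slack=-0.300, fanout=500),
    IcgInfo("u_io/icg_c", ce_slack=0.020, fanout=40),
]
selected = select_icgs(report)
print(selected)
```

Keeping the thresholds as parameters makes it easy to see how many cells each combination of slack and fanout limits would pick before committing to a cloning run.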
CE Timing at Clock Tree Synthesis
Two-Pass Flow
1. Placement
2. Clone clock tree
3. Write Verilog
4. New placement
5. Clock tree synthesis
Agenda
Basics of Clock Gating
Problems Created by Clock Gating
Fixing Clock Enable Timing in RTL-2-GDSII Flow
Results
Conclusion
Die Temperature Without and With Clock Gating
ICG Cells and Flops Autobound
Comparing Latency Schemes
[Chart: CE violation (ns), from 0 to -0.6, plotted against path index (0 to 900) for three runs: baseline run, 1ns latency, and selective latency.]
Results: Effect of Cloning on Latency
[Chart: ICG clock latency (ns), from 0 to 1.8, plotted against paths (sorted, low to high; 0 to 1200), with cloning and without cloning.]
Clock Subtree After Cloning
Comparing Single-Pass and Two-Pass Flow
Single pass: place_opt, clock_opt
Two pass: place_opt, clock_clone, new place_opt, clock_opt
Different schemes to minimize latency
Conclusion
Clock gating is a requirement for low-power design
Closing CE timing requires attention at all stages of the design
By planning at every step, CE timing can be closed in high-speed, low-power designs
Thank You!
BACKUP SLIDES
Battery Life is Important
Smartphone power for continuous web access
http://www.phonesreview.co.uk/2012/09/26/iphone-5-vs-samsung-galaxy-s3-battery-life-confrontation/
How to Minimize Power
Use a process designed for low power
Use a low-power architecture
Use power gating
Use clock gating
Power Saving Opportunity
Clock Gating
Few Facts About Clock Tree Power
20% to 40% of dynamic power is consumed by the clock tree
About 80% of clock tree power is consumed in the last stages of the clock tree
Ref: ISLPED, 2008
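Combining the two figures gives a rough sense of how much of total dynamic power sits in the last clock-tree stages, which is the part that fine-grain gating can switch off. The helper name `last_stage_share` is hypothetical; the percentages are the ones quoted above.

```python
def last_stage_share(clock_tree_pct, last_stage_pct=80):
    """Percent of total dynamic power consumed in the last clock-tree stages."""
    return clock_tree_pct * last_stage_pct / 100


low = last_stage_share(20)   # clock tree takes 20% of dynamic power
high = last_stage_share(40)  # clock tree takes 40% of dynamic power
print(f"Last clock-tree stages: {low:.0f}% to {high:.0f}% of dynamic power")
```

By this arithmetic, roughly 16% to 32% of dynamic power is spent in the final stages of the clock tree.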
Architectural/Coarse-Grain Clock Gating
[Figure: USB_CLOCK gated separately for USB-0 and USB-1 by Clock_EN cells, with control logic generating the enables en_usb_0 and en_usb_1.]
Automated/Fine Grain Clock Gating
Example of Automated/Fine Grain Clock Gating
What To Look For In ICG
Too many flops used for generating the CE signal
Large delay in the combinational path
Generating flops placed away from ICG cells
Flops used to generate the ICG signal placed away from each other
Too many flops receive the gated clock
[Figure: highlights the flops generating the gated clock, the comb cells in the clock gating path, and the flops receiving the gated clock.]