Clocking Part 2
Clocking
For modern processors, cycle time is around 16-20 FO4 delays, of which registers take 2-4 FO4 delays.
Power consumption is dominated by clock load, both the distribution network and the end loads (latches, prechargers).
70% of total power in IBM POWER4 design
Simple single-edge triggered registers are fine for most ASIC designs. In this lecture we'll examine what is happening in high-performance designs.
6.371 Fall 2002
11/6/02
L18 Clocks Part 2
Edge Triggered Timing Constraints
[Figure: combinational logic with delay TPmin/TPmax between a register clocked by CLK1 and a register clocked by CLK2]
Slow path timing constraint: Tcyc ≥ TCQmax + TPmax + Tsetup + Tskew
worst case is when CLK2 is earlier than CLK1
Skew reduces the usable cycle time
Fast path timing constraint: TCQmin + TPmin ≥ Thold + Tskew
worst case is when CLK2 is later than CLK1
Fast path constraint cannot be fixed by slowing the clock: a violation is fatal to the chip design
Two Phase Latch Based Design
[Figure: Combinational Logic 1 and Combinational Logic 2 separated by latches clocked alternately by CLK1 and CLK2; waveforms show the non-overlap times between CLK1 and CLK2]
Divide cycle into two phases
Latches driven by two non-overlapping clocks
Can guarantee no fast path problems with larger non-overlap
Phase 2 latches can only sample values generated from phase 1 latch outputs, and vice versa.
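The edge-triggered slow path and fast path constraints stated above can be checked numerically. The following is a minimal sketch, not part of the lecture: the function names and all delay values (in picoseconds) are hypothetical and chosen only for illustration.

```python
# Illustrative check of the edge-triggered constraints; all delays in picoseconds.
# The numeric values below are made up for the example, not taken from the lecture.

def min_cycle_time(t_cq_max, t_p_max, t_setup, t_skew):
    """Slow path: Tcyc >= TCQmax + TPmax + Tsetup + Tskew."""
    return t_cq_max + t_p_max + t_setup + t_skew


def hold_margin(t_cq_min, t_p_min, t_hold, t_skew):
    """Fast path: TCQmin + TPmin >= Thold + Tskew. A negative margin is a violation."""
    return (t_cq_min + t_p_min) - (t_hold + t_skew)


t_cyc = min_cycle_time(t_cq_max=80, t_p_max=700, t_setup=40, t_skew=50)
margin = hold_margin(t_cq_min=40, t_p_min=20, t_hold=30, t_skew=50)

print(f"minimum cycle time: {t_cyc} ps")
print(f"hold margin: {margin} ps ({'OK' if margin >= 0 else 'VIOLATION'})")
```

Note that Tcyc does not appear anywhere in hold_margin, which is why a fast path violation cannot be repaired by slowing the clock.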
Two Phase Timing
[Figure: signals A, B, C, D passing through a CLK1 latch, Combinational Logic 1, a CLK2 latch and Combinational Logic 2; timing diagram labels TP1max, TP2max, TDQmax, TNO, Tx, Ty, Tz]
In steady state, Tz ≤ Tx, therefore minimum cycle time is Tcyc ≥ TP1max + TP2max + 2TDQmax
Non-overlap time, TNO, can be adjusted such that no hold time violations are possible: TNO + TCQmin - Tskew ≥ Thold
Time Borrowing
[Figure: Combinational Logic 1 between a CLK1 latch and a CLK2 latch, followed by Combinational Logic 2; timing diagram labels TCQmax, TP1max, Tsetup, TNO]
Can place latches where convenient in logic path
Maximum time in one combinational logic block is TP1max ≤ Tcyc - TCQmax - Tsetup - TNO - Tskew
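The two-phase relations above (minimum cycle time, the non-overlap needed to rule out hold violations, and the time-borrowing bound on a single logic block) can be combined into a small calculation. This is a sketch under assumed values: the function names and all delays (in picoseconds) are hypothetical, not from the lecture.

```python
# Illustrative two-phase latch timing calculations; all delays in picoseconds.
# The numeric values are invented for the example.

def two_phase_min_cycle(t_p1_max, t_p2_max, t_dq_max):
    """Tcyc >= TP1max + TP2max + 2*TDQmax (data flows through two latches per cycle)."""
    return t_p1_max + t_p2_max + 2 * t_dq_max


def min_non_overlap(t_hold, t_cq_min, t_skew):
    """Rearranged from TNO + TCQmin - Tskew >= Thold; never below zero."""
    return max(0, t_hold + t_skew - t_cq_min)


def max_block_delay(t_cyc, t_cq_max, t_setup, t_no, t_skew):
    """Time borrowing bound: TP1max <= Tcyc - TCQmax - Tsetup - TNO - Tskew."""
    return t_cyc - t_cq_max - t_setup - t_no - t_skew


t_cyc = two_phase_min_cycle(t_p1_max=400, t_p2_max=300, t_dq_max=50)
t_no = min_non_overlap(t_hold=30, t_cq_min=40, t_skew=50)
limit = max_block_delay(t_cyc, t_cq_max=60, t_setup=40, t_no=t_no, t_skew=50)

print(f"minimum cycle time: {t_cyc} ps")
print(f"non-overlap needed: {t_no} ps")
print(f"longest single logic block allowed: {limit} ps")
```

With these assumed numbers, one block could grow well beyond half the cycle as long as the two blocks together still fit the cycle-time bound, which is the point of time borrowing.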
Single Clock Latch Based Design
[Figure: Combinational Logic 1 and Combinational Logic 2 separated by latches clocked alternately by CLK and inverted CLK]
Two phase non-overlapping system requires distribution of two clocks.
Can distribute single clock signal, and invert locally at latch.
Clock skew can cause overlap between transparent phases of CLK and inverted CLK, so must check for fast path hold time violations.
Very common clocking scheme for full custom chips, works well with pipelined domino logic.
Pipelining Domino Logic
Domino circuits require monotonic change in input signal during evaluation phase; cannot easily guarantee this with most edge triggered devices.
Transparent latches allow setup of logic inputs before clock edge.
[Figure: domino gate (NMOS pulldown network, output X) with input Q, showing CLK, Q and X over the precharge and eval. phases. With an edge triggered input, the CLK-Q delay discharges the precharge node and leaves a degraded level; with a transparent latch, Q is set up before the clock edge.]
Pulse Latches
By using narrow clock pulses, can have only a single latch in any combinational loop.
Used in Cray-1, and in many high-performance (Pentium-4) and low-power (XScale) microprocessors.
[Figure: combinational logic between pulse latches A and B; CLK, A, B timing diagram labels Tw, TCQmin, TPmin, TPmax, Tsetup, Thold]
Cycle time: Tcyc,min ≥ TDQmax + TPmax + Tsetup + Tskew - Tw
Tw is pulse width, and gives maximum time borrowing for previous cycle
Two-sided timing constraint on pulse width:
Tsetup < Tw < TCQmin + TPmin - Thold - Tskew
Double-Edge Triggered Registers
Clock load of flip-flops is a significant fraction of total chip power.
Can cut clock frequency in half by using a double-edge triggered flip-flop.
[Figure: double-edge triggered flip-flop with input D, latches A and B, and output Q; timing diagram shows A and B alternating between sample and latch on opposite phases of CLK]
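Returning to the pulse latch slide, the two-sided constraint Tsetup < Tw < TCQmin + TPmin - Thold - Tskew defines a window of legal pulse widths, and that window can be empty. The sketch below computes it; the function name and the delay values (in picoseconds) are hypothetical and only illustrate the arithmetic.

```python
# Illustrative pulse-width window for a pulse latch; all delays in picoseconds.
# The numeric values are invented for the example.

def pulse_width_window(t_setup, t_cq_min, t_p_min, t_hold, t_skew):
    """Return (lower, upper) bounds on Tw, exclusive, or None if no Tw can work.

    Lower bound: Tw > Tsetup.
    Upper bound: Tw < TCQmin + TPmin - Thold - Tskew (hold at the end of the pulse).
    """
    lower = t_setup
    upper = t_cq_min + t_p_min - t_hold - t_skew
    return (lower, upper) if lower < upper else None


# A design whose shortest logic path is 100 ps leaves a usable window.
print(pulse_width_window(t_setup=40, t_cq_min=40, t_p_min=100, t_hold=30, t_skew=50))

# With a 20 ps shortest path the upper bound falls below the lower bound.
print(pulse_width_window(t_setup=40, t_cq_min=40, t_p_min=20, t_hold=30, t_skew=50))
```

In the second case no pulse width satisfies both sides, so under these assumed numbers TPmin (or the skew) would have to change before pulse latches could be used on that path.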
Pentium-4 Pulse Latches
Pentium-4 distributes 50% duty cycle global clock at advertised frequency (e.g., 2.8GHz Pentium-4 has 2.8GHz clock).
Fast ALU section of Pentium-4 runs at twice advertised clock frequency using pulse latches driven from both edges of the distributed clock.
Clock buffers have duty cycle correction circuitry to ensure 50% duty cycle.
[Figure: GCLK (2.8 GHz) and PCLK (5.6 GHz) waveforms]
Flip-Flops Timing
[Figure: flip-flop timing, from Stojanovic and Oklobdzija]
Crossing Time Domains
Common to have to communicate between logic blocks running at unrelated clock frequencies
[Figure: signal passing from the TCLK clock domain to a sampling latch in the RCLK clock domain; TCLK/RCLK waveforms mark a possible setup time violation and a possible hold time violation]
If setup and hold times are violated, flip-flops might hang in a metastable state.
Metastability
[Figure: latch with input D, CLK and a feedback loop; plot of voltage versus time showing the metastable level and the observation interval, t, after the clock edge]
Probability of failure (i.e., not valid 1 or 0) when observed time t after clock edge:
F(t) = k e^(-t/τr)
Parameters k and τr are functions of latch design.
τr is called the time constant of resolution and is primarily controlled by the gain-bandwidth product of the feedback loop (don't use dynamic latches as synchronizers!).
Error probability decreases exponentially with t, but there is always some chance of failure.
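Because the failure probability falls exponentially with observation time, the extra waiting time needed for a given improvement depends only on the time constant of resolution, not on k. The sketch below shows this; the function names are hypothetical, k = 1 is an arbitrary placeholder, and the 50 ps time constant is used only as an illustrative value.

```python
import math

# Illustrative use of F(t) = k * exp(-t / tau_r); times in picoseconds.
# k = 1.0 is a placeholder, since k depends on the latch design.

def failure_probability(t, k, tau_r):
    """Probability that the latch output is still unresolved at time t."""
    return k * math.exp(-t / tau_r)


def extra_time_for_reduction(tau_r, factor):
    """Extra observation time that divides the failure probability by `factor`."""
    return tau_r * math.log(factor)


tau_r = 50.0
dt = extra_time_for_reduction(tau_r, 1e6)
ratio = failure_probability(500.0, 1.0, tau_r) / failure_probability(500.0 + dt, 1.0, tau_r)

print(f"extra time for a 10^6 reduction: {dt:.1f} ps")
print(f"measured reduction factor: {ratio:.3g}")
```

Each additional time constant of waiting buys another factor of e, so a modest increase in observation time gives a very large reduction in failure probability.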
Metastability Failure Calculations
ff = tW fT fC e^(-t/τr)
Frequency of failures for sampling window (setup+hold) of tW, sampling frequency fC and input transition frequency of fT
For 1GHz sampling clock, 100MHz transitions, 50ps setup+hold, 50ps time constant, 950ps observation time: ff = 0.03Hz (Mean Time Between Failures: 33 seconds)
Increase observation time to 1950ps (two cycles): ff = 5.8x10^-11 Hz (MTBF 550 years)
Increase observation time to 2950ps (three cycles): ff = 1.2x10^-19 Hz (MTBF 266 billion years)
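The three cases above can be reproduced directly from the failure-frequency formula. This is a minimal sketch using the parameter values quoted on the slide; the function name and the 365.25-day year are assumptions of the example.

```python
import math

# Reproduces the slide's metastability failure calculations (SI units).
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed year length for the MTBF conversion


def failure_rate(t_w, f_t, f_c, t_obs, tau_r):
    """ff = tW * fT * fC * exp(-t / tau_r), in failures per second."""
    return t_w * f_t * f_c * math.exp(-t_obs / tau_r)


for t_obs_ps in (950, 1950, 2950):
    ff = failure_rate(
        t_w=50e-12,            # setup + hold window
        f_t=100e6,             # input transition frequency
        f_c=1e9,               # sampling clock frequency
        t_obs=t_obs_ps * 1e-12,
        tau_r=50e-12,          # time constant of resolution
    )
    mtbf_s = 1.0 / ff
    print(f"t = {t_obs_ps} ps: ff = {ff:.2e} Hz, "
          f"MTBF = {mtbf_s:.3g} s = {mtbf_s / SECONDS_PER_YEAR:.3g} years")
```

The first case evaluates to about 0.028 Hz, which the slide rounds to 0.03 Hz before quoting an MTBF of 33 seconds; the other two cases match the slide's figures after rounding.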
Synchronizers
Use pipelined registers to give full RCLK cycle to resolve asynchronous input.
[Figure: input from the TCLK domain passing through pipelined registers clocked by RCLK]
Use N interleaved registers, each clocked at 1/N of RCLK rate, to increase resolution interval by factor of N without decreasing signal bandwidth.
[Figure: input from the TCLK domain sampled by interleaved registers clocked by CLKA, CLKB and CLKC, with a rotating select feeding a register clocked by RCLK; waveforms of RCLK, CLKA, CLKB and CLKC mark the observation interval and the repeat interval]