
Wide Range CMOS ADPLL in 65nm SOI

This document summarizes a research paper that describes an all-digital phase locked loop (ADPLL) fabricated in a 65 nm silicon-on-insulator CMOS process. Key aspects of the design include a bang-bang phase/frequency detector, a compact digital loop filter that uses the overflows and underflows of its arithmetic units, and a digitally controlled ring oscillator built from arrays of tri-state inverters. The ADPLL locks from 500 MHz to 8 GHz at 1.3 V and from 90 MHz to 1.2 GHz at 0.5 V. It consumes 8 mW/GHz at 1.2 V, and the synthesized 4 GHz clock has a period jitter of 0.7 ps rms.


A wide power supply range, wide tuning range, all static CMOS all digital PLL
in 65 nm SOI

Article in IEEE Journal of Solid-State Circuits · February 2008


DOI: 10.1109/JSSC.2007.910966 · Source: IEEE Xplore



42 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 43, NO. 1, JANUARY 2008

A Wide Power Supply Range, Wide Tuning Range, All Static CMOS All Digital PLL in 65 nm SOI
José A. Tierno, Alexander V. Rylyakov, Member, IEEE, and Daniel J. Friedman, Member, IEEE

Abstract—An all static CMOS ADPLL fabricated in 65 nm digital CMOS SOI technology has a fully programmable proportional-integral-differential (PID) loop filter and features a third order delta sigma modulator. The DCO is a three stage, static inverter based ring oscillator programmable in 768 frequency steps. The ADPLL lock range is 500 MHz to 8 GHz at 1.3 V and 25 °C, and 90 MHz to 1.2 GHz at 0.5 V and 100 °C. The IC dissipates 8 mW/GHz at 1.2 V and 1.6 mW/GHz at 0.5 V. The synthesized 4 GHz clock has a period jitter of 0.7 ps rms, and long term jitter of 6 ps rms. The phase noise under nominal operating conditions is -112 dBc/Hz measured at a 10 MHz offset from a 4 GHz center frequency. The total circuit area is 200 µm × 150 µm.

Index Terms—Bang-bang phase and frequency detectors, digital phase locked loops, phase locked loops.

Manuscript received June 17, 2007; revised September 10, 2007.
The authors are with the IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598 USA (e-mail: tierno@us.ibm.com).
Digital Object Identifier 10.1109/JSSC.2007.910966

I. INTRODUCTION

Complex digital circuits such as microprocessors typically require support circuitry that has traditionally been realized using analog or mixed-signal macros. A critical example of such a support circuit is the phase locked loop (PLL). The realization of the PLL using a traditional analog architecture places demands on the underlying process technology that are quite different from those driven by high speed logic requirements. Analog PLLs typically require elements not used in standard logic, including resistors and low leakage capacitors, and rely on properties not critical to standard logic circuits, such as matching and output impedance uniformity. Furthermore, analog PLLs may use logic families other than static CMOS, such as current mode logic (CML). As process technologies advance and grow in complexity, the challenge of maintaining required analog elements and performance for use in circuits such as PLLs grows. Furthermore, because the analog PLL uses elements, device properties, and logic families unlike those of the chip’s digital core, its yield and performance spreads may not be well-correlated to those of the digital portions of the design.

A number of digital PLLs have been described in the literature [1]–[3], with target application spaces ranging from microprocessors to cellular telephone chipsets. Key DPLL design issues include how the phase error between reference and feedback clock is quantified, the structure of the loop filter, whether an analog (via digital-to-analog conversion of the filter output) or direct digital control of the VCO is used, and the choice of oscillator topology.

In this paper, we describe a PLL realization that is digital, thus avoiding the issues associated with analog PLLs intended for use in predominantly digital chips. Because the digital properties of the technology’s underlying devices tend to degrade more gracefully with reduced supply voltage than do the critical analog properties, this PLL may also be well-suited to applications demanding ultra-low supply voltages. Key elements of the design include a bang-bang phase/frequency detector, the use of underflows and overflows from the filter arithmetic units to enable a compact filter/DCO control implementation, and a ring oscillator comprised of multiple digitally controlled units.

The paper is organized as follows. Section II describes the architecture and schematic-level design of the DPLL. Section III describes the physical design of the circuit. In Section IV, hardware measurements of the DPLL operating in various modes and supply voltage domains are presented. Section V presents a summary and conclusions regarding this work.

II. ARCHITECTURE

Many features of digital PLLs described in the literature to date are not critical for the requirements of clock generation systems for large-scale digital circuits such as microprocessors or ASICs. When well-controlled bandwidth is required, as for wireless applications, a multi-bit time-to-digital converter (TDC) is often used [2], [11]. If short lock times are required, an explicit digital-to-analog converter (DAC) and a binary to thermometer encoder may be implemented [4], [14]. In particular, a short lock time can be achieved by the use of a specialized digital search algorithm and a control scheme that enables direct manipulation of the DAC. Fast lock can also be achieved using a dual-loop architecture (one loop for acquiring frequency, one loop for acquiring phase), with different filter characteristics supporting the phase and frequency acquisition loops, respectively [4]. These architectural choices significantly increase complexity, area, and power without providing commensurate benefit for the clocking applications targeted by the work described here.

The proposed ADPLL is intended for use in large scale digital chip clock generation applications. In these applications, the critical specification is peak to peak period jitter, with a secondary requirement that spread spectrum clocking be supported. The realized ADPLL uses a single-loop architecture, based on a self-timed, bang-bang phase and frequency detector (BB-PFD). Note that the use of a BB-PFD is acceptable in this design because of the relaxed bandwidth and noise specifications of the target application; this design point does not require a multi-bit TDC. The ADPLL also does not have an explicit DAC or binary to thermometer encoder, instead relying on direct digital
0018-9200/$25.00 © 2008 IEEE
TIERNO et al.: A WIDE POWER SUPPLY RANGE, WIDE TUNING RANGE, ALL STATIC CMOS ALL DIGITAL PLL IN 65 nm SOI 43

Fig. 1. Top-level diagram of the ADPLL.

Fig. 2. DCO formed by pairing a DAC and a VCO.

to frequency conversion in the DCO through the use of tri-state inverters and implicit binary to thermometer conversion via shifts in DCO row and column controls. Finally, the proposed ADPLL’s loop filter operates exclusively on a subset of the bits that define the frequency control word, thus reducing the filter’s size and power. The key new features of the proposed ADPLL architecture are: a self-timed, bang-bang phase and frequency detector; a loop filter operating on only a portion of the bits defining the frequency control word; and a DCO implemented with an array of tri-state inverters.
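The idea of a filter that works only on the low-order bits, with shifts of the DCO row and column controls standing in for the high-order bits, can be illustrated with a short behavioral sketch. The five-bit width and the 16-row by 48-column controllable array are taken from Sections II-A and II-B; the class name, the function names, and the clamping at the ends of the range are assumptions made for illustration, and the code models the arithmetic only, not the latch-level circuit.

```python
ACC_BITS = 5              # filter arithmetic width stated in Section II-B
ACC_MOD = 1 << ACC_BITS   # 32
COLS = 48                 # columns in the inverter array (Section II-A)
ROWS = 16                 # rows under row/column control (Section II-A)
MAX_COUNT = ROWS * COLS   # 768 controllable inverters


class SplitAccumulator:
    """Hypothetical model: a 5-bit binary accumulator plus a thermometer count."""

    def __init__(self, count=MAX_COUNT // 2, acc=0):
        self.count = count  # number of enabled inverters (high-order part)
        self.acc = acc      # explicit low-order accumulator inside the filter

    def update(self, delta):
        """Add a signed increment with |delta| < 32; a carry or borrow shifts the DCO."""
        total = self.acc + delta
        if total >= ACC_MOD:      # overflow: enable one more inverter
            self.count = min(self.count + 1, MAX_COUNT)  # clamping is an assumption
            total -= ACC_MOD
        elif total < 0:           # underflow: disable one inverter
            self.count = max(self.count - 1, 0)
            total += ACC_MOD
        self.acc = total

    def value(self):
        """Equivalent wide accumulator value."""
        return self.count * ACC_MOD + self.acc


def row_col(count):
    """Map an enabled-inverter count to (fully-on rows, inverters on in the partial row)."""
    return divmod(count, COLS)


if __name__ == "__main__":
    s = SplitAccumulator(count=100)
    for d in (7, 7, 7, 7, 7, -20, -20):
        s.update(d)
        print(d, s.count, s.acc, row_col(s.count))
```

Away from the ends of the range, `value()` tracks what a single wide accumulator would hold, which is the sense in which the DCO control acts as the most significant bits of the integrator.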
Fig. 3. DCO formed by digitally controlling a physical parameter of an oscillator.

Fig. 1 shows the top level diagram of the ADPLL. A bang-bang phase and frequency detector (PFD) is used to compare the arrival times of the reference and divided clock edges. The resulting early/late information is filtered through a digital proportional-differential-integral filter operating at the divided frequency. The output of the filter is divided into most significant bits (MSB), which are sent directly to the digitally controlled oscillator (DCO), and the least significant bits (LSB), which are sent to a third-order sigma-delta modulator and are used to enhance the frequency resolution of the DCO. The output of the DCO is divided down in two stages. The first stage is used to generate a local clock that goes to all of the digital logic, while the second stage is used to generate a clock gating signal that further reduces the effective clock frequency, and is used by the loop filter and the PFD. A duty cycle correction buffer (DCC) at the output of the DCO is used to compensate for asymmetries of the DCO output drivers.

A. Digitally Controlled Oscillator

Without a doubt, the most critical component of the ADPLL is the digitally controlled oscillator (DCO). If the noise performance of the free-running oscillator is extremely poor, it will not be possible to realize a PLL meeting any reasonable performance specifications using that oscillator. Also, the tuning range and the mechanism for tuning the oscillator directly affect its usefulness in ASIC and microprocessor applications.

Two main approaches to realizing the DCO are taken in the literature. The first one, shown in Fig. 2, is a hybrid, where a traditional VCO is controlled digitally by a digital to analog converter (DAC) [4], [5]. This approach offers a significant advantage in that it enables the immediate use of well known and understood VCO and DAC structures with minimal redesign. The key drawback of the DAC-plus-VCO oscillator is that both of these components introduce strong performance dependencies on their analog behavior, limiting a key benefit of implementing a digital PLL architecture. The DAC based design thus demands accurate analog models for the technology, always a difficult challenge to meet, and typically impossible early in the technology cycle. Furthermore, the analog dependence of this design style makes portability to newer or different technologies less direct, leading to extensive manual intervention if a remap is required.

The second approach, shown in Fig. 3, is to digitize the mechanism for varying the frequency. In an LC-tank based VCO, for example, the tank varactor would be broken into a number of smaller varactors, and the capacitance of the tank adjusted by fully turning on or off these varactors one at a time [2], [6]. In an inverter-ring oscillator based DCO as used in the work described here, the inverters that comprise the ring are divided into addressable component structures and the effective strength of the composite inverters adjusted by increasing or decreasing the number of enabled transistors that form each stage of the

Fig. 4. Tri-state inverter based DCO.

ring. As a drawback, this approach demands a much larger oscillator than that of the hybrid approach. On the other hand, the resulting ring oscillator is still smaller than the DAC required in the first hybrid approach, and instead of spending power on an accurate, high-resolution DAC, power is spent on the ring oscillator, which has the side benefit of yielding improved open-loop phase noise characteristics.

Fig. 4 shows the structure of the DCO used in the DPLL implementation described here. Each stage of this three-stage ring oscillator comprises 271 tri-state CMOS inverters connected in parallel, yielding a total of 813 inverters. As more inverters in each stage are turned on by the control blocks of the DCO, the current driving strength of the stage increases while its capacitive load remains essentially constant, resulting in an increase in the output frequency.

The tri-state inverters are arranged in an array of 17 rows by 48 columns, yielding a total of 816 inverters. Each column is assigned to a single phase of the ring oscillator, with adjacent columns assigned to adjacent phases. For example, columns 0, 3, 6, ... are assigned to the first phase, and columns 1, 4, 7, ... are assigned to the second phase, etc. Inverters are turned on one at a time by rows, creating over 768 discrete frequency steps. Note that not all inverters can be on at the same time; 36 inverters are set to be always on, and three inverters are dedicated drivers for the oscillator output.

The first 16 rows of the inverter array are turned on/off by a row/column pseudo-thermometer control (PTC) [1], [7]. On the periphery of the array, a set of column control latches determines which inverters in the partially turned on row are on, and which are off. A set of row control latches divides the rows into fully turned on (if the row above the current row is also fully or partially turned on), partially turned on (if the current row is turned on, and the row above it is turned off), and fully turned off. When the loop filter requests an increase in the output frequency, the column control latches are shifted, turning on one more inverter in the partially turned on row as long as such an inverter is available. When the current row is fully turned on, the row control is shifted, so that the next available row becomes the partially turned on row. Likewise, when the loop filter requests that the frequency be reduced, the column and row controls are shifted in the reverse direction. To avoid having to flip all of the latches in the column control every time a change in which row is partially on occurs, adjacent rows are controlled by opposite polarities of the column control. When neither a decrease nor an increase is requested, a clock gating signal is generated to reduce the power dissipation (and potentially noise) generated by the row/column control.

Part of the top row of the DCO is directly controlled by the dithering signals coming from the sigma delta modulator, and another part of that row by the latency bypass signals coming

Fig. 5. Loop filter block diagram.

Fig. 6. Third order MASH sigma delta modulator.

from the loop filter (UnderflowP and OverflowP in Fig. 5). Another three inverters in the top row are used to drive the output, one for each phase to preserve load symmetry among the phases. The rest of this row (36 inverters) is permanently turned on to provide reliable startup and operation when none or very few of the inverters in the remaining rows are turned on.

The frequency dynamic range of the DCO is roughly 16-to-1. Because the inverter sizes are uniform in this implementation, however, the frequency steps are relatively large in the bottom quarter of the range, implying that operating performance at low fill factors (fill factor is the ratio of inverters that are turned on to the total number of inverters) will be degraded. With this constraint in mind, the effective dynamic range (the range over which acceptable PLL performance can be obtained) of the DCO is approximately 4-to-1, sufficient to accommodate process, temperature, and supply related variations in DCO center frequency; therefore, this design does not include separate frequency band controls.

B. Loop Filter

The loop filter, shown in Fig. 5, is a programmable, discrete time proportional integral differential (PID) filter that operates at the divided output clock frequency. When the ADPLL is locked, the loop filter operates at the same frequency as the reference clock. In lock, an output is computed for every reference cycle. All operations are performed using five bits of resolution. Underflows and overflows are passed to the DCO control for further accumulation.

The integral section of the loop filter accumulates the error coming from the PFD multiplied by a programmable integration constant. When this five bit accumulator overflows or underflows, a corresponding signal is asserted, and the DCO control increases or decreases the output frequency of the oscillator. The DCO control effectively implements the most significant bits of the accumulator, for a total of equivalent bits. The integration is performed in two pipelined adders, the first of which is explicitly realized in the loop filter using a binary representation, and the second of which is implicitly realized in the DCO using a pseudo-thermometer code.

The proportional-differential section of the loop filter adds a proportional and differential path to the output of the accumulator, still with five bit arithmetic. If there is an overflow or underflow from this operation, these signals are used to turn on (or off) one of the dithering inverters in the DCO, thus affecting the frequency of the oscillator for the current reference clock cycle only. These signals are not accumulated by the DCO control. The proportional-differential control is applied after the integration takes place, acting to lower the latency of the proportional path to the oscillator. The quantity obtained by adding the output of the integrator and that of the proportional-differential section represents the fraction of an inverter that should be enabled; we will refer to this quantity as the fractional frequency, as it encodes a step size that is a fraction of a minimum DCO discrete step (one inverter). This signal is passed on to the sigma-delta modulator which converts the target fractional value into controls for the rest of the dithering inputs of the DCO.

The integral, proportional and differential constants of the filter are encoded using four bits each, which limits the ratio of the largest proportional constant to the integral constant to 15 to 1. Depending on the PLL’s desired frequency multiplication factor, this 15 to 1 ratio may not be enough; clearly, larger constants could be implemented in revised versions of such a design.

C. Sigma Delta Modulator

The sigma delta modulator was implemented using a programmable, third order, MASH architecture [1], [9], shown in Fig. 6. It can be configured to operate as a first, second, or third order element, or can be altogether turned off by using clock gating on each of the sections of the modulator. This sigma delta modulator can be run at either one fourth or one eighth of the output frequency. As in the case of the loop filter, all arithmetic is done using five bits of resolution. The sigma delta modulator is used to encode the fractional frequency generated by the loop filter into dithering signals for the DCO, effectively increasing its frequency resolution. Higher order sigma delta modulators are used to push the residual quantization noise into higher frequencies, which are then rejected by the low pass characteristic of the ADPLL closed loop transfer function.
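As an illustration of how a third order MASH modulator turns a five-bit fractional word into a dither sequence, the following is a minimal behavioral sketch of a textbook MASH 1-1-1 built from three cascaded accumulators. It is not taken from the paper's schematics: the function name, the constant-input assumption, and the choice to return a single signed word per clock (rather than individual inverter controls) are all illustrative assumptions.

```python
def mash111(x, n_steps, bits=5):
    """Textbook MASH 1-1-1 with `bits`-wide accumulators (illustrative model).

    x is a constant fractional input with 0 <= x < 2**bits.
    Returns one signed output word per clock.
    """
    mod = 1 << bits
    a1 = a2 = a3 = 0        # accumulator residues of the three stages
    c2_d1 = 0               # stage-2 carry delayed by one clock
    c3_d1 = c3_d2 = 0       # stage-3 carry delayed by one and two clocks
    out = []
    for _ in range(n_steps):
        c1, a1 = divmod(a1 + x, mod)    # stage 1 accumulates the input
        c2, a2 = divmod(a2 + a1, mod)   # stage 2 accumulates stage-1 residue
        c3, a3 = divmod(a3 + a2, mod)   # stage 3 accumulates stage-2 residue
        # Carry of stage 1, first difference of stage 2, second difference of stage 3.
        y = c1 + (c2 - c2_d1) + (c3 - 2 * c3_d1 + c3_d2)
        out.append(y)
        c2_d1 = c2
        c3_d2, c3_d1 = c3_d1, c3
    return out


if __name__ == "__main__":
    seq = mash111(5, 3200)
    print(min(seq), max(seq), sum(seq) / len(seq))
```

In this model the output word takes values between -3 and 4, that is, eight levels, and its long-run average approaches x/32. Eight levels are what seven unit-weighted on/off controls can express, which is consistent with the seven dithering bits the paper describes for this modulator.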

Fig. 7. Self-timed, bang-bang phase and frequency detector.

Fig. 8. Mutual exclusion element with metastability filter.

The sigma delta modulator operates at the pre-scaled clock frequency (the divide-by-N clock shown in Fig. 1) and oversamples the output of the loop filter by the clock division ratio M (as shown in Fig. 1). The use of phase holds to implement the clock division facilitates the handoff of data from the divided clock to the pre-scaled clock, since all latches operate on the same clock edge.
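The handoff between the slow (gated) side and the fast (pre-scaled) side can be sketched in a few lines. This is a simplified behavioral model under stated assumptions: the function names are hypothetical, the position of the de-asserted clock within each group of M is an arbitrary modeling choice, and register latency is ignored.

```python
def phase_hold(n_clocks, m):
    """Return one boolean per pre-scaled clock: True while the hold is asserted.

    The hold is de-asserted (False) on one pre-scaled clock out of every m.
    """
    return [(k % m) != 0 for k in range(n_clocks)]


def oversample(filter_outputs, m):
    """Feed a slow stream into a consumer that runs on every pre-scaled clock.

    filter_outputs: values the loop filter would produce, one per gated clock.
    Returns the value the fast side sees on each pre-scaled clock.
    """
    it = iter(filter_outputs)
    held = None
    seen = []
    for hold in phase_hold(len(filter_outputs) * m, m):
        if not hold:           # gated clock fires: the filter output updates
            held = next(it)
        seen.append(held)      # the fast side samples on every clock edge
    return seen


if __name__ == "__main__":
    print(oversample([3, 7], 4))
```

Because both sides are clocked from the same pre-scaled edge and differ only in gating, each filter output is simply repeated M times on the fast side, with no separate clock-domain crossing to model.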
Fig. 9. VHDL simulation of frequency capture and phase lock for the closed loop PLL.

The sigma delta modulator generates seven bits: carry out for the first order, differentiated carry out for the second order (two bits), and double differentiated carry out for the third order (four bits). These seven bits are directly connected to the tri-state control inputs of seven inverters in the DCO.

D. Phase and Frequency Detector

Fig. 7 shows the self-timed, bang-bang phase and frequency detector used in the ADPLL. The two input latches are used to detect the arrival of an edge on the reference and feedback clocks, respectively. A mutual exclusion element determines which of the two edges arrived first, and stores the result in a set-reset flip flop. A self timed reset loop determines that all events have taken place, and generates a reset pulse that prepares the PFD for future edges of the reference and feedback clocks.

The proposed BB-PFD has an infinite dynamic range in terms of phase difference between the inputs. Unlike a linear PFD as would be used in a typical analog PLL, the proposed BB-PFD output remains either high or low for an entire reference clock cycle, not for a time proportional to the phase difference between the feedback and reference clocks. The BB-PFD is a state machine that, having detected an edge at one input, waits for a transition on the second input, ignoring any additional edges that might arrive at the first input. The transfer function of the BB-PFD does not have periodicity, instead providing absolute lead-lag information at its output for all phase differences appearing at its inputs. As a result, the transition dynamics are very smooth and cycle slipping behavior is suppressed, as can be seen in Fig. 9.

The mutual exclusion element (mutex), shown in Fig. 8 [10], consists of two cross-coupled nor-gates, followed by a metastability filter. The inputs to the nor-gates are normally high, forcing the outputs of the nor-gates to “00”, and the output of the metastability filter to “11.” When one of the inputs transitions from “1” to “0,” the corresponding nor-gate output transitions to “1” and the corresponding output of the metastability filter transitions to “0.” The mutex is now in a locked state, and the late arriving transition is ignored. When both transitions arrive at roughly the same time, the cross-coupled nor-gates may go into a metastable state, generating a poorly defined output level. The metastability filter will not propagate signals to the output until the two outputs of the nor-gates differ by at least an NFET threshold voltage. Once this occurs, the metastable state is extinguished, and a valid logic signal is provided to the subsequent latch. The metastability filter thus prevents the propagation of not fully regenerated signals.

The output of the mutex sets a set-reset flip-flop, thus generating a completion signal at the end of the process. The completion signal indicates that the value that the mutex is trying to write in the flip-flop agrees with the value in the flip-flop, at which point the write operation has succeeded. This completion signal, together with the output of both edge detectors, is collected in a Muller C-element [13]. When all three signals are high, the C-element will transition to a high state, and generate a reset pulse that will clear the edge detector latches. As a result, all of the signals in the PFD will be reset to their default state, and the C-element will transition back to low, allowing processing of the next set of input edges.

This detector works both as a frequency detector and a phase detector as is shown in the simulated capture waveform of Fig. 9. During the frequency capture period, the PFD output indicates whether the reference or feedback clock frequency is higher. Once the two frequencies are sufficiently close, the PFD indicates, with some amount of delay, the leading phase. The delay is due to the time it takes for the fast edge to overtake the slow edge. During this period the PLL is trying to capture the right phase, and from then on the PLL is locked. Note that configuring

Fig. 10. Clock divider timing diagram.

Fig. 11. Floorplan of the ADPLL in 65 nm SOI. Layers through M3 shown.

the ADPLL with a larger integration constant makes the frequency capture time shorter, but makes the phase capture time longer [12].
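The state-machine behavior of the BB-PFD described in this section (latch the first edge, ignore repeats on that input, decide when the other input arrives, then reset) can be sketched at the event level. This is an illustrative model only: the function name and the +1/-1 output encoding are assumptions, edge times are idealized numbers, and the mutex and metastability filter are not modeled, so simultaneous edges are resolved by sort order rather than by circuit behavior.

```python
def bb_pfd(ref_edges, fb_edges):
    """Event-level bang-bang PFD sketch.

    ref_edges, fb_edges: iterables of edge arrival times.
    Returns a list of decisions: +1 if the reference edge came first,
    -1 if the feedback edge came first.
    """
    events = sorted([(t, "ref") for t in ref_edges] +
                    [(t, "fb") for t in fb_edges])
    decisions = []
    first = None                   # input that fired first in this comparison
    for _, src in events:
        if first is None:
            first = src            # edge detector latches the first arrival
        elif src != first:
            decisions.append(+1 if first == "ref" else -1)
            first = None           # self-timed reset re-arms both detectors
        # A repeated edge on the already-latched input is ignored.
    return decisions


if __name__ == "__main__":
    # Reference running at twice the feedback rate: every decision is +1.
    print(bb_pfd(range(8), [0.5, 2.5, 4.5, 6.5]))
    # Equal rates, feedback lagging by a fixed phase: again +1 each cycle.
    print(bb_pfd([0, 10, 20], [1, 11, 21]))
```

The first usage line shows the frequency-detector role: when one input is consistently faster, its edge always arrives first after a reset, so the output sign never flips. The second shows the phase-detector role once the frequencies match.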
E. Clock Divider

Clock division is performed in two steps. The first step is a straight clock division implemented using latches in a toggling configuration. This division step pre-scales the clock either by four or by eight, and is used to provide a clock that will be slow enough for the loop logic to perform properly.

The second division step uses the pre-scaled clock to generate a “phase hold” signal. This signal is de-asserted one pre-scaled clock out of M, where M is a number between one and eight. The phase hold signal is used to gate the pre-scaled clock going to the loop filter and the PFD, effectively creating a slower clock. The timing relationship among these signals is shown in Fig. 10.

The main advantage of the phase hold technique is to allow data to move cleanly between the loop filter and the sigma delta modulator as both sending and receiving latches operate on the same clock edge. The data out of the loop filter is updated on the positive edge of the pre-scaled clock every time that the phase hold is de-asserted, but it is used by the sigma delta modulator on every positive edge of the pre-scaled clock.

F. Duty Cycle Correction Buffer

The inverters in the DCO array were sized to have the same fall and rise time (to obtain fairly symmetric waveforms), and thus minimize the impulse sensitivity function of the oscillator [8]. Process variations can affect this cycle time, and so can the clock distribution network of a digital circuit. It is customary to add duty cycle correction buffers to fine tune the duty cycle at the latch clock input, so as to maximize the available cycle time. The duty cycle correction buffer (DCC) that we implemented here consists of a CMOS buffer where the strength of both NFET and PFET can be controlled with a 4-bit binary input. This buffer is connected to the output drivers through high gain single-ended to differential converters, to compensate for the loss in rise and fall time that occurs when either the PFET or NFET of the buffer is reduced in strength.

III. PHYSICAL DESIGN

This design was implemented using exclusively CMOS gates, almost all of them from a standard cell library. The tri-state inverters of the DCO were manually laid out to take advantage of the high regularity of the DCO array. A basic layout containing 6 tri-state inverters and the corresponding control circuitry was used as a tile and replicated 128 times, with most connections resulting from the abutment of the tiles. The metastability filter for the PFD also required custom layout, since it was not available in the standard cell library. The clock pre-scaler used a custom divider. Custom I/O blocks were also implemented to enable testing, and these blocks are not CMOS, but CML drivers and receivers for the high speed signals. These last blocks were included only for test, and thus were not included in area and power dissipation assessments given below.

Even though the logic is, in principle, synthesizable, most of it was designed using schematic entry. This custom logic design took longer than a synthesis path would have taken, but it greatly facilitated the task of placing the standard cells around the DCO array. After custom placement, circuits were auto-routed, except in cases where symmetry had to be maintained (as in the PFD, for example).

Two versions of this circuit were realized. The first version uses exclusively high threshold voltage (HVT) transistors, and is a more conservative design. The second version uses exclusively regular threshold voltage (RVT) transistors, and has better performance in the high end of the frequency range, at the expense of more leakage power. Both designs were fabricated and tested.

Fig. 11 shows a plot of the final layout, excluding the I/O blocks and three-wire interface used to program the ADPLL in its various modes. The total area of the ADPLL is 200 µm × 150 µm.

IV. HARDWARE MEASUREMENTS

Fabricated hardware was tested by probing wafers or bare die, enabling extensive measurement and characterization of the relevant parameters of the ADPLL, including tuning range, period jitter, accumulated jitter, and phase noise. All results were checked against the design model in simulation, to help in the interpretation of hardware results. The digital nature of the circuit greatly helped in making the model to hardware correlation very tight.

A. Tuning Range

The tuning curve was measured by setting the latches that control the DCO to specific values, and then recording the output frequency. This operation was repeated at various power

Fig. 12. Measured tuning curves for the DCO, HVT devices, at 100 °C.

Fig. 13. Measured tuning curves for the DCO at near threshold voltage power supply, for RVT devices at 25 °C.

Fig. 14. Measured tuning curves for the DCO at fixed ivdda current, for RVT devices at 100 °C.

supply and temperature settings, to obtain a family of tuning curves, partially shown in Fig. 12.

Under nominal conditions (HVT transistors, 100 °C, analog power supply ), the ADPLL exhibited a tuning range of 300 MHz to 5 GHz, in line with the expected 16:1 tuning range ratio. To obtain the high end of the tuning range curves, RVT parts were evaluated with , at 25 °C. In this case, the maximum frequency exceeded 8 GHz when all inverters in the DCO were turned on.
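A rough way to relate these numbers to the array structure is an idealized model in which the output frequency is proportional to the number of enabled inverters (drive strength grows while the load stays fixed, as described in Section II-A). The sketch below is that idealization only: the proportionality, the function names, and the use of the whole 816-inverter array as the denominator are assumptions, and the measured curves need not follow it exactly.

```python
N_TOTAL = 816  # 17 rows x 48 columns, from Section II-A


def dco_frequency(n_on, f_max, n_total=N_TOTAL):
    """Idealized tuning curve: frequency proportional to enabled inverters."""
    return f_max * n_on / n_total


def relative_step(n_on):
    """Fractional frequency change from enabling one more inverter (ideal model)."""
    return 1.0 / n_on


if __name__ == "__main__":
    # Ratio of the measured end points quoted for the nominal HVT case.
    print("measured ratio:", 5e9 / 300e6)
    # Step size relative to the operating frequency at several fill factors.
    for fill in (1 / 16, 1 / 4, 1 / 2, 1.0):
        n = round(fill * N_TOTAL)
        print(f"fill {fill:.3f}: n_on={n}, relative step={relative_step(n):.4%}")
```

Under this model the relative step shrinks as 1/n, so it is sixteen times coarser at one sixteenth fill than with all inverters on. That is one way to read the earlier remark that uniform inverter sizes make the steps relatively large in the bottom quarter of the range.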
Fig. 15. DCO gain as a function of the output frequency for regulated voltage and regulated current supply, RVT transistors.

Fig. 13 shows tuning curves for values of Vdda around the threshold voltage for the transistors in the circuit. We can see that even for , the tuning curve is still linear and monotonic. The tuning range, here going up to 25 MHz, is still very interesting and potentially useful for extreme low-power applications (total power dissipation in this regime was under 10 ).

Measuring the tuning curve at , however, reveals significant non-linearity in the DCO response. At this low voltage, process variation and the exponential nature of the inverter output current in the subthreshold regime make some of the inverters in the array contribute very little to the output frequency.

Fig. 14 shows the tuning curves of the DCO where the power supply is configured as a regulated current source. The goal is to reduce the small signal gain of the DCO by creating a smaller frequency step when the input to the DCO changes by one LSB. Because the current available to charge the output capacitor of the inverters in the ring oscillator does not change, we get only a limited frequency gain by turning on one more tri-state inverter in the DCO. The loss of gain in the DCO can be compensated with larger constants (and therefore larger gain) in the loop filter, while, at the same time, a lower quantization noise in the DCO is obtained. The additional benefit of this scheme is that the current source acts as a power supply regulator, improving power supply noise rejection and hence reducing supply noise-induced PLL jitter. The DCO gain compression is shown in Fig. 15, where the DCO gain for is compared to the DCO gain for mA. At a 4 GHz output frequency, the current-
TIERNO et al.: A WIDE POWER SUPPLY RANGE, WIDE TUNING RANGE, ALL STATIC CMOS ALL DIGITAL PLL IN 65 nm SOI 49

TABLE I
JITTER HISTOGRAM PARAMETERS FOR VARIOUS OPERATING POINTS

Fig. 16. Measured closed loop phase noise plot for the ADPLL, 4 GHz output,
Integral = 0 0625 proportional = 0 5
500 MHz reference, constants: : , : ,
di erential = 0 4375 Vdda = 1 2 V Vdd = 1 2 V
: ; : , : , 100 C.

Fig. 18. Period histogram 4 GHz output, 0.5 GHz ref., 1.2 V supply, 100 C,
2nd order61 .

Applying the third order sigma delta does not seem to have
much of an impact on the performance of the system, either
from a phase noise or total jitter point of view.

C. Period Jitter
Period jitter is defined as the variations of the period of the
oscillator with respect to a fixed, nominal period. This concept
is useful to consider in particular to determine the usable part of
the clock period in a digital circuit. It allows the determination
Fig. 17. Simulated phase noise plot for the locked ADPLL, 4 GHz output, of the minimum value that a clock period can be expected to
500 MHz reference.
have, and this value is used in performing timing analysis of the
digital circuit.
The period jitter of the ADPLL was measured using a real
source supply DCO has gain (and quantization noise amplitude) time, 12 GHz bandwidth, 40 GSample/s oscilloscope. This os-
half that of the voltage-source supply DCO. cilloscope directly measures the clock period on a cycle by cycle
basis, and plots a histogram of the period values. The scope
B. Phase Noise
also provides other parameters, like minimum and maximum
Fig. 16 shows the measured phase noise for the closed loop measured period, and the standard deviation of the period, also
ADPLL. The 0th order plot corresponds to the phase noise with called the RMS value of the period jitter. This section presents
the sigma delta modulator turned off. We measure 101 dBc/Hz period jitter measurements for three different oprating condi-
at a 1 MHz offset from a 4 GHz center frequency. A 7 dBc/Hz tions; the results are summarized in Table I.
peak can be observed at 7 MHz offset. This peak corresponds to Fig. 18 shows the period histogram of the ADPLL operating
the noise generated by a low frequency limit cycle in the opera- at a 1.2 V power supply, 100 C, 4 GHz output frequency,
tion of the ADPLL, in part arising from the use of a bang-bang 500 MHz reference frequency, with the second order sigma
phase detector. The existence of a limit cycle, and its effect on delta enabled. This operating point corresponds to what might
the phase noise plot, has been confirmed in simulation, as shown be required for high performance applications such as, for
in Fig. 17. When we turn on the sigma delta modulator, the limit example, high end microprocessors. In this case, the measured
cycle is moved, in attenuated form, to higher frequencies. One RMS period jitter of 0.7 ps is at the limit of what can be
reason why this phase noise peak cannot be reduced further is discriminated by our equipment and test procedure.
the limited accuracy of the sigma delta (five bits). A higher res- Fig. 19 shows the period histogram of the ADPLL oper-
olution sigma delta may help to reduce the peak by creating a ating at 0.5 V power supply, 100 C, 1 GHz output frequency,
richer set of possible states in the PLL. 125 MHz reference frequency, with no sigma delta enabled.
First and second order sigma delta modulators reduce This operating point corresponds to performance that would
the overall ADPLL phase noise, to 107 dBc/Hz and be appropriate for some typical ASIC applications; the period
112 dBc/Hz, respectively, at a 1 MHz offset from 4 GHz. jitter is 3 ps.
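The histogram statistics described in this section (minimum and maximum period, and the standard deviation reported as RMS period jitter) can be illustrated with a short Python sketch. The sample periods below are made-up values around the 250 ps nominal period of a 4 GHz clock, not measured data, and `period_jitter_stats` is a hypothetical helper name. The sketch also assumes the population form of the standard deviation, which may differ from what a given oscilloscope computes.

```python
import math


def period_jitter_stats(periods_ps):
    """Return (mean, rms_jitter, min, max) for clock periods given in ps.

    RMS period jitter is taken here as the population standard deviation
    of the cycle-by-cycle period measurements (an assumption).
    """
    if not periods_ps:
        raise ValueError("need at least one period sample")
    n = len(periods_ps)
    mean = sum(periods_ps) / n
    variance = sum((p - mean) ** 2 for p in periods_ps) / n
    return mean, math.sqrt(variance), min(periods_ps), max(periods_ps)


# Illustrative (not measured) periods around the 250 ps nominal of a 4 GHz clock.
samples = [249.0, 250.0, 251.0, 250.0, 250.0, 249.5, 250.5, 250.0]
mean, rms, p_min, p_max = period_jitter_stats(samples)
print(f"mean={mean:.2f} ps  rms jitter={rms:.3f} ps  min={p_min} ps  max={p_max} ps")
```

For timing analysis, the minimum period (`p_min` here) is the quantity that bounds the usable part of the clock cycle.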
Fig. 19. Period histogram 1 GHz output, 125 MHz reference, 0.5 V supply, 100 °C, no ΣΔ.

Fig. 20. Period histogram 500 MHz output, 125 MHz reference, 0.4 V supply, first order ΣΔ.

Fig. 21. N-cycle jitter accumulation, 1.2 V power supply, 100 °C, 4 GHz output frequency for 125, 250, and 500 MHz reference clock, 2nd order ΣΔ.

TABLE II
POWER DISSIPATION NUMBERS FOR THE ADPLL WITH RVT DEVICES, 25 °C

Fig. 20 shows the period histogram for the RVT device-based ADPLL operating at 0.4 V power supply, 25 °C, 500 MHz output frequency, 125 MHz reference frequency, with the first order sigma delta enabled. In this case the ADPLL is still locked to the input reference, but the bimodal distribution of the period jitter may indicate that not all of the digital control circuitry is operating properly.

N-cycle jitter is defined as the variation of the difference between the duration of N consecutive cycles and that of the following N consecutive cycles. For N = 1, N-cycle jitter becomes cycle to cycle jitter. N-cycle jitter was measured for the closed loop ADPLL; results are presented in Fig. 21 for various operating conditions. For small values of N, variation is occurring too quickly to be compensated by the closed-loop behavior of the ADPLL and thus its N-cycle jitter characteristics are similar to those of an unlocked DCO. For larger values of N, as the loop has enough time to start correcting for phase wander, the ADPLL starts tracking the long term behavior of the reference, which is a high quality source. The point at which the N-cycle curves start flattening roughly corresponds to the latency, in output cycles, of the PFD, loop filter, and DCO; this latency is, of course, dependent on the clock division ratio.

D. Duty Cycle Correction
Duty cycle correction buffers were tested on an RVT sample at 100 °C, , 4 GHz output frequency. The measured output duty cycle with no correction applied was 50.1%. By stepping through the 16 settings of the duty cycle correction circuit, we were able to generate output duty cycles ranging from 42% to 61%. Ultimately, this circuit can be used as part of a closed-loop duty cycle correction scheme (not implemented as part of this design).

E. Power Dissipation
Table II summarizes the power dissipation measurements for the ADPLL with RVT devices at 25 °C. At a 4 GHz, high performance operating point, the digital control power dissipation is 6 mW, and the DCO power dissipation is 11.2 mW, for a total of 17.2 mW.

For an ASIC application, where the speed requirements are much lower, we can run the ADPLL at a 0.5 V power supply, which could be derived from a standard 1.0 V input via a regulator (thus avoiding the need for an extra chip-level power supply input). At 1 GHz operation in this configuration, the digital control power dissipation is 0.75 mW and the DCO power dissipation is 0.9 mW, for a total of 1.65 mW. The plain CMOS nature of the ADPLL allows us to effectively trade off power dissipation for performance, a characteristic difficult to achieve in PLL implementations with more analog content.

V. CONCLUSION
We describe an all digital PLL built exclusively with digital static CMOS gates, either from a standard cell library, or custom designed (PFD, DCO core, clock pre-scaler). The architecture of the ADPLL has several novel features: a self-timed, bang-bang phase and frequency detector that uses completion signals and does not depend on timing for its operation; a loop filter that operates exclusively on the fractional part of the frequency control
word, thus reducing the size and power of the filter; and a DCO implemented with an array of tri-state inverters that works reliably over a wide range of process/voltage/temperature variation. The combination of a self-timed phase detector and the architecture of the DCO enabled the ADPLL to operate correctly over a wide range of supply voltages.

The ADPLL occupies an area of 200 µm by 150 µm in a 65 nm SOI CMOS process. This compact realization of the PLL function is a direct result of the architectural choices embodied in the design, as well as the use of static CMOS throughout. The all-digital architecture of the PLL guarantees that the design’s area will naturally scale with technology.

Two implementations of the ADPLL were realized, one with HVT devices and one with RVT devices, where one was obtained from the other by blind substitution of the mask geometry that selects the device type. These designs were tested and demonstrated fully operational; the success of both of these versions suggests that the architecture is also very resilient to global process variations.

The ADPLL was fully functional at supply voltages ranging from 0.5 V to 1.3 V, with tuning ranges of 90 MHz to 1.25 GHz and 500 MHz to 8 GHz, respectively. The wide range of usable supply voltages and frequencies enables automatic tracking of process, voltage, and temperature variation and also enables support of applications requiring dynamic voltage and frequency scaling.

The measured phase noise was in line with simulation results, with a typical value for the RVT design being −112 dBc/Hz @ 4 GHz output frequency, 1 MHz offset. Period jitter for this design, the metric most directly relevant to our intended application, was measured to be 0.7 ps RMS for a 4 GHz output frequency. Power consumption was measured to be 17.2 mW from a 0.9 V supply at a 4 GHz output frequency.

ACKNOWLEDGMENT
The authors would like to thank G. English for physical design, M. Meghelli and S. Rylov for technical insights and fruitful discussions, and D. Kuchta, P. Muench, G. Smith, R. Dussault, S. Gowda, M. Soyuer, and M. Oprysko for support at various stages of this work.

REFERENCES
[1] R. B. Staszewski, D. Leipold, K. Muhammad, and P. T. Balsara, “Digitally controlled oscillator (DCO)-based architecture for RF frequency synthesis in a deep-sub-micrometer CMOS process,” IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 50, no. 11, pp. 815–828, Nov. 2003.
[2] R. B. Staszewski, J. L. Wallberg, and S. Rezeq et al., “All-digital PLL and transmitter for mobile phones,” IEEE J. Solid-State Circuits, vol. 40, no. 12, pp. 2469–2482, Dec. 2005.
[3] J. Dunning, G. Garcia, J. Lundberg, and E. Nuckolls, “An all-digital phase-locked loop with 50-cycle lock time suitable for high-performance microprocessors,” IEEE J. Solid-State Circuits, vol. 30, no. 4, pp. 412–422, Apr. 1995.
[4] P. Hanumolu, M. Kim, G. Wei, and U. Moon, “A 1.6 Gbps digital clock and data recovery circuit,” in Proc. IEEE Custom Integrated Circuits Conf. (CICC), Sep. 2006, pp. 603–606.
[5] V. Kratyuk, P. Hanumolu, K. Ok, K. Mayaram, and U. Moon, “A digital PLL with a stochastic time-to-digital converter,” in Proc. Symp. VLSI Circuits, Jun. 2006, pp. 38–39.
[6] D.-H. Oh, D.-S. Kim, S. Kim, D.-K. Jeong, and W. Kim, “A 2.8 Gb/s all-digital CDR with a 10 b monotonic DCO,” in IEEE ISSCC Dig. Tech. Papers, Feb. 2007, pp. 222–223.
[7] N. Da Dalt, C. Kropf, M. Burian, T. Hartig, and H. Elu, “A 10 b 10 GHz digitally controlled LC oscillator in 65 nm CMOS,” in IEEE ISSCC Dig. Tech. Papers, Feb. 2006, pp. 188–189.
[8] A. Hajimiri, S. Limotyrakis, and T. H. Lee, “Jitter and phase noise in ring oscillators,” IEEE J. Solid-State Circuits, vol. 34, no. 6, pp. 790–804, Jun. 1999.
[9] B. Miller and R. J. Conley, “A multiple modulator fractional divider,” IEEE Trans. Instrum. Meas., vol. 40, no. 3, pp. 578–583, Jun. 1991.
[10] A. J. Martin, Programming in VLSI: From Communicating Processes to Delay-Insensitive Circuits. Reading, MA: Addison-Wesley, 1991.
[11] J. Sonntag and J. Stonick, “A digital clock and data recovery architecture for multi-gigabit/s binary links,” IEEE J. Solid-State Circuits, vol. 41, no. 8, pp. 1867–1875, Aug. 2006.
[12] J. Lee, K. S. Kundert, and B. Razavi, “Analysis and modeling of bang-bang clock and data recovery circuits,” IEEE J. Solid-State Circuits, vol. 39, no. 9, pp. 1571–1580, Sep. 2004.
[13] D. E. Muller and W. S. Bartky, “A theory of asynchronous circuits,” in Proc. Int. Symp. Theory of Switching, Apr. 1959, pp. 204–243, Harvard Univ. Press.
[14] R. Staszewski and P. Balsara, “All-digital PLL with ultra fast settling,” IEEE Trans. Circuits Syst. II, Express Briefs, vol. 54, no. 2, pp. 181–185, Feb. 2007.

José A. Tierno received the Engineering degree from the Universidad de la República, Montevideo, Uruguay, in 1988. He received the M.S. degree in electrical engineering in 1989 and the Ph.D. degree in computer science in 1995, both from the California Institute of Technology, Pasadena.
Since 1995, he has been working at the IBM T. J. Watson Research Center, Yorktown Heights, NY, in the area of digital circuits for communications. His main areas of interest are self-timed digital circuits, and digital replacement of analog circuits.

Alexander V. Rylyakov (M’07) received the M.S. degree in physics from the Moscow Institute of Physics and Technology, Moscow, Russia, in 1989, and the Ph.D. degree in physics from the State University of New York at Stony Brook in 1997.
From 1994 to 1999, he worked in the Department of Physics at SUNY Stony Brook on the design and testing of integrated circuits based on Josephson junctions. In 1999, he joined IBM T. J. Watson Research Center, Yorktown Heights, NY, as a Research Staff Member. His main current research interests are in the areas of digital phase-locked loops and integrated circuits for wireline and optical communication.

Daniel J. Friedman (S’91–M’92) received the Ph.D. degree in engineering science from Harvard University, Cambridge, MA, in 1992.
After completing consulting work at MIT Lincoln Labs and postdoctoral work at Harvard in image sensor design, he joined the IBM Thomas J. Watson Research Center, Yorktown Heights, NY, in 1994. His initial work at IBM was the design of analog circuits and air interface protocols for field-powered RFID tags. In 1999, he turned his focus to analog circuit design for high-speed serial data communication. Since June 2000, he has managed a team of circuit designers working on serial data communication, wireless, and PLL applications. In addition to circuits papers regarding serial links, he has published articles on imagers and RFID, and he holds more than 20 patents. His current research interests include high-speed I/O design, PLL design, and circuit/system approaches for variability compensation.
