A Low-Power Digital Frequency Divider for System-on-a-Chip Applications

The document describes a proposed low-power digital frequency divider architecture for system-on-a-chip applications. The divider uses a coarse-fine architecture: a coarse block operates at a lower frequency to reduce power consumption, and a fine block operates at the input frequency only during the time slots of output transition. This gives lower power than a conventional synchronous divider while keeping its advantages, such as a wide division range and ease of programmability. The architecture was implemented on a CPLD to validate operation, and measurements show a power reduction of more than 40% compared to a synchronous divider.


A Low-Power Digital Frequency Divider for System-on-a-Chip Applications

Hesham Omran (1, 2), Khaled Sharaf (2), and Magdy Ibrahim (2)

(1) Electrical Engineering Program
King Abdullah University of Science and Technology (KAUST)
Thuwal, Kingdom of Saudi Arabia 23955-6900
Email: hesham.oran@kaust.edu.sa
(2) Electronics and Communications Engineering Department,
Faculty of Engineering, Ain Shams University, Cairo, Egypt
Abstract: In this paper, an idea for a new frequency divider architecture is proposed. The divider is based on a coarse-fine architecture. The coarse block operates at a low frequency to save power, and it selectively enables the fine block, which operates at the high input frequency. The proposed divider has the advantages of a synchronous divider, but with lower power consumption and higher operation speed. The design can achieve a wide division range with a minor effect on power consumption and speed. The architecture was implemented on a complex programmable logic device (CPLD) to verify its operation. Experimental measurements validate system operation with power reduction greater than 40%.
I. INTRODUCTION
The continuous down-scaling of minimum feature size in IC technology has allowed complex systems to be integrated on a single chip. As depicted in Fig. 1, modern system-on-a-chip (SoC) platforms integrate both analog and digital blocks, in addition to real-time functions such as audio and video. Such applications require a clock generator to produce several unrelated frequencies for digital blocks as well as sampled analog circuits. A common solution is to design a phase-locked loop (PLL) clock generator running at a high frequency that can then be divided down to obtain all desired frequencies [1], [2].
Several divider architectures were reported in the literature, e.g., pulse-swallow dividers, cascaded dual-modulus dividers, and phase-rotating dividers [3]-[8]. These architectures are used for very high input frequencies and are designed using a full-custom design flow. For moderate input frequencies (up to a few hundred MHz), digital designers resort to simple synchronous dividers, due to their simple implementation, wide and continuous range of divider ratios, and ease of programmability. More importantly, they do not suffer from delay and jitter accumulation, are not prone to timing hazards, and are well suited for a semi-custom digital IC design flow [9], [10]. On the other hand, synchronous dividers have large power consumption and limited operation speed [11]. Several attempts have been made to improve the performance of synchronous counters [12], [13]. However, these architectures rely on full-custom transistor-level design and cannot be synthesized from higher levels of abstraction.
[Figure 1: SoC blocks shown: CPU core, DSP core, configurable logic, video processor, audio processor, USB, I2C, SPI, UART, RF connectivity (Bluetooth, Wi-Fi, etc.), ADCs and DACs, compression and encryption engine.]
Figure 1. Block diagram of a hypothetical complex SoC.
In this paper, a new programmable frequency divider architecture is proposed. The proposed architecture is fully synthesizable and has all the advantages of synchronous dividers. In addition, it provides lower power consumption and higher operation speed. The division range of the proposed architecture can be extended with a minor effect on power consumption and speed.
II. BASIC IDEA
The basic idea of the proposed divider is illustrated in Fig. 2. For a conventional divider, the input stage is always running at the high-frequency input reference ($f_{clk}$) to produce a much lower frequency at the output ($f_{out} = f_{clk}/M$), where M is the divider ratio. The idea is to operate the divider at a lower frequency ($f_{coarse} = f_{clk}/2^D$), where D is a parameter that determines the ratio between $f_{clk}$ and $f_{coarse}$. This lower frequency ($f_{coarse}$) divides the time scale into coarse time slots. The high-frequency clock ($f_{clk}$) is enabled only during the time slots of output transition to generate the output frequency ($f_{out}$).
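As a quick numerical illustration of these two relations, the following minimal Python sketch evaluates $f_{coarse}$ and $f_{out}$ for the Fig. 2 settings (D = 2, M = 21). The 100 MHz input clock is borrowed from the simulation settings quoted later in the paper, and the function name is hypothetical.

```python
def coarse_fine_frequencies(f_clk_hz: float, d_bits: int, m: int) -> tuple[float, float]:
    """Return (f_coarse, f_out) for an input clock f_clk, fine width D and ratio M."""
    f_coarse = f_clk_hz / (2 ** d_bits)  # rate of the coarse time slots
    f_out = f_clk_hz / m                 # divided output frequency
    return f_coarse, f_out


f_coarse, f_out = coarse_fine_frequencies(100e6, d_bits=2, m=21)
print(f"f_coarse = {f_coarse / 1e6:.2f} MHz, f_out = {f_out / 1e6:.4f} MHz")
```

With these values the coarse block runs at a quarter of the input rate, while the output period still has a resolution of one $f_{clk}$ cycle.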
978-1-61284-857-0/11/$26.00 ©2011 IEEE
Figure 2. Simple timing diagram illustrating the basic idea of the proposed divider. The diagram is drawn for D = 2 and M = 21.
[Figure 3: coarse block (N-bit simplified modulo-M accumulator) with inputs M and INC, followed by the fine block (D-bit reloadable counter).]
Figure 3. Block diagram of the proposed divider.
III. SYSTEM DESCRIPTION
The block diagram of the proposed architecture is shown in Fig. 3. The coarse block is implemented as an N-bit accumulator with programmable modulus (M). The block diagram of the accumulator is shown in Fig. 4. The increment of the accumulator (INC) is fixed and equal to $2^D$, which simplifies the hardware implementation, i.e., the highlighted adder is implemented as an (N - D)-bit increment-by-one adder instead of an N-bit full adder. The average frequency of the accumulator carry bit is given by:
$$f_{carry} = \frac{INC}{M} \times f_{coarse} = \frac{2^D}{M} \times \frac{f_{clk}}{2^D} = \frac{f_{clk}}{M} \qquad (1)$$
which is equal to the output frequency ($f_{out}$). The carry bit is used to define the time slots of output transition.
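The carry-rate relation in (1) can be checked with a small behavioral model. The sketch below is only an illustration in Python, not the authors' VHDL. It assumes the accumulator wraps by subtracting M whenever it reaches or exceeds M, and it performs the addition of $2^D$ as an increment of the upper N - D bits, which is the simplification described above. The function and variable names are hypothetical.

```python
def count_carries(m: int, d_bits: int, n_bits: int, coarse_cycles: int) -> int:
    """Count carry pulses of a modulo-M accumulator whose increment is 2**D."""
    if not (2 ** d_bits <= m <= 2 ** n_bits - 2 ** d_bits):
        raise ValueError("M is outside the range this simple model supports")
    low_mask = (1 << d_bits) - 1
    acc = 0
    carries = 0
    for _ in range(coarse_cycles):
        # Adding 2**D leaves the low D bits untouched and increments
        # the upper N - D bits by one.
        acc = (((acc >> d_bits) + 1) << d_bits) | (acc & low_mask)
        if acc >= m:  # accumulator crossed the modulus: wrap and emit a carry
            acc -= m
            carries += 1
    return carries


# Over M coarse cycles the accumulator wraps exactly 2**D times,
# so the average carry rate is (2**D / M) * f_coarse = f_clk / M.
print(count_carries(m=11, d_bits=2, n_bits=4, coarse_cycles=11))
```

Because the accumulator value never reaches $2^N$ in this model as long as M stays at or below $2^N - 2^D$, the N-bit register does not overflow before the comparison with M.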
Figure 4. Block diagram of the simplified modulo-M accumulator.
Figure 5. Timing of the modulo-M accumulator at the instant of overflow.
The fine block is implemented as a D-bit reloadable down counter, where D is chosen to be much smaller than N. The select word (SEL) determines the number of $f_{clk}$ pulses to count before triggering the output. The calculation of SEL can be explained with the aid of Fig. 5. The correct output transition should occur at the instant when the virtual extrapolated accumulator output crosses the modulus (M). Thus, the carry bit should be advanced by time ($t_a$), or equivalently delayed by time ($t_d$), noting that a fixed delay added to all output edges does not affect the output characteristics. From Fig. 5, $t_d$ is given by:
$$t_d = T_{coarse} - \frac{R}{INC} \times T_{coarse} = (2^D - R) \times T_{clk} = SEL \times T_{clk} \qquad (2)$$
where R is the residue in the accumulator after overflow, $T_{coarse} = 1/f_{coarse}$, and $T_{clk} = 1/f_{clk}$. It should be noted that as INC = $2^D$, R and SEL are always within the D-bit limit.
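The delay relation can be illustrated with a behavioral sketch in which time is counted in $f_{clk}$ periods. It assumes SEL = $2^D$ - R, as follows from $T_{coarse} = 2^D \times T_{clk}$ and INC = $2^D$, and it assumes the fine counter starts counting at the coarse edge where the carry is detected. The load-then-decrement pipelining of the real design is ignored, and all names are hypothetical.

```python
def output_edges(m: int, d_bits: int, coarse_cycles: int) -> list[int]:
    """Output edge times, in f_clk periods, using SEL = 2**D - R."""
    inc = 2 ** d_bits
    acc = 0
    edges = []
    for k in range(1, coarse_cycles + 1):  # k-th coarse edge is at time k * 2**D
        acc += inc
        if acc >= m:
            acc -= m
            residue = acc              # R: how far past M the accumulator went
            sel = inc - residue        # f_clk pulses to count before the edge
            edges.append(k * inc + sel)
    return edges


edges = output_edges(m=11, d_bits=2, coarse_cycles=12)
print(edges)
print([b - a for a, b in zip(edges, edges[1:])])  # spacing between output edges
```

In this model every output edge lands at a multiple of M plus a fixed offset of $2^D$ clock periods, so the spacing is exactly M even when M is not a multiple of $2^D$.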
For the correct operation of the divider, M should be constrained within the range:

$$2^{D+1} \le M \le 2^N - 2^D \qquad (3)$$

where the fine counter is loaded with an initial value in one cycle of $f_{coarse}$ and then decremented in the next cycle. If the fine counter is loaded and decremented in the same cycle of $f_{coarse}$, the minimum divider ratio can be as low as $2^D$ [14]. It is clear that the divider ratio cannot be less than $2^D$ because the divider cannot generate two output pulses within a single cycle of $f_{coarse}$. At the high end, if M is greater than $2^N - 2^D$, the accumulator will not be able to detect the overflow condition. VHDL timing simulation for $f_{fine}$ = 100 MHz and M = 11 is shown in Fig. 6.

[Figure 6: simulated waveforms for clk_fine, clk_coarse, en, accum_i[3:0], accum_q[3:0], fine_sel[1:0], fine_load, fine_en, count[1:0], flag, flag_q, clk_out, and clk_out_div2 over a time axis from 5,300 ns to 5,600 ns.]
Figure 6. VHDL timing simulation for M = 11. The figure is drawn for N = 4 and D = 2 for the purpose of clarity.
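The limits on M can be tabulated with a short helper. This sketch assumes the lower bound is $2^{D+1}$ (load in one coarse cycle, decrement in the next) and the upper bound is $2^N - 2^D$, which reproduces the CF divider range listed in Table I. The function name is hypothetical.

```python
def division_range(n_bits: int, d_bits: int) -> tuple[int, int]:
    """Return (M_min, M_max) for an N-bit accumulator and a D-bit fine counter."""
    m_min = 2 ** (d_bits + 1)            # two coarse cycles: load, then decrement
    m_max = 2 ** n_bits - 2 ** d_bits    # largest M whose overflow is detectable
    return m_min, m_max


print(division_range(16, 4))  # (32, 65520), the CF divider range in Table I
print(division_range(4, 2))   # (8, 12), so M = 11 is valid for the Fig. 6 settings
```

Widening the accumulator (N) raises the upper limit without touching the fine block, while shrinking D lowers the lower limit.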
Low power consumption is achieved by using clock gating, where $f_{clk}$ is enabled only within the time slots of output transition. In addition, during the time slots of output transition, $f_{clk}$ is used to clock a D-bit counter instead of an N-bit counter. The division range can be increased by increasing the width of the accumulator without changing the fine block. Thus, the range is extended with a minor effect on power consumption and circuit speed. The high-frequency fine block is D-bit only, thus it can achieve a higher speed of operation. Another advantage is that the output frequency comes directly from the high-frequency fine block, which means that there is no accumulation of delay and jitter through subsequent stages. On the other hand, there is an area overhead due to the accumulator.
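To get a rough feel for how much high-frequency activity the clock gating removes, the sketch below estimates an upper bound on the fraction of $f_{clk}$ cycles that reach the fine block. It is a back-of-the-envelope model, not a power estimate: it assumes the gated clock is enabled for at most one coarse slot ($2^D$ cycles) per output period of M cycles. If loading and counting occupy two slots, the bound doubles. The function name is hypothetical.

```python
def fine_clock_duty_bound(m: int, d_bits: int) -> float:
    """Upper bound on the fraction of f_clk cycles that clock the fine block."""
    # Assumption: one enabled coarse slot of 2**D cycles per output period M.
    return min(1.0, 2 ** d_bits / m)


for m in (32, 1024, 65520):
    print(m, f"{fine_clock_duty_bound(m, d_bits=4):.4%}")
```

The bound shrinks as M grows, which is consistent with clock gating being more effective at large divider ratios.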
The proposed architecture introduces a new parameter for the designer to tune in order to control the performance of the divider, which is the fine block width (D). From the perspective of division range, decreasing D is always better as it gives a wider range. But for power consumption, the relation is not straightforward. Decreasing D decreases the fine block power consumption. On the other hand, $f_{coarse}$ will increase, which increases the coarse block power consumption. Simulations indicate that there is an optimum value for D which minimizes the total power consumption. The effect of D on the circuit speed is also two-fold. Decreasing D allows the fine block to be clocked at a higher frequency. But as $f_{coarse}$ increases, the delay of the accumulator may limit the maximum circuit speed.
It should be noted that a simple D-bit binary counter (with no programmability) is required to generate $f_{coarse}$. However, a single counter can be shared between several divider channels, where each channel can have a different division range and operate with any clock frequency from the binary-weighted counter outputs. The binary counter overhead is distributed among several divider channels, which is a typical case in SoC clock generation, such that the contribution per channel is not significant. The initial value of the accumulator can be controlled to provide several clocks with the same frequency but different phases, which is important in many applications. A simple example of multi-channel operation is shown in Fig. 7. Measurements indicate that significant power reduction is achieved even if only two divider channels are used. Alternatively, if $f_{coarse}$ is the input to the system, a simple PLL can be used to generate $f_{fine}$.

[Figure 7: a simple binary counter shared by several coarse-fine divider channels, each producing its own output clock.]
Figure 7. An example of using a multi-channel coarse-fine divider to generate multiple clocks.
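The remark about phase control through the accumulator's initial value can be explored with the same kind of behavioral model. The sketch below is an assumption about how such phase control might behave, not a description of the authors' hardware: two channels share M and D, differ only in the initial accumulator value, and use SEL = $2^D$ - R. The names and the example values (M = 40, initial value 10) are hypothetical.

```python
def edges_with_initial_value(m: int, d_bits: int, initial: int,
                             coarse_cycles: int) -> list[int]:
    """Output edge times (in f_clk periods) for a given initial accumulator value."""
    inc = 2 ** d_bits
    acc = initial
    edges = []
    for k in range(1, coarse_cycles + 1):
        acc += inc
        if acc >= m:
            acc -= m                           # acc now holds the residue R
            edges.append(k * inc + (inc - acc))  # coarse edge plus SEL = 2**D - R
    return edges


ch0 = edges_with_initial_value(m=40, d_bits=2, initial=0, coarse_cycles=40)
ch1 = edges_with_initial_value(m=40, d_bits=2, initial=10, coarse_cycles=40)
print(ch0)
print(ch1)
print([a - b for a, b in zip(ch0, ch1)])  # lead of channel 1, in f_clk periods
```

In this model the channel with the larger initial value leads by that many $f_{clk}$ periods, so an initial value of M/4 corresponds to a quarter-period phase offset at the same output frequency.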
[Figure 8: two plots of measured power consumption for the CF divider and the sync divider, (a) versus M and (b) versus log2(M).]
Figure 8. Measured power consumption vs. (a) M and (b) log2(M).
Table I
Comparison between the proposed coarse-fine (CF) divider and synchronous divider, both implemented on Xilinx XC2C256 CPLD.

| Design | Division Range | Power/f_clk (µW/MHz) | Critical Path Delay (ns) | Macrocells (Area) |
|---|---|---|---|---|
| CF Divider (N = 16, D = 4) | 32 to 65520 | 60 | 7.1 | 42 (23 flip-flops, 147 product terms) |
| Sync Divider (N = 16) | 2 to 65536 | 100 | 11.9 | 21 (17 flip-flops, 38 product terms) |
IV. IMPLEMENTATION AND EXPERIMENTAL RESULTS
To verify the functionality and operation of the design, it was implemented on a Xilinx XC2C256 CPLD. Table I shows a comparison between the proposed coarse-fine (CF) divider and a synchronous divider. The synchronous divider was implemented as an N-bit reloadable down counter. The comparison illustrates the power and speed advantage, as well as the area overhead. Power consumption was measured using a three-channel circuit that continuously measures the current consumed by the CPLD core and I/O banks [15]. The variation of measured power consumption with the divider ratio is shown in Fig. 8. For low divider ratios, clock gating is less significant, thus the reduction in power consumption is not substantial. However, as the divider ratio increases, power reduction increases to more than 40%.
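The headline numbers can be reproduced from the Table I entries with a few lines of arithmetic. The sketch below assumes the power column is in µW/MHz and approximates the maximum clock frequency as the reciprocal of the critical path delay, ignoring any additional timing margins. The table values appear to be rounded, so the result is only indicative.

```python
# Values copied from Table I; units assumed to be uW/MHz and ns.
cf = {"power_uw_per_mhz": 60.0, "delay_ns": 7.1}
sync = {"power_uw_per_mhz": 100.0, "delay_ns": 11.9}

power_reduction = 1.0 - cf["power_uw_per_mhz"] / sync["power_uw_per_mhz"]
f_max_cf_mhz = 1e3 / cf["delay_ns"]      # 1 / (delay in ns) expressed in MHz
f_max_sync_mhz = 1e3 / sync["delay_ns"]

print(f"power reduction: {power_reduction:.0%}")
print(f"f_max: CF {f_max_cf_mhz:.1f} MHz vs sync {f_max_sync_mhz:.1f} MHz")
```

On these figures the CF divider trades roughly twice the macrocell count for lower power per MHz and a shorter critical path.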
V. CONCLUSION
A new coarse-fine frequency divider architecture was proposed. The idea of the system was described and both simulation and implementation results were presented. Implementation results validate predictions from the system description. Compared to a synchronous divider, the proposed divider has lower power consumption and allows higher operation speed. The design is best suited for a multi-channel divider architecture for SoC clock generation applications. The design is fully synthesizable, and can be used as a parameterized drop-in module in ASICs and FPGAs.
ACKNOWLEDGMENT
The authors would like to thank Dr. Amr Hafez of Si-Ware Systems (SWS) and Dr. Khaled Salama of King Abdullah University of Science and Technology (KAUST) for supporting this work.
REFERENCES
[1] A. Fahim, Clock Generators for SOC Processors: Circuits and Architectures. Springer, 2005.
[2] 7 Series FPGAs Clocking Resources User Guide, Xilinx, Inc., March 2011, UG472 (v1.0).
[3] J. Rogers, F. Dai, and C. Plett, Integrated Circuit Design for High-Speed Frequency Synthesis. Artech House, 2006.
[4] C. Vaucher, I. Ferencic, M. Locher, S. Sedvallson, U. Voegeli, and Z. Wang, "A family of low-power truly modular programmable dividers in standard 0.35-µm CMOS technology," IEEE Journal of Solid-State Circuits, vol. 35, no. 7, pp. 1039-1045, July 2000.
[5] A. Lacaita, S. Levantino, and C. Samori, Integrated Frequency Synthesizers for Wireless Systems. Cambridge University Press, 2007.
[6] J. Craninckx and M. Steyaert, "A 1.75-GHz/3-V dual-modulus divide-by-128/129 prescaler in 0.7-µm CMOS," IEEE Journal of Solid-State Circuits, vol. 31, no. 7, pp. 890-897, July 1996.
[7] B. Floyd, "Sub-integer frequency synthesis using phase-rotating frequency dividers," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 55, no. 7, pp. 1823-1833, Aug. 2008.
[8] S. Wang, X. Wu, J. Wu, and M. Zhang, "Low power design of multi-modulus programmable frequency divider," Electronics Letters, vol. 45, no. 20, pp. 1017-1019, 2009.
[9] J. Rabaey, A. Chandrakasan, and B. Nikolic, Digital Integrated Circuits. Prentice Hall, Englewood Cliffs, New Jersey, 2002.
[10] P. Chu, RTL Hardware Design Using VHDL: Coding for Efficiency, Portability, and Scalability. John Wiley & Sons, 2006.
[11] P. Larsson and J. Yuan, "Novel carry propagation in high-speed synchronous counters and dividers," Electronics Letters, vol. 29, no. 16, pp. 1457-1458, 1993.
[12] Y.-W. Kim, J.-S. Kim, J.-H. Oh, Y.-S. Park, J.-W. Kim, K.-I. Park, B.-S. Kong, and Y.-H. Jun, "Low-power CMOS synchronous counter with clock gating embedded into carry propagation," IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 56, no. 8, pp. 649-653, 2009.
[13] J.-R. Yuan, "Efficient CMOS counter circuits," Electronics Letters, vol. 24, no. 21, pp. 1311-1313, Oct. 1988.
[14] H. Omran, K. Sharaf, and M. Ibrahim, "An all-digital direct digital synthesizer fully implemented on FPGA," in Design and Test Workshop (IDT), 2009 4th International, Nov. 2009.
[15] CoolRunner-II Evaluation Board Reference Manual, Xilinx, Inc., May 2008, UG501 (v1.0).
