
Time-interleaved ∆Σ-DAC

for Broadband Wireless Applications

by

Jennifer Pham

A thesis submitted in conformity with the requirements


for the degree of Master of Applied Science
Graduate Department of Electrical and Computer Engineering
University of Toronto

Copyright © by Jennifer Pham, 2007
Time-interleaved ∆Σ-DAC
for Broadband Wireless Applications

Jennifer Pham
Master of Applied Science, 2007
Graduate Department of Electrical and Computer Engineering
University of Toronto

Abstract
The analysis and design of a time-interleaved delta-sigma digital-to-analog converter
(TIM ∆Σ-DAC) is presented. The digital front-end of the TIM ∆Σ-DAC comprises a 95th-order
time-interleaved-by-8 FIR interpolation filter and a 3rd-order time-interleaved-by-8 ∆Σ
modulator. The time-interleaved architecture uses parallelism to support a low OSR of 8,
which results in a large effective bandwidth for broadband applications. The 4-bit output of
the ∆Σ modulator is converted into analog using 16 current-steering cells with continuous
current calibration. The chip was fabricated in 90nm CMOS. It was designed to operate
at 4GS/s with a bandwidth of 250MHz. The analog back-end was tested with modulated
data from a simulation of the digital front-end. It was measured at 2.66GS/s and achieved
a bandwidth of 166MHz, an SNR of 46dB and an SFDR of 56dB. At 2GS/s, the prototype
consumed 102mW from a 1V supply.

Acknowledgments
Throughout the course of my thesis work, I encountered numerous obstacles, and each time
someone came along with ingenuity, inspiration, and encouragement.
There are so many people I would like to thank.

First of all, I am truly grateful to my supervisor, Tony Chan Carusone, who has given me
continuous support and insight throughout this work. He gave me the freedom of research
and motivated me to explore the field where I was a complete stranger. He has been like a
friend who is always there to help and to listen.

I would also like to thank my colleagues for lending a hand whenever I got caught in
the midst of confusion. Without their support, I would have had much trouble completing this
work. In particular, I would like to express my gratitude to Kentaro Yamamoto and Joseph
Aziz who assisted me in the CAD design and experimental testing; Tyler Brandon from
University of Alberta who patiently fixed countless DRC problems and guided me through
the maze of 90nm CMOS Place & Route; Ahmad Darabiha, Ian Kuon, and Zdravko Lukic
who supported me at different phases of the digital design flow; Keith Tang who provided me
with custom RF pads; Marcus van Ierssel, Oleksiy Tyshchenko, and Cintia Man who helped
me with the PCB design and digital test setup; and Jaro Pristupa who saved me in many
CAD tool panics. I would also like to thank my peers in BA5000 for the endless laughter
and priceless memories.

To my family, who never stopped encouraging and believing.

To Darren, who never stopped loving and caring.

Table of Contents

List of Figures

List of Tables

List of Acronyms

1 Introduction
1.1 Motivations
1.1.1 Oversampled ∆Σ vs. Nyquist-rate DAC
1.1.2 Lowpass vs. Bandpass ∆Σ-DAC
1.1.3 Time-interleaved vs. Conventional ∆Σ-DAC
1.1.4 Summary
1.2 State-of-the-Art
1.3 Thesis Outline

2 Theoretical Background
2.1 System Architecture for ∆Σ-DAC
2.2 ∆Σ Modulator Architectures
2.2.1 Error-Feedback ∆Σ Modulator Architecture
2.2.2 Error-Feedback ∆Σ Modulator Stability Analysis
2.3 Time-interleaved ∆Σ Modulator
2.3.1 Polyphase Decomposition
2.3.2 Block Digital Filtering
2.3.3 Time-interleaved Error-Feedback ∆Σ Modulator
2.3.4 Practical Considerations
2.4 Time-interleaved Interpolation Filter
2.5 Summary

3 Time-interleaved ∆Σ-DAC Design
3.1 Architecture Overview
3.2 Time-interleaved Interpolation Filter
3.3 Time-interleaved ∆Σ Modulator
3.3.1 DSM Architecture
3.3.2 NTF Optimization
3.3.3 Time-interleaved DSM
3.3.4 TIM-DSM Performance
3.4 Digital-to-Analog Converter Model
3.5 Analog Reconstruction Filter
3.6 Summary

4 Time-interleaved ∆Σ-DAC Implementation
4.1 Digital Baseband Front-End
4.1.1 Hardware Optimization
4.1.2 Accuracy Optimization
4.1.3 Speed Optimization
4.1.4 Time-interleaved Interpolation Filter
4.1.5 Time-interleaved ∆Σ Modulator
4.1.6 Digital Integrated Circuits Design Flow
4.1.7 Digital Front-end Simulation Results
4.2 High-Speed Digital Interface
4.2.1 Multiplexer
4.2.2 Binary-to-Thermometer Converter and Switch Drivers
4.2.3 High-Speed Digital Interface Simulation Results
4.3 High Speed Analog Back-End
4.3.1 Current Calibration Circuitry
4.3.2 Current-Steering Digital-to-Analog Converter
4.3.3 Analog Back-end Simulation Results
4.4 TIM ∆Σ-DAC Integration

5 Time-interleaved ∆Σ-DAC Performance
5.1 PCB Design and Test Setup
5.2 Digital Design Issues and Solutions
5.3 High Speed Analog Measurements
5.3.1 Initial Verifications
5.3.2 Accuracy Measurements
5.3.3 Linearity Measurements
5.3.4 Power Consumption
5.3.5 Performance Summary

6 Conclusions
6.1 Future Work

A Conventional ∆Σ Modulator

B TIM ∆Σ-DAC Matlab Results
B.1 Analog Reconstruction Filter
B.2 TIM-IF-DSM Output Spectrum with DAC Mismatches

C TIM ∆Σ-DAC Implementation
C.1 TIM-IF Coefficients
C.2 TIM-IF Sum Trees
C.3 TIM-IF and TIM-DSM Timing Synthesis
C.4 Binary-to-Thermometer Converter and Switch Drivers
C.5 Current Calibration Principles

References

List of Figures

1.1 Block diagram of a 60GHz radio
1.2 Parallel ∆Σ modulator based on Frequency Division Multiplexing (FDM)
1.3 Parallel ∆Σ modulator based on Code Division Multiplexing (CDM)
1.4 Parallel ∆Σ modulator based on Time Division Multiplexing (TDM)
1.5 TIM ∆Σ modulator based on digital block filtering

2.1 ∆Σ-DAC block diagram
2.2 Spectrum at each internal node in ∆Σ-DAC [1]
2.3 Linear model of a single-bit error-feedback ∆Σ modulator
2.4 Single-bit error-feedback ∆Σ modulator with digital limiter
2.5 Bit-wise analysis of error-feedback ∆Σ modulator
2.6 Stable error-feedback ∆Σ modulator
2.7 (a) Scalar transfer function, (b) Time-interleaved-by-M version [2]
2.8 Linear model of time-interleaved error-feedback ∆Σ modulator
2.9 First-order time-interleaved-by-two ∆Σ modulator
2.10 Conventional interpolation filter
2.11 Time-interleaved interpolation filter and time-interleaved ∆Σ modulator

3.1 Time-interleaved-by-8 ∆Σ-DAC block diagram
3.2 A 95th-order FIR interpolation filter with and without coefficient quantization
3.3 A 95th-order FIR interpolation filter block diagram
3.4 ∆Σ modulator noise transfer function optimization
3.5 Conventional 3rd-order error-feedback ∆Σ modulator architecture
3.6 Time-interleaved-by-8 3rd-order error-feedback ∆Σ modulator
3.7 TIM-IF-DSM versus conventional DSM response (Matlab simulations)
3.8 TIM-IF-DSM performance for a single tone at 0.25fB for non-optimized, optimized and quantized optimized NTF (Matlab simulations)
3.9 TIM-IF-DSM output spectrum for Matlab simulations with 0dBFS input amplitude at different frequencies: a) 0.13fB b) 0.25fB c) 0.50fB d) 0.93fB
3.10 TIM-IF-DSM response (Matlab simulations)
3.11 Time-interleaved-by-8 ∆Σ-DAC architecture

4.1 a) Conventional ∆Σ-DAC b) Time-interleaved-by-8 ∆Σ-DAC
4.2 Error reduction rounding scheme
4.3 Example of an 8-bit CSA with 1-1-1-2-3 staging
4.4 TIM-IF Physical Implementation
4.5 TIM-IF sum tree for path 2 and 8
4.6 TIM-DSM sum tree
4.7 Digital design flow
4.8 TIM-IF-DSM output spectrum for VHDL behavioural simulations with 0dBFS input amplitude at different frequencies: a) 0.13fB b) 0.25fB c) 0.50fB d) 0.93fB
4.9 TIM-IF-DSM VHDL Behavioural vs. Matlab Response
4.10 An 8-to-1 ring multiplexer
4.11 Timing diagram for an 8-to-1 ring multiplexer
4.12 An 8-to-1 ring multiplexer transient response (TT corner) a) 4GHz b) 2GHz
4.13 High-speed digital interface theoretical response
4.14 High-speed digital interface Cadence transient response (TT corner)
4.15 Current calibration implementation
4.16 Continuous current calibration system for 4-bit DAC
4.17 a) Bias current mirror b) Dummy calibration cell schematic
4.18 Current-steering cell with self-calibration circuitry
4.19 a) Output swing b) Output noise model c) Simplified output noise model
4.20 Current-steering DAC output load options
4.21 Active vs. Passive load output
4.22 DNL offset Monte Carlo analysis
4.23 TIM ∆Σ-DAC performance with and without current calibration (for active load, typical corner with transistor mismatch)
4.24 TIM ∆Σ-DAC's SNR/SNDR vs. Input amplitude for a single-tone input at 0.25fB (TT corner)
4.25 TIM ∆Σ-DAC's SNR/SNDR vs. Input frequency for a single-tone amplitude of 0dBFS (TT corner)
4.26 a) Divide-by-8 clock divider b) I/O Driver
4.27 TIM ∆Σ-DAC floor planning
4.28 TIM ∆Σ-DAC final layout

5.1 Die photos of the TIM ∆Σ-DAC chip fabricated in 90nm CMOS
5.2 TIM ∆Σ-DAC prototype a) Packaging and b) Testboard
5.3 Analog back-end test flow
5.4 Full test setup for Agilent 93K SOC or Agilent ParBert platform
5.5 Experimental setup for analog back-end
5.6 Current-steering DAC stair case transient response
5.7 Output spectrum with calibration feed-through
5.8 Clock Divider Transient Response
5.9 CS-DAC transient response for a single-tone, 0dBFS input amplitude (top - single ended outputs; bottom - differential output)
5.10 Noise shape and inband spectra for a single-tone, 0dBFS input amplitude at 0.13fB and 0.29fB
5.11 CS-DAC accuracy performance with single-tone input and passive load
5.12 Two-tone spectrum and SFDR measurements
5.13 Multi-tone Test

A.1 Linear model of first-order ∆Σ modulator

B.1 A 7th-order elliptic analog filter response
B.2 TIM ∆Σ-DAC output spectrum with analog LPF for Matlab simulations with 0dBFS input amplitude at different input frequencies: a) 0.13fB b) 0.25fB c) 0.50fB d) 0.93fB
B.3 TIM ∆Σ-DAC response with an ideal vs. analog filter
B.4 TIM-IF-DSM output spectrum with thermometer DAC element mismatches

C.1 TIM-IF sum tree for path 3 and 7
C.2 TIM-IF sum tree for path 4 and 6
C.3 TIM-IF sum tree for path 5
C.4 Binary-to-thermometer schematic
C.5 Switch driver schematic
C.6 Calibration principle a) Calibration b) Operation

List of Tables

1.1 ∆Σ-DAC Design Specifications

3.1 Interpolation Filter Characteristics

4.1 CSA Staging Optimization
4.2 Binary-to-thermometer conversion and gate logic
4.3 Analog Back-end Transistor Properties
4.4 TIM ∆Σ-DAC Simulated Power Consumption

5.1 TIM ∆Σ-DAC Power Consumption
5.2 TIM ∆Σ-DAC Performance Summary

6.1 TIM ∆Σ-DAC Performance Comparisons

B.1 Analog Low-pass Filter Characteristics

C.1 A 95th-order Time-interleaved-by-8 Interpolation Filter Coefficients
C.2 TIM-IF Synthesized Performance
C.3 TIM-DSM Synthesized Performance

List of Acronyms

ADC Analog-to-Digital Converter


B2T Binary-to-Thermometer
CDM Code Division Multiplexing
CIFB Cascaded Integrators with distributed Feedback
CIFF Cascaded Integrators with Feed Forward coupling
CLA Carry Look-ahead Adder
CRFB Cascaded Resonators with distributed Feedback
CRFF Cascaded Resonators with Feed Forward coupling
CS Current-Steering
CSA Carry Select Adder
CSD Canonic Signed Digit
CS-DAC Current-Steering Digital-to-Analog Converter
DAC Digital-to-Analog Converter
dBFS Decibel with respect to Full-Scale
DFF D-flipflop
DFT Discrete Fourier Transform
DR Dynamic Range
DRC Design Rules Check
DSM Delta-Sigma Modulator
DSP Digital Signal Processing
DWA Data-Weighted Averaging
EFB Error-Feedback
EFB-DSM Error-Feedback Delta-Sigma Modulator
ENOB Effective Number of Bits
FCC Federal Communications Commission
FDM Frequency Division Multiplexing
FIR Finite Impulse Response


FO2 Fanout-of-2
FVT Filter Visualization Tool
IF Interpolation Filter
IIR Infinite Impulse Response
ILA Individual Level Averaging
IM2 Second-order Intermodulation
IM3 Third-order Intermodulation
IST Information Society Technologies
LPF Low-pass Filter
LSB Least Significant Bit
LTI Linear Time Invariant
LVS Layout Versus Schematic
MASH Multi-Stage Noise Shaping
MSB Most Significant Bit
MTPR Multi-tone Power Ratio
NTF Noise Transfer Function
OBG Out-of-band Gain
OFB Output-Feedback
OFDM Orthogonal Frequency Division Multiplexing
OSR Oversampling Ratio
ParBert Parallel Bit-Error-Rate
PCB Printed Circuit Board
PLL Phase Locked Loop
PVT Process, Voltage, and Temperature
RCA Ripple Carry Adder
RTL Register Transfer Level
SFDR Spurious-Free Dynamic Range
SISO Single-Input Single-Output
SNDR Signal to Noise plus Distortion Ratio
SNR Signal to Noise Ratio

STF Signal Transfer Function


STI Shallow Trench Isolation
TDM Time Division Multiplexing
TG Transmission Gate
THD Total Harmonic Distortion
TIM-DSM Time-interleaved Delta Sigma Modulator
TIM-IF Time-interleaved Interpolation Filter
UWB Ultra-Wide Band
WPAN Wireless Personal Area Networks
Chapter 1
Introduction

In recent years, there has been intense competition to develop ultra-high-data-rate wireless
communication systems for emerging broadband multimedia applications. The
demand for wideband wireless personal area networks (WPAN) or wireless local area networks
(WLAN), and for point-to-point or point-to-multipoint data links, continuously pushes the
capacity of wireless networks. Currently, the required transfer capacity exceeds what can be
accommodated in the widely used unlicensed bands (2.4GHz and 5.8GHz) for WLAN systems.
An alternative solution is to resort to higher bands where bandwidth is abundant. Namely,
in 2001, the Federal Communications Commission (FCC) set aside a continuous unlicensed
block of 7GHz of spectrum between 57GHz and 64GHz for short-range indoor WPAN/WLAN
applications. A year later, the FCC opened up 7.5GHz of spectrum between 3.1GHz and
10.6GHz for unlicensed short-range indoor ultra-wideband (UWB) applications [3].

As the complexity of wireless communication systems grows rapidly, research on
analog-to-digital converters (ADCs) and digital-to-analog converters (DACs) has become completely
independent of each other. While much of the research attention has been focused on the
design of oversampled delta-sigma ADCs (∆Σ-ADCs), their counterparts, delta-sigma DACs
(∆Σ-DACs), have been lagging behind. In reality, ∆Σ-DACs are equally important and
their implementations are often as challenging as those of ∆Σ-ADCs.


1.1. Motivations

The European IST (Information Society Technologies) has carried out continuous research
on system integration for 60GHz radios. It proposed
a hybrid dual-frequency system called BROADWAY, which integrates
HIPERLAN/2 (an existing 802.11a WLAN at 5GHz) with HIPERSPOT (an innovative fully
ad-hoc extension at 60GHz). As a result, the intermediate frequency is purposely set at
5GHz, as shown in figure 1.1. This integration is expected to result in wider acceptance and
lower cost for both systems while providing a new solution for dense urban deployments [4].

Figure 1.1: Block diagram of a 60GHz radio

While the final standards for 60GHz WPAN/WLAN and UWB are still under debate,
both systems are likely to employ multi-band OFDM (orthogonal frequency division
multiplexing) schemes capable of transmitting data in the 500Mb/s range [4, 5, 6, 7]. The DAC
accuracy required for these systems ranges from 4-5 bits for a UWB transmitter
[6, 7] to 8-10 bits for a 60GHz transmitter [8]. Since the 60GHz transmitter has the more
stringent requirements, this work focuses on designing a DAC for that application, which
will also suffice for a UWB design.

1.1.1. Oversampled ∆Σ vs. Nyquist-rate DAC


In the past, Nyquist-rate DACs were a popular choice for high-speed, wideband data
conversion, while oversampled ∆Σ-DACs were favoured for high resolution and high linearity.
In Nyquist-rate DACs, the design of the subsequent analog reconstruction filter can be
quite challenging because it requires a sharp cut-off and high attenuation. Furthermore, these
DACs require a large amount of analog circuitry, which is susceptible to analog circuit mismatches,
particularly in deep sub-micron processes (e.g., 90nm and 65nm). As CMOS scales,
matching degrades rapidly and becomes highly dependent on physical geometry and
layout. Also, severe short-channel effects not only increase leakage current but also
intensify electrical fluctuations such as threshold voltage variation [9, 10]. Consequently,
design trends lean toward simple, small analog circuitry.
In oversampled ∆Σ-DACs, more design emphasis is placed on the digital front-end, thus
relaxing the requirements on the analog back-end. For example, with the integration of
a digital interpolation filter, the requirements on the analog reconstruction filter can be
reduced. Also, since a ∆Σ modulator (DSM) modulates a multi-bit stream to only a few
bits, this cuts down the number of analog unit elements significantly.
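
The word-length reduction described above can be illustrated with a minimal behavioural sketch. It assumes a first-order error-feedback loop with a 16-level quantizer over [-1, 1], which is far simpler than the 3rd-order time-interleaved modulator developed in this work; the function name and parameters are hypothetical.

```python
def efb_dsm_first_order(samples, levels=16):
    """First-order error-feedback modulator sketch (assumed NTF = 1 - z^-1).

    Inputs are expected in [-1, 1]; outputs take one of `levels` values.
    """
    step = 2.0 / (levels - 1)            # quantizer step over [-1, 1]
    err_prev = 0.0                       # quantization error of the previous sample
    out = []
    for u in samples:
        w = u - err_prev                 # feed the previous error back to the input
        idx = round((w + 1.0) / step)    # index of the nearest quantizer level
        idx = max(0, min(levels - 1, idx))  # clip to the available levels
        v = -1.0 + idx * step            # coarse (few-bit) output
        err_prev = v - w                 # e[n] = v[n] - w[n]
        out.append(v)
    return out


# A DC input that is not a quantizer level is preserved on average.
ys = efb_dsm_first_order([0.3] * 1000)
print(sum(ys) / len(ys))                 # close to 0.3
```

Because the output equals the input plus the first difference of the quantization error, the error is pushed away from DC, and the coarse output still tracks the input on average.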
However, to date, most ∆Σ-DACs target high-linearity, high-resolution, narrow-bandwidth
applications such as digital audio. For 60GHz/UWB radio, where wide bandwidth is
the highest priority, ∆Σ-DACs face a major challenge: their effective bandwidth is narrow
because of their large oversampling ratios (OSR). While some published Nyquist
DACs can certainly meet the bandwidth requirement (e.g., [11, 12, 13, 14]), there has been
relatively little research on high-speed, wideband ∆Σ-DACs.
This motivates research into a high-speed, wideband ∆Σ-DAC that meets the demands
of broadband wireless applications and accommodates the analog design challenges
of deep sub-micron processes.

1.1.2. Lowpass vs. Bandpass ∆Σ-DAC


In a transmitter design, there are two design choices for the DAC: bandpass ∆Σ or lowpass
∆Σ. For a bandpass design, the in-phase (I) and quadrature (Q) data components from
the digital baseband are upconverted to an intermediate frequency before feeding into a
quadrature bandpass ∆Σ-DAC. To date, ∆Σ-DACs employing bandpass modulation are
still uncommon (e.g., [15, 16, 17]), let alone quadrature ∆Σ-DACs. Research into the utility
of quadrature ∆Σ-DACs indicates some promise [15, 18]. Nevertheless, at this point, the
research demonstrates feasibility rather than significant advantages.
For a lowpass design, a quadrature lowpass ∆Σ-DAC is equivalent to a pair of lowpass
modulators operating independently on the quadrature data components [1]. Thus, each I
or Q data component can operate on its own lowpass ∆Σ-DAC before upconverting to an
intermediate frequency, as depicted in figure 1.1. Certainly, quadrature mismatches between
the ∆Σ-DACs of the two channels will be a source of degradation. One method to combat
this error is through the use of quadrature mismatch shaping [19]. However, this topic is
beyond the scope of this work. In general, compared to the bandpass option, the lowpass
design is much less complex and has been proven feasible. Hence,
this motivates the design of a lowpass ∆Σ-DAC.

1.1.3. Time-interleaved vs. Conventional ∆Σ-DAC


One way to achieve high bandwidth in a ∆Σ-DAC is to reduce the OSR. The OSR can be
reduced while maintaining the resolution by increasing the order
of the ∆Σ modulator (DSM), by increasing the number of converter bits, or by a combination of
both. Unfortunately, increasing the DSM order can lead to loop instability, which reduces
the resolution and input dynamic range. One alternative is to employ a
multistage noise shaping (MASH) ∆Σ loop. The main disadvantage of this scheme is that
one must deal with analog mismatches when combining the stages' outputs. On
the other hand, increasing the number of converter bits can lead to high non-linearity due
to component mismatches in the subsequent DAC. Often, additional circuitry such as calibration,
mismatch shaping, or digital correction is required to correct for these errors. The
complexity of the DAC and its error-correction circuitry grows exponentially, as 2^k for a k-bit
DAC. Typically, k is chosen to be between 4 and 6 bits for practical implementations [1]. Some
sophisticated designs for ∆Σ-DACs with more than 6 bits, such as dual-truncation or segmentation,
can also yield high SNR but do not alleviate the non-linearity errors.
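
To give a feel for the 2^k growth, the short sketch below counts unit elements under the assumption of a fully thermometer-coded DAC, which needs 2^k - 1 equal-weight elements for k bits. The helper name is hypothetical, and spare or calibration cells are not counted.

```python
def unit_elements(bits):
    """Equal-weight elements in a fully thermometer-coded DAC (assumed: 2**k - 1)."""
    return 2 ** bits - 1


for k in (4, 6, 9):
    print(k, unit_elements(k))
```

Under this assumption, going from 4 to 9 bits raises the element count from 15 to 511, which is why keeping the modulator output to a few bits eases the analog back-end.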
Another way to increase the bandwidth of a ∆Σ-DAC is to parallelize the DSM into
multiple channels operating at lower speeds and then combine their outputs. Parallel DSMs can
be categorized into three groups based on their multiplexing schemes: frequency division
multiplexing (FDM), code division multiplexing (CDM), and time division multiplexing
(TDM).
Figure 1.2 shows the block diagram of a parallel system based on an FDM scheme. This
system contains a bank of parallel bandpass DSMs which have different band-reject noise
transfer functions and operate on different frequency sub-bands [20]. A digital bandpass
filter attenuates the out-of-band noise in each channel and allows for recombination of the
frequency-decomposed input signal [21]. This system has a high level of design and hardware
complexity because it requires many bandpass DSMs and bandpass filters, each with
a different center frequency.

Figure 1.2: Parallel ∆Σ modulator based on Frequency Division Multiplexing (FDM)

Figure 1.3 shows the block diagram of a parallel system based on a CDM scheme, which
is also known as a Π∆Σ modulator. In [22], a Hadamard transformation is used to
decompose the input into multiple spread-spectrum channels by modulating it with different
Hadamard sequences. Each of these channels is then modulated by a DSM, filtered by a
decimation filter, and demodulated by a delayed version of the same Hadamard sequence
before the channels are added together. There is still very little research on this parallel system
for DAC applications.

Figure 1.3: Parallel ∆Σ modulator based on Code Division Multiplexing (CDM)
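
The Hadamard sequences mentioned above can be generated with the Sylvester construction, sketched below. This only illustrates the orthogonal ±1 channel codes; it is not the Π∆Σ system of [22], and it omits the modulators, decimation filters, and delays. The function name is hypothetical.

```python
def hadamard(m):
    """Return an m x m Hadamard matrix of +1/-1 entries (Sylvester construction)."""
    if m < 1 or m & (m - 1):
        raise ValueError("m must be a power of two")
    h = [[1]]
    while len(h) < m:
        # [[H, H], [H, -H]] doubles the size and keeps the rows orthogonal.
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


H = hadamard(4)
for row in H:
    print(row)

# Distinct rows are orthogonal; each row has energy m.
dots = [[sum(a * b for a, b in zip(r1, r2)) for r2 in H] for r1 in H]
print(dots)
```

Each row serves as one channel's modulating sequence; orthogonality between rows is what allows the channels to be separated again after demodulation.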

Lastly, figure 1.4 shows the block diagram of a parallel system based on a TDM scheme.
Among the three structures, this is the simplest form of parallelism. Here, the
input is demultiplexed into M channels, each of which operates at 1/M of the input sampling
frequency. The channels are then recombined through unit delays and a multiplexer. However,
this brute-force approach yields only a small SNR improvement. Specifically, there
is only a 3dB improvement in SNR for each doubling of the number of parallel modulators,
regardless of their order [2].

Figure 1.4: Parallel ∆Σ modulator based on Time Division Multiplexing (TDM)

Alternatively, a modification of the TDM scheme proposed by Khoini-Poorfard
et al. significantly improves the SNR while meeting the low OSR requirement. In [2, 23],
a block digital filtering technique was used to transform a conventional DSM
into a time-interleaved DSM with interconnected channels. With M interconnected
channels running in parallel, as shown in figure 1.5, the total effective sampling rate becomes
M times the sampling rate of each channel. The improvement in SNR is 6(n + 1/2)dB for
each doubling of the number of nth-order modulators [2]. Furthermore, the preceding
interpolation filter can be time-interleaved based on a polyphase decomposition without an
increase in hardware complexity. Hence, this motivates the choice of the time-interleaved DSM
architecture for this work. Further details on the time-interleaved ∆Σ-DAC (TIM ∆Σ-DAC)
are discussed in chapter 2. To distinguish it from the conventional TDM scheme,
this modified TDM scheme is referred to as TIM from this point onward.

Figure 1.5: TIM ∆Σ modulator based on digital block filtering
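
The two per-doubling figures quoted from [2] can be compared numerically. The sketch below extrapolates them to M channels, taking a single modulator running at the per-channel rate as the reference; that extrapolation and the function names are assumptions made for illustration.

```python
import math


def snr_gain_tdm_db(m):
    """Plain TDM: about 3 dB per doubling of the channel count, for any order."""
    return 3.0 * math.log2(m)


def snr_gain_tim_db(m, order):
    """TIM: 6(n + 1/2) dB per doubling of the number of nth-order modulators."""
    return 6.0 * (order + 0.5) * math.log2(m)


M, ORDER = 8, 3   # interleaving factor and modulator order used in this work
print(snr_gain_tdm_db(M))          # 9.0
print(snr_gain_tim_db(M, ORDER))   # 63.0
```

For eight 3rd-order channels, the sketch gives 9dB for plain TDM against 63dB for TIM, which shows why the interconnected-channel scheme is preferred.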



1.1.4. Summary
In summary, the motivations for this work are:

• To push ∆Σ-DAC designs to higher speeds to meet the demands of broadband wireless
applications and to accommodate the design challenges of deep sub-micron processes.

• To employ a lowpass ∆Σ-DAC design due to its reasonable level of complexity and
feasibility.

• To employ a time-interleaved ∆Σ-DAC design to achieve broadband and high-resolution
performance.

Table 1.1 summarizes the design targets of this work. Due to the speed and system
integration requirements, the chosen technology is the STMicroelectronics 90nm CMOS
process with a 1V supply.

Table 1.1: ∆Σ-DAC Design Specifications

Design Parameter            Value                           Units
Data Rate (fN)              500                             MS/s
Data Bandwidth (fB)         250                             MHz
Oversampling Ratio (OSR)    8                               -
Sampling Rate (fS)          4                               GS/s
Resolution (ENOB)           9                               bits
Power                       100 - 120                       mW
Modulator Architecture      Lowpass, time-interleaved ∆Σ
Process                     ST 90nm CMOS 7M2T, 1V supply
Applications                UWB or 60GHz WPAN/WLAN

1.2. State-of-the-Art
As mentioned earlier, some published Nyquist DACs using CMOS technology can meet the
required specifications shown in table 1.1. For example, a recently published Nyquist DAC in
[11] has a measured resolution of at least 8 bits up to a bandwidth of 193MHz at a sampling
rate of 800MS/s while consuming 49mW. In [12] and [14], the bandwidth is up to 250MHz
for a sampling rate of 500MS/s and a resolution of 12 and 10 bits while consuming 216mW
and 125mW, respectively. Another impressive design in 0.35µm CMOS [13] goes beyond
the required specifications with a bandwidth of 500MHz for a sampling rate of 1GS/s and
10-bit resolution while consuming 110mW. For a conventional (i.e., without parallelism)
lowpass ∆Σ-DAC fabricated in CMOS technology, the most relevant work in 0.5µm could
only achieve 5MHz bandwidth [24].

The idea of TDM parallelism was introduced many years ago but is still uncommon
in DAC applications. For instance, the DAC in [18] employs a heterodyne technique to
commutate the output of the polyphase interpolation filter into multiple parallel paths.
Each path has its own DSM to perform the modulation which is then time-interleaved in
the digital domain before feeding into the single DAC. While time-interleaving, the parallel
spectra are aligned such that the signal-band experiences coherent gain while the noise-band
experiences destructive cancellation. Although only simulation results were presented in [18],
it promises a solution for a wideband ∆Σ-DAC, as well as the possibility of a quadrature
parallel ∆Σ-DAC.

A design fabricated in 0.13µm CMOS utilizing TDM parallelism is presented in [25].
Here, the digital input stream is modulated into a 6-bit digital output through a 3rd-order
DSM being oversampled by a factor of 6. There are two parallel channels with each operating
on a separate DAC in a time-interleaving manner. This design achieves a signal bandwidth
of 29MHz at an oversampling rate of 350MHz and a resolution of almost 12 bits while
consuming 62mW. Although this design has an impressive resolution, its signal bandwidth
is insufficient for broadband applications.

Similar to the case of conventional ∆Σ modulation, much research effort has been focused
on TIM ∆Σ-ADCs while TIM ∆Σ-DACs have been largely overlooked. In [26], a 2nd-order
time-interleaved-by-2 (TIM2) DSM with a single-bit DAC is implemented on an FPGA chip
just to show the significant SNR improvement (i.e., 15dB) over that of a conventional ∆Σ-DAC.
In [27], the simulation results for a 2nd-order MASH structure with TIM4 and a 6-bit
DAC are presented. Although the four internal channels of each DSM are interconnected,
each channel has its own DAC and the recombination is done in the analog domain. This
design is prone to two sources of mismatch errors which come from the parallel DACs and
from the multistage nature of a MASH system. The simulation results based on a 0.18µm
CMOS process show a resolution of 12 bits and a signal bandwidth of 40MHz (for an effective
sampling rate of 640MS/s).
Although an interpolation filter (IF) has not yet been mentioned here, it is a required
block preceding the DSM in an oversampled ∆Σ-DAC. Its purpose is to increase the input
sampling frequency (fN ) by an OSR factor and to suppress all unwanted replicas of the signal
between baseband and OSR · fN . Aside from a brief mention of a time-interleaved simple
zero-order hold interpolator in [26], the design of an IF is omitted in all ∆Σ-DACs described
in [24], [25] and [27]. In these designs, the input is interpolated and filtered externally before
feeding into the DSM. Unlike the previous works, this thesis integrates a complete design of
a time-interleaved interpolation filter (TIM-IF).
To date, no fabricated design has been reported for a TIM ∆Σ-DAC with an integrated
TIM-IF. This provides further motivation to carry out this work.

1.3. Thesis Outline


This thesis focuses on the design and implementation of a time-interleaved ∆Σ-DAC for
broadband wireless applications. The flow of the thesis is as follows.
Chapter 2 introduces an overview of ∆Σ-DAC architectures and the theoretical back-
ground of ∆Σ modulation for both conventional and time-interleaved structures. The idea
of block digital filtering is introduced as a method to transform a conventional IF and DSM
into a time-interleaved IF and DSM.
Chapter 3 discusses the system-level design of a ∆Σ-DAC using Matlab based on the previous
theoretical background. Specific architectural details on each sub-block are discussed,
namely, the time-interleaved interpolation filter, the time-interleaved ∆Σ modulator, the
multi-bit DAC, and the analog reconstruction filter. These sub-blocks are the essential
components to form a complete TIM ∆Σ-DAC. System-level simulation results are also
presented.
The circuit and physical implementation of this TIM ∆Σ-DAC are described in chapter 4.
This design is implemented in the STMicroelectronics 90nm CMOS process using Cadence,
Synopsys, VHDL, and First Encounter. The integration of this TIM ∆Σ-DAC is divided into
3 parts: a digital baseband front-end, a high-speed digital interface, and a high-speed analog
back-end. Circuit-level simulation results are also presented.
Chapter 5 presents the experimental results and some testing issues of the fabricated
TIM ∆Σ-DAC. The accuracy and linearity performance of TIM ∆Σ-DAC are measured.
Finally, chapter 6 summarizes the current work and gives suggestions for future work.
Chapter 2
Theoretical Background

In this chapter, the theoretical background of ∆Σ modulation is introduced. Both
conventional and time-interleaved ∆Σ architectures are presented with an emphasis on
digital-to-analog converter applications. A block digital filtering technique is also presented here
as a method to transform a conventional ∆Σ modulator (DSM) into a time-interleaved ∆Σ
modulator (TIM-DSM).

2.1. System Architecture for ∆Σ-DAC


Whereas in ∆Σ-ADCs, the quantization error is spectrally shaped, the noise being shaped in
∆Σ-DACs is instead the truncation error from the finite word-length of its digital circuitry.
Compared with Nyquist-rate DACs, more design emphasis is placed on the digital front-end
in ∆Σ-DACs, which allows the use of a relatively robust and simple analog back-end.

Figure 2.1 illustrates the basic architecture of a ∆Σ-DAC. The digital front-end contains
the interpolation filter (IF) and the DSM, while the analog back-end contains the multi-bit
DAC and the analog reconstruction filter. Figure 2.2 shows the signal spectrum at each
internal node of the ∆Σ-DAC. The input signal, x, is an N-bit digital stream sampled at the
Nyquist rate fN.

The IF serves two purposes: to raise the input frequency (fN ) by an oversampling ratio


Figure 2.1: ∆Σ-DAC block diagram

Figure 2.2: Spectrum at each internal node in ∆Σ-DAC [1]

(OSR) and to suppress all unwanted replicas of the signal between baseband and sampling
frequency (i.e., fS = OSR · fN), which arise due to the upsampling. The out-of-band
attenuation of the IF improves the dynamic range of the DSM since larger signals can be
accommodated. In addition, it reduces the attenuation requirements on the analog filter
since only out-of-band truncation noise needs to be suppressed. Finally, the amount of
intermodulated out-of-band noise that can fold back into the signal band is reduced, thus
relaxing the analog filter linearity requirements.
The DSM truncates the word-length of its signal to k bits where k < N . The modulator
output contains the input signal, as well as the filtered truncation noise caused by the reduced

word-length. Similar to an analog DSM, an ideal 1-bit modulator would yield an inherently
linear DAC. However, it may cause loop instability, as well as make the analog filter’s design a
challenging task due to the high-frequency content of the high slew-rate output. In contrast,
multi-bit modulators improve both loop stability and noise shaping capability by allowing a
higher order noise transfer function. Also, they contain less out-of-band noise and lower slew-
rate requirements which significantly reduce the complexity of the analog filter. However,
additional circuitry is required to correct for the nonlinearity of a multi-bit DAC. Overall,
the performance and design benefits outweigh the additional hardware, thus favouring the
multi-bit structure in most ∆Σ-DACs, e.g., [24, 28, 29, 30, 31].
Ideally, the DAC should produce an analog signal at its output without any distortion.
Thus, its output spectrum should be identical to that at the output of the DSM. Finally, the
analog reconstruction low-pass-filter should suppress most of the out-of-band noise, leaving
only the signal spectrum within the band of interest.

2.2. ∆Σ Modulator Architectures


There are many different DSM architectures for a ∆Σ-DAC. All of the configurations (i.e.:
MASH, CIFB, CIFF, CRFB, CRFF, etc) available for ∆Σ-ADCs are also valid for ∆Σ-
DACs. Certainly, the components in these configurations are now digital instead of analog;
for example, the integrators are replaced by accumulators which use digital adders and
multipliers instead of op amps and switched capacitors. Since all signals in the modulator
are digital to begin with, there is no need for internal data conversion as in the case of ADCs.
The signal processing in the loop can be highly accurate and it is unnecessary to account
for any analog imperfections when predicting the loop behaviour. As a consequence, this
allows the use of a highly efficient error-feedback (EFB) structure which is impractical in
ADC design. General background details on conventional ∆Σ modulators can be found in
Appendix A, while the details on the EFB ∆Σ modulator are discussed here.

2.2.1. Error-Feedback ∆Σ Modulator Architecture


The architecture for an Error-Feedback ∆Σ Modulator (EFB-DSM) is shown in figure 2.3.
While this architecture is highly efficient for DACs, it is never used in ADCs which are overly
sensitive to the imperfections of the analog loop filter H(z) and the analog subtraction needed
to generate the quantization error E(z) [1]. Unlike an analog DSM, the quantizer (Q) is now
replaced by a truncator (T) in the digital DSM.

Figure 2.3: Linear model of a single-bit error-feedback ∆Σ modulator

In the EFB structure, instead of feeding back the MSBs of the output V(z), the discarded
LSBs (i.e.: the truncation error E(z)), are fed back to the input. The digital loop filter, H(z),
is now located in the feedback path rather than in the forward path as in the conventional
DSM. Linear analysis shows that the transfer function for an EFB-DSM is:

$$V(z) = U(z) + [1 - H(z)]\,E(z) \tag{2.1}$$

in which $STF(z) = 1$ and $NTF(z) = 1 - H(z)$.


For a first-order modulator, since $NTF(z) = (1 - z^{-1})$ from equation A.1, this results in
$H(z) = z^{-1}$, or simply a single digital delay. For an $n$th-order EFB-DSM, the system can be
designed by solving for $H(z)$ from the $NTF(z)$ in equation A.3:

$$NTF(z) = (1 - z^{-1})^n = 1 - H(z)$$

$$\Rightarrow H(z) = 1 - NTF(z) = 1 - (1 - z^{-1})^n \tag{2.2}$$
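
To make equation 2.2 concrete, the following is a minimal Python sketch (using NumPy, which is an assumption of this example and not a tool used in this work) that expands $(1 - z^{-1})^n$ and returns the tap values of $H(z)$. The function name `efb_loop_filter` is hypothetical.

```python
import numpy as np

def efb_loop_filter(order):
    """Return (ntf, h): coefficients of NTF(z) = (1 - z^-1)^order and of
    H(z) = 1 - NTF(z), both in ascending powers of z^-1."""
    ntf = np.array([1.0])
    for _ in range(order):
        ntf = np.convolve(ntf, [1.0, -1.0])  # multiply by (1 - z^-1)
    h = -ntf
    h[0] += 1.0  # 1 - NTF(z): the z^0 term cancels, so H(z) starts at z^-1
    return ntf, h

ntf1, h1 = efb_loop_filter(1)   # first-order: H(z) = z^-1
ntf3, h3 = efb_loop_filter(3)   # third-order: H(z) = 3z^-1 - 3z^-2 + z^-3
```

For a third-order modulator this gives an $H(z)$ with three non-zero taps, all of which are small integers, so the feedback filter needs no general-purpose multipliers.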

2.2.2. Error-Feedback ∆Σ Modulator Stability Analysis


The design of a digital DSM encounters similar problems to those of an analog DSM plus some
different ones. Instead of dealing with element-matching errors and op amp non-idealities,

the dominant errors in a digital DSM are due to coefficient truncation and round-off errors
of the digital operations [1]. Similar to an analog DSM, these can affect the modulator’s
noise shaping capability. Further details on this topic will be discussed later in chapter 4.
A higher-order (i.e., 3rd-order and higher) EFB-DSM is often chosen to achieve stronger
in-band noise shaping, which directly corresponds to a higher ENOB. However, a high-order
EFB-DSM is prone to instability when the input to the truncator (i.e., Y(z) in
figure 2.3) grows beyond the operating range of the digital number representation.
For signed or unsigned arithmetic, an overflow causes Y(z) to saturate to its largest pos-
sible value. However, for 2’s complement, overflows cause Y(z) to wrap around, implying the
output V(z) suddenly decreases with increasing Y(z). While saturation is usually acceptable,
wrap-around causes large errors and must be prevented [1]. Since 2’s complement arithmetic
operations are generally advantageous, it is critical to resolve the overflow wrap-around
problem. By adding a digital limiter before the truncator [32] as shown in figure 2.4, Y(z) will
saturate before an overflow can occur.
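
The difference between wrap-around and saturation can be illustrated with a small Python sketch. The 8-bit word length and the helper names `wrap_twos_complement` and `saturate` are assumptions chosen only for this example.

```python
def wrap_twos_complement(value, bits):
    """Interpret an integer as a `bits`-wide 2's complement word (overflow wraps)."""
    span = 1 << bits
    return (value + span // 2) % span - span // 2

def saturate(value, bits):
    """Clamp an integer to the range of a `bits`-wide 2's complement word."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))

y = 100 + 40                          # exceeds the 8-bit maximum of 127
wrapped = wrap_twos_complement(y, 8)  # -116: a large error with the wrong sign
limited = saturate(y, 8)              # 127: a small, bounded error
```

Without the limiter the sum jumps from near positive full scale to a large negative value, whereas the limiter holds it at the largest representable value.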

Figure 2.4: Single-bit error-feedback ∆Σ modulator with digital limiter

In addition to the external limiter, certain conditions must be imposed on the truncator
input to improve the modulator’s robustness and stability. Much research has been focused
on improving DSM stability, yet there is still no solid theoretical explanation to predict this
behaviour for high-order DSMs. A conservative empirical rule from Lee’s criterion, which
applies only to single-bit modulators, requires the NTF’s out-of-band gain (OBG) to be
less than 1.5 (i.e., $\max|NTF(e^{j\omega})| < 1.5$) [33]. For a multi-bit modulator, a stability condition
proposed by Richard Schreier [1] determines how many input truncation levels are
needed to keep the DSM stable. The condition states that for any input less than half of the
quantizer input range, A, the modulator is guaranteed not to experience overloading (i.e.:

max|u(n)| <A/2 + 2). While these conditions ensure stability of the DSMs, they dramat-
ically reduce the input dynamic range (DR) and thus, limit the achievable performance of
higher order modulators.
A bit-wise stability analysis on EFB-DSM (figure 2.5) in [33] allows higher dynamic range,
as well as higher out-of-band gain. In the EFB architecture, since H(z) is an FIR transfer
function, there is no need for an accumulator, unlike conventional output-feedback (OFB)
topologies. Hence, the word-length at all internal nodes can be predicted without complex
numerical analysis.

Figure 2.5: Bit-wise analysis of error-feedback ∆Σ modulator

Let U(z) be a digital input stream of word-length N and T be a k-bit truncator. The
input summer adds at most 1 bit to give (N + 1) bits to the truncator. Here, the truncation
is done by simply splitting k MSB bits to V(z) and feeding back (N + 1 − k) LSB bits to the
loop filter H(z). Also, assume that the number of additional bits due to H(z) (i.e., $n_{H(z)}$) is
the same as its order. This is a reasonable assumption since the number of taps of H(z) is the
same as its order. Hence, in order to keep all internal signals bounded and as long as H(z)
is an FIR filter, the number of bits at the output of H(z) can only be at most N and thus:

$$(N + 1 - k) + n_{H(z)} = N$$

$$\Rightarrow n_{H(z)} = k - 1 \tag{2.3}$$

Equation 2.3 implies that in order to have all internal signals bounded, the order of H(z)
must be 1 less than the number of truncating bits. In other words, the stability criterion is:
An error-feedback modulator with a k-bit truncator and a loop filter of
order (k-1) is stable. [33]

The simulation results of an EFB system in [33] based on the above criterion show a
superior performance in both stability and signal-to-noise ratio over a conventional OFB
system. The k-bit EFB system can tolerate a full-scale out-of-band gain (OBG) of $2^{k-1}$
while the equivalent OFB system can only tolerate an OBG of up to approximately 3.5.
The combination of both a limiter and a stability-criterion-based EFB modulator design
shown in figure 2.6 ensures that the modulator is stable and robust.
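
The behaviour described above can be checked with a short behavioural model. The Python sketch below is an illustration only and is not the implementation used in this work: it assumes a 3rd-order loop filter $H(z) = 3z^{-1} - 3z^{-2} + z^{-3}$ from (2.2), a 12-bit limiter range, a truncator that keeps the 4 MSBs by discarding 8 LSBs, and a sign convention in which the discarded LSBs are added back through H(z). All names (`efb_dsm`, `drop_bits`, `word_bits`) are hypothetical.

```python
import numpy as np

def efb_dsm(u, h=(3, -3, 1), drop_bits=8, word_bits=12):
    """Integer model of an error-feedback modulator with a limiter.
    Returns the truncated output v and the discarded LSBs e."""
    lo, hi = -(1 << (word_bits - 1)), (1 << (word_bits - 1)) - 1
    e_hist = [0] * len(h)                  # e[n-1], e[n-2], e[n-3]
    v = np.empty(len(u), dtype=np.int64)
    e = np.empty(len(u), dtype=np.int64)
    for n, un in enumerate(u):
        y = int(un) + sum(c * past for c, past in zip(h, e_hist))
        y = max(lo, min(hi, y))            # digital limiter before the truncator
        v[n] = (y >> drop_bits) << drop_bits   # keep the MSBs
        e[n] = y - v[n]                    # discarded LSBs, always in [0, 2^drop_bits)
        e_hist = [int(e[n])] + e_hist[:-1]
    return v, e

n = np.arange(2048)
u_demo = np.round(900 * np.sin(2 * np.pi * n / 256)).astype(np.int64)
v_demo, e_demo = efb_dsm(u_demo)
# With this convention u - v equals the error shaped by NTF(z) = (1 - z^-1)^3
shaped = np.convolve(e_demo, [1, -3, 3, -1])[:len(u_demo)]
```

Because H(z) is FIR and the fed-back error is bounded by the truncation itself, the truncator input stays bounded for this input amplitude and the limiter never engages; `shaped` reproduces `u_demo - v_demo` exactly.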

Figure 2.6: Stable error-feedback ∆Σ modulator



2.3. Time-interleaved ∆Σ Modulator


Unlike the TDM structure in figure 1.4, a parallel ∆Σ structure proposed in [2] uses M
mutually cross-coupled DSMs, in which each operates in parallel at the same clock rate.
This results in a total effective sampling rate of M times the rate of each modulator. The
main concept is to use a polyphase decomposition and block digital filtering to transform
the single-input single-output (SISO) transfer function to an equivalent MxM matrix form.
Based on this transfer function matrix, along with a commutator at both the front and back
ends, an equivalent time-interleaved-by-M architecture can be derived.

2.3.1. Polyphase Decomposition


Polyphase decomposition is very popular in multirate DSP applications such as decimation
and interpolation filters, and Discrete Fourier Transform (DFT) filter banks. A detailed
polyphase decomposition technique was described in [34] and is summarized in brief here.
Let $H(z) = \sum_{n=-\infty}^{\infty} h(n) z^{-n}$ represent the transfer function of a digital filter, which can
be rewritten in the form:

$$H(z) = [\cdots + h(-2)z^{2} + h(0) + h(2)z^{-2} + \cdots] + z^{-1}[\cdots + h(-1)z^{2} + h(1) + h(3)z^{-2} + \cdots] \tag{2.4}$$

Essentially, equation 2.4 groups the impulse-response coefficients h(n) into even samples,
$e_0(n) = h(2n)$, and odd samples, $e_1(n) = h(2n + 1)$. If the z-transforms of $e_0$ and $e_1$ are $E_0(z)$
and $E_1(z)$, respectively, then:

$$E_0(z) = \sum_{n=-\infty}^{\infty} h(2n) z^{-n} \quad \text{and} \quad E_1(z) = \sum_{n=-\infty}^{\infty} h(2n + 1) z^{-n}$$

Thus, H(z) can be re-expressed as:

$$H(z) = E_0(z^2) + z^{-1} E_1(z^2) \tag{2.5}$$

The quantities E0 (z) and E1 (z) are the polyphase components of H(z) and the representa-
tion in 2.5 is called the two-component polyphase decomposition of H(z). This decomposition

is valid whether H(z) is an FIR or an IIR filter. Also, it is possible to extend
H(z) to an M-component polyphase decomposition in the form:

$$H(z) = \sum_{k=0}^{M-1} z^{-k} E_k(z^M) \tag{2.6}$$

where the polyphase components $E_k(z)$ are defined as

$$E_k(z) = \sum_{n=-\infty}^{\infty} h(nM + k) z^{-n}, \quad 0 \le k \le M - 1 \tag{2.7}$$

Basically, the impulse-response coefficients h(n) have been divided into M groups, and the
$E_k(z)$ are simply the z-transforms of the M-fold decimated versions of h(n). Here, this form of H(z) is called a Type
1 polyphase decomposition and the $E_k(z)$ are called the Type 1 polyphase components. A Type 2
polyphase decomposition is similar to Type 1, except that the components are renumbered:

$$H(z) = \sum_{k=0}^{M-1} z^{-(M-1-k)} R_k(z^M) \quad \text{where} \quad R_k(z) = E_{M-1-k}(z) \tag{2.8}$$
Type 1 and Type 2 decompositions are well suited for the design of decimation and
interpolation filters, respectively. Their implementations can be found in full detail in [34].
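
For a causal FIR filter, the Type 1 components of (2.7) are obtained by taking every M-th coefficient starting at offset k. The following Python sketch (NumPy and the function names `polyphase_type1` and `recombine` are assumptions of this example) splits an impulse response into its components and rebuilds it according to (2.6).

```python
import numpy as np

def polyphase_type1(h, M):
    """Type 1 components e_k(n) = h(nM + k), k = 0..M-1, of a causal FIR h.
    h is zero-padded so that every component has the same length."""
    h = np.asarray(h, dtype=float)
    h = np.concatenate([h, np.zeros((-len(h)) % M)])
    return [h[k::M] for k in range(M)]

def recombine(components, M):
    """Rebuild h from H(z) = sum_k z^-k E_k(z^M)."""
    h = np.zeros(M * len(components[0]))
    for k, e_k in enumerate(components):
        h[k::M] = e_k   # E_k(z^M) spaces the taps M apart; z^-k shifts them by k
    return h

h_demo = np.arange(1.0, 8.0)             # h = [1, 2, ..., 7]
E = polyphase_type1(h_demo, 3)           # [1,4,7], [2,5,0], [3,6,0]
R = E[::-1]                              # Type 2 components: R_k = E_{M-1-k}
h_back = recombine(E, 3)[:len(h_demo)]
```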

2.3.2. Block Digital Filtering


A block digital filter is a multirate system where parallelism is used to reduce the processing
speed of each element [23]. Consider an SISO linear time-invariant (LTI) system with transfer
function H(z) as shown in figure 2.7(a). According to [2], the digital blocked versions of length
M for the input u(n) and output v(n) are:

$$\mathbf{u}(n) = [u_{M-1}(n), u_{M-2}(n), \ldots, u_0(n)]^T \tag{2.9}$$

$$\mathbf{v}(n) = [v_{M-1}(n), v_{M-2}(n), \ldots, v_0(n)]^T \tag{2.10}$$

where $u_k(n) = u(nM + k)$ and $v_k(n) = v(nM + k)$ for $0 \le k \le M - 1$. Equations 2.9 and
2.10 closely resemble the polyphase decomposition form in (2.7). In fact, their components
are M-fold decimated versions of u(n) and v(n).
Hence, the z-transforms of the two vector sequences are related by an MxM transfer matrix
$\mathbf{H}(z)$, i.e.:

$$\mathbf{V}(z) = \mathbf{H}(z)\,\mathbf{U}(z) \tag{2.11}$$

Figure 2.7: (a) Scalar transfer function, (b) Time-interleaved-by-M version [2]

The MxM matrix $\mathbf{H}(z)$ is a blocked version of the scalar H(z) and its implementation is called block
digital filtering. Figure 2.7(b) depicts a time-interleaved-by-M version of a scalar system.

From [2], the general structure of $\mathbf{H}(z)$ is:

$$\mathbf{H}(z) =
\begin{bmatrix}
E_0(z) & E_1(z) & E_2(z) & \cdots & E_{M-1}(z) \\
z^{-1}E_{M-1}(z) & E_0(z) & E_1(z) & \cdots & E_{M-2}(z) \\
z^{-1}E_{M-2}(z) & z^{-1}E_{M-1}(z) & E_0(z) & \cdots & E_{M-3}(z) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
z^{-1}E_1(z) & z^{-1}E_2(z) & z^{-1}E_3(z) & \cdots & E_0(z)
\end{bmatrix}
=
\begin{bmatrix}
H_{11} & H_{12} & H_{13} & \cdots & H_{1M} \\
H_{21} & H_{22} & H_{23} & \cdots & H_{2M} \\
H_{31} & H_{32} & H_{33} & \cdots & H_{3M} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
H_{M1} & H_{M2} & H_{M3} & \cdots & H_{MM}
\end{bmatrix}
\tag{2.12}$$

The elements in the first row of the matrix $\mathbf{H}(z)$ are indeed the Type 1 polyphase components
of H(z) as defined in (2.6). Each element $H_{ij}$ corresponds to the contribution of the $j$th
input to the $i$th output. For example, $H_{12}$ would correspond to the contribution of the input
of path 2 to the output of path 1.
In the matrix $\mathbf{H}(z)$, each row is a circularly shifted version of the row above it except
for the elements below the diagonal entries which also contain a delay. This type of matrix
is called pseudo-circulant, which is a necessary condition for the block digital filter $\mathbf{H}(z)$ to
represent a SISO linear time-invariant transfer function H(z) [23].
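
The pseudo-circulant structure can be verified numerically. The Python sketch below (NumPy and the function name `block_filter` are assumptions of this example) applies an FIR filter through the M x M block structure of (2.12), using the vector ordering of (2.9) and (2.10), and compares the result with direct scalar filtering.

```python
import numpy as np

def block_filter(h, u, M):
    """Filter u with FIR h using the M x M pseudo-circulant block structure.
    len(u) is assumed to be a multiple of M."""
    h = np.concatenate([h, np.zeros((-len(h)) % M)])
    E = [h[k::M] for k in range(M)]            # Type 1 polyphase components
    n_blocks = len(u) // M
    # Column c carries the input of path c+1, i.e. u_{M-1-c}(n) = u(nM + M-1-c)
    U = [u[M - 1 - c::M][:n_blocks] for c in range(M)]
    v = np.zeros(n_blocks * M)
    for r in range(M):                          # row r produces v_{M-1-r}(n)
        acc = np.zeros(n_blocks)
        for c in range(M):
            if c >= r:                          # on or above the diagonal
                term = np.convolve(U[c], E[c - r])[:n_blocks]
            else:                               # below the diagonal: extra z^-1
                term = np.convolve(U[c], E[M + c - r])[:n_blocks]
                term = np.concatenate([[0.0], term[:-1]])
            acc += term
        v[M - 1 - r::M] = acc                   # output commutator
    return v

rng = np.random.default_rng(0)
h_demo = rng.standard_normal(7)
u_demo = rng.standard_normal(24)
v_block = block_filter(h_demo, u_demo, 4)
v_scalar = np.convolve(u_demo, h_demo)[:len(u_demo)]
```

Each of the M paths processes only one sample per block, so every element runs at 1/M of the scalar rate while the interleaved output matches the scalar filter.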

2.3.3. Time-interleaved Error-Feedback ∆Σ Modulator


A detailed realization of a time-interleaved ∆Σ Modulator (TIM-DSM) was proposed in [2,
23]. Another TIM-DSM structure was proposed in [21, 35] that is hardware efficient for more
complex structures like CIFB and CIFF. In this work, the chosen ∆Σ modulator architecture
is error-feedback (EFB). Since this structure contains no integrators or accumulators as in
an OFB structure, the method described in [21, 35] does not result in any improvement over
that in [2, 23]. Hence, the time-interleaved (TIM) realization method described in [2, 23] is
used in this work. The time-interleaved version of an EFB structure based on block digital
filtering is shown in figure 2.8. Unless mentioned otherwise, the term TIM-DSM specifically
refers to the time-interleaved implementation of an EFB-DSM.
To illustrate the realization of a TIM-DSM, a first-order, single-bit, time-interleaved-by-2
(i.e., M=2) modulator is used as an example. From (2.2), the loop filter for a first-order
EFB modulator is $H(z) = z^{-1}$. Using the two-component polyphase decomposition of H(z)
from (2.5), it was found that $E_0(z) = 0$ and $E_1(z) = 1$. Thus, by substituting $E_0(z)$ and
$E_1(z)$ into (2.12), the matrix blocked version of H(z) is:

$$\mathbf{H}(z) = \begin{bmatrix} 0 & 1 \\ z^{-1} & 0 \end{bmatrix} \tag{2.13}$$

Since $H_{ij}$ corresponds to the contribution of the $j$th input to the $i$th output, the architecture
of a TIM-DSM can be realized as depicted in figure 2.9. Compared to the equivalent
OFB modulator in [2], the EFB structure has a lower level of circuit complexity and requires

Figure 2.8: Linear model of time-interleaved error-feedback ∆Σ modulator

much less hardware. These savings would become even more significant for a higher order
modulator and for a higher time-interleaving factor, M.

Figure 2.9: First-order time-interleaved-by-two ∆Σ modulator

In summary, the steps to realize the architecture of any TIM-DSM are as follows:

1. Determine H(z) from equation (2.2)

2. Perform M-component polyphase decomposition of H(z) using equation (2.6) and (2.7)

3. Substitute $E_k(z)$ into equation (2.12) to find the MxM matrix $\mathbf{H}(z)$

4. Use the relations $H_{ij}$ to realize the feedback filters in a TIM architecture
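
Applying these steps to the first-order, time-interleaved-by-2 example gives a structure that can be checked against the conventional modulator. The Python sketch below is a behavioural illustration only: it uses an integer truncator that discards 4 LSBs rather than a single-bit one, and it assumes that path 1 carries the odd samples and path 2 the even samples, following the ordering in (2.9). The function names are hypothetical.

```python
import numpy as np

def truncate(y, drop_bits):
    """Split y into the kept MSBs (v) and the discarded LSBs (e)."""
    v = (y >> drop_bits) << drop_bits
    return v, y - v

def efb1(u, drop_bits=4):
    """Conventional first-order EFB modulator, H(z) = z^-1."""
    e_prev, out = 0, []
    for un in u:
        v, e_prev = truncate(int(un) + e_prev, drop_bits)
        out.append(v)
    return np.array(out, dtype=np.int64)

def efb1_tim2(u, drop_bits=4):
    """Time-interleaved-by-2 version built from the matrix in (2.13)."""
    u0, u1 = u[0::2], u[1::2]                 # even and odd input samples
    e1_prev, v0, v1 = 0, [], []
    for a, b in zip(u0, u1):
        va, e0 = truncate(int(a) + e1_prev, drop_bits)   # H21 = z^-1: uses e1(n-1)
        vb, e1_prev = truncate(int(b) + e0, drop_bits)   # H12 = 1: uses e0(n) directly
        v0.append(va)
        v1.append(vb)
    out = np.empty(2 * len(v0), dtype=np.int64)
    out[0::2], out[1::2] = v0, v1             # output multiplexer
    return out

rng = np.random.default_rng(0)
u_demo = rng.integers(0, 256, size=64)
v_conv = efb1(u_demo)
v_tim = efb1_tim2(u_demo)
```

The two outputs are identical sample by sample. Note that the odd path needs the error of the even path from the same block, which is the source of the longer critical path discussed in the next section.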



2.3.4. Practical Considerations


A main issue to be considered in the design of a TIM-DSM is the critical path. In the
previous example, the critical path consists of two multi-bit adders and two one-bit adders,
whereas in the non-TIM case, the critical path contains half the number of components.
Clearly, if the critical path slows down the TIM-DSM by a factor of 2, this would totally
defeat the purpose of time-interleaving in the first place. However, clever design techniques
can eliminate this problem to a great extent.
One approach to reduce the critical path was proposed in [36]; it uses a vector quantizer
to form two parallel matrix transformations that move intensive computations outside the
feedback loop. This technique has a high level of circuit complexity and requires a large
amount of hardware. Also, the number of unique comparisons required in the decision
circuitry grows exponentially with the time-interleaving factor which practically limits the
level of parallelism. Thus, for a design that requires a large number of parallel paths, this
technique may not result in a significant critical path delay improvement.
Another approach is to pipeline the adders such that each adder operates as soon as it
receives the LSB. The timing analysis in [2] shows that when using ripple-carry adders, the
delay in this example is about 12% higher than that of a non-TIM case. The critical timing
is expected to improve for even faster adder architectures (e.g.: carry-save, carry-select, or
carry-lookahead adders) as long as they are realized in a pipelined fashion.
In this work, both pipelined carry-save and carry-select adders are suitable with compa-
rable critical paths. To add multiple numbers together, a carry-select adder tree results in
a shorter tree depth than that of a carry-save adder. Also, at the last stage of carry-save
adder tree, a ripple-carry adder is required, which can result in a long critical path for a large
word length (> 10 bits). A hybrid solution is to use carry-save adders throughout the sum
tree, except for the last stage where a carry-select adder is used. For simplicity, only pipelined
carry-select adders are used in this work.

2.4. Time-interleaved Interpolation Filter


As mentioned earlier, an interpolation filter (IF) is required before the DSM. It consists of
an OSR-fold interpolator and a lowpass filter with a band edge of (π/OSR) as shown in
figure 2.10. The interpolator increases the input sampling frequency by a factor of OSR by
inserting (OSR − 1) zeros between adjacent samples of X(z). The subsequent filter G(z)
eliminates all unwanted images of the input signal.

Figure 2.10: Conventional interpolation filter

For a conventional IF followed by a TIM-DSM, the input is first upsampled by the inter-
polator then downsampled by the DSM’s input demultiplexer. By applying the same digital
block filtering technique on G(z), the IF can also be transformed into a time-interleaved IF
(TIM-IF). If the time-interleaving factor for the IF is the same as that of the DSM (i.e., M),
no multiplexer/demultiplexer is needed between them. Furthermore, if M is chosen to be
the same as OSR, the upsampler preceding the IF is also eliminated. Figure 2.11 shows the
TIM-IF integrated together with the TIM-DSM. Here, all sub-blocks operate at a sampling
rate of fN · (OSR/M), except the output multiplexer, which operates at a sampling rate of fS
(i.e., fN · OSR).

Note that the upsampler preceding G(z) in a conventional IF ensures that only one of
every M inputs is nonzero. This simplifies the G(z) of a TIM-IF from an MxM matrix down
to an Mx1 matrix. In other words, this reduces the required number of elements of G(z)
from $M^2$ down to $M$. Thus, only the first column of G(z) is implemented, resulting in a
TIM-IF having the same complexity as a conventional IF but operating at (1/M)th the rate
[2]. It should also be noted that unlike the case of a TIM-DSM where the internal paths are
interconnected, the paths of TIM-IF are independent of each other. Using the same steps to

Figure 2.11: Time-interleaved interpolation filter and time-interleaved ∆Σ modulator

realize a TIM structure from the previous section, an Mx1 matrix for a TIM-IF is given by:

$$\mathbf{G}(z) =
\begin{bmatrix}
I_0(z) \\
z^{-1} I_{M-1}(z) \\
z^{-1} I_{M-2}(z) \\
\vdots \\
z^{-1} I_1(z)
\end{bmatrix}
\tag{2.14}$$

where the polyphase components $I_k(z)$ are defined as:

$$I_k(z) = \sum_{n=-\infty}^{\infty} g(nM + k) z^{-n}, \quad 0 \le k \le M - 1 \tag{2.15}$$
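
The equivalence between a conventional IF and its time-interleaved form can be checked with the following Python sketch (NumPy and the function names are assumptions of this example). For simplicity, the sketch places each input sample at phase 0 of its block, so path k uses $I_k(z)$ with no extra delay; the arrangement in (2.14) corresponds to placing the sample at phase M-1 and differs only by a fixed delay of M-1 output samples.

```python
import numpy as np

def interpolate_direct(x, g, M):
    """Conventional IF: insert M-1 zeros after each sample, then filter with g."""
    x_up = np.zeros(len(x) * M)
    x_up[::M] = x
    return np.convolve(x_up, g)[:len(x_up)]

def interpolate_polyphase(x, g, M):
    """Time-interleaved IF: M independent paths, each running at the input rate."""
    g = np.concatenate([g, np.zeros((-len(g)) % M)])
    y = np.zeros(len(x) * M)
    for k in range(M):
        # Path k filters x with I_k, i.e. the taps g(nM + k)
        y[k::M] = np.convolve(x, g[k::M])[:len(x)]
    return y

rng = np.random.default_rng(1)
g_demo = rng.standard_normal(96)      # stand-in for a 96-tap filter G(z)
x_demo = rng.standard_normal(32)
y_direct = interpolate_direct(x_demo, g_demo, 8)
y_poly = interpolate_polyphase(x_demo, g_demo, 8)
```

With 96 taps and M = 8, each path holds 12 taps, so the total number of taps is unchanged while every path runs at the input rate.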

2.5. Summary
In general, this chapter gave a brief overview of ∆Σ-DAC architectures. Particularly, the TIM
∆Σ-DAC is of great interest since it combines the well-known benefits of a ∆Σ modulator
and the potential for wider bandwidth of a parallel structure. The IF can also be time-
interleaved to simplify the overall integrated TIM ∆Σ-DAC design while resulting in no
additional hardware complexity.
Chapter 3
Time-interleaved ∆Σ-DAC Design

This chapter discusses the architectural design of a time-interleaved (TIM) ∆Σ-DAC. The
digital front-end of a TIM ∆Σ-DAC contains a time-interleaved interpolation filter (TIM-IF)
and a time-interleaved ∆Σ modulator (TIM-DSM). The analog back-end of a TIM ∆Σ-DAC
contains a DAC and an analog reconstruction filter. Specific details of these sub-blocks are
discussed in the order in which they appear in the system.

3.1. Architecture Overview


As mentioned earlier, the motivation of this work is to push ∆Σ-DAC to higher speeds and
to accommodate the design challenges of deep sub-micron processes. The core of this design
is based on a time-interleaving architecture to meet the high data rate, wide bandwidth
requirements of 60GHz or UWB applications.

From table 1.1, the design targets an ENOB of around 9 bits, which corresponds to
an accuracy (SNR) and linearity (SFDR) performance of approximately 56 dB. Note that
SNR is the ratio of the fundamental signal power to the in-band noise power, but it does not
account for harmonic distortion. The parameter that accounts for both noise and distortion
is called the SNDR (Signal to Noise plus Distortion Ratio), which is often lower than the
SNR. The design targets in table 1.1 are conservative but the top-level design aims for even


higher performance to give extra margins.
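
As a quick check of the 56 dB figure, and assuming the standard relation between the effective number of bits and the signal-to-noise ratio of an ideal converter with a full-scale sinusoidal input:

$$SNR \approx 6.02 \cdot ENOB + 1.76\ \text{dB} = 6.02 \times 9 + 1.76 \approx 55.9\ \text{dB}$$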


From chapter 2, if the time-interleaving factor M is the same as the OSR then there is no
need for an input interpolator. Hence, both M and the OSR are chosen to be 8, which is
reasonable in terms of hardware complexity and digital circuit speed as will be shown later
in chapter 4. Figure 3.1 shows the block diagram of a time-interleaved-by-8 (TIM8) ∆Σ-
DAC. Here, the digital front-end operates at fN · (OSR/M ) = fN while the analog back-end
operates at fN · (OSR) = fS , which correspond to 500MS/s and 4GS/s, respectively.

Figure 3.1: Time-interleaved-by-8 ∆Σ-DAC block diagram

3.2. Time-interleaved Interpolation Filter


Based on a built-in digital filter function and Filter Visualization Tool in Matlab, a multirate
filter was designed. From the specifications in table 1.1, the interpolation filter, G(z), is
required to have an interpolation factor of 8 and a bandwidth of 250MHz.
In a practical design, an ideal or “brick wall” filter is not implementable since its impulse
response is infinite and non-causal. To create a finite-duration impulse response, this filter
is truncated by applying a window. By retaining the central section of the impulse response,
a linear phase finite impulse response (FIR) filter can be obtained. There are different

types of windowing (e.g.: Kaiser, Blackman-Harris, Hamming, Gaussian, etc) which have
different trade-offs depending on the design. A long polyphase FIR length gives a high cutoff
frequency (i.e.: wide bandwidth) and high attenuation but also has a high implementation
complexity. In this application, Kaiser windowing (with α=0.5 by default) gives the optimal
trade-offs in terms of bandwidth, attenuation and complexity.

To design an FIR interpolation filter (IF), Kaiser windowing requires a polyphase length
(pl) and a stopband attenuation (αs, in dB). The IF cutoff becomes sharper as pl increases,
up to the point where increasing pl further yields only a small improvement. Note that the
IF becomes ideal only when pl → ∞. In addition, a large pl results in an impractical
implementation due to the large number of coefficients (i.e.: filter polyphase terms or
filter taps). Through simulations, pl is chosen to be 96, which corresponds to a 95th-order
FIR filter. On the other hand, increasing αs gives higher out-of-band attenuation, hence
reducing the analog filter’s attenuation requirement. However, this significantly reduces
the IF roll-off rate. Since large out-of-band truncation noise is added by the DSM after the
IF, having a large αs does not give a significant benefit. Thus, αs = 40dB is found to be
sufficient for this design.
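
As a rough illustration of the windowed design described above, the following Python sketch builds a 96-tap Kaiser-windowed sinc filter for interpolation by 8. It is not the Matlab flow used in this work: the cutoff at π/8, the passband gain of 8, and the use of Kaiser's empirical formula to derive the window parameter from αs are all assumptions made for the example, and the function names are hypothetical.

```python
import math

def bessel_i0(x, terms=25):
    """Zeroth-order modified Bessel function of the first kind (power series)."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / (2.0 * k)) ** 2
        total += term
    return total

def kaiser_beta(atten_db):
    """Kaiser's empirical formula mapping stopband attenuation (dB) to beta."""
    if atten_db > 50:
        return 0.1102 * (atten_db - 8.7)
    if atten_db >= 21:
        return 0.5842 * (atten_db - 21) ** 0.4 + 0.07886 * (atten_db - 21)
    return 0.0

def kaiser_interp_filter(num_taps=96, factor=8, atten_db=40.0):
    """Windowed-sinc lowpass (assumed cutoff pi/factor), scaled to DC gain = factor."""
    beta = kaiser_beta(atten_db)
    centre = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        t = n - centre
        arg = math.pi * t / factor
        ideal = 1.0 if arg == 0 else math.sin(arg) / arg      # truncated ideal response
        ratio = t / centre
        window = bessel_i0(beta * math.sqrt(max(0.0, 1.0 - ratio * ratio))) / bessel_i0(beta)
        taps.append(ideal * window)
    scale = factor / sum(taps)
    return [c * scale for c in taps]

taps = kaiser_interp_filter()
print(len(taps), round(kaiser_beta(40.0), 3))
```

The taps are symmetric about the centre, which is what gives the linear-phase property mentioned earlier. The resulting response should not be expected to match figure 3.2 exactly, since the actual Matlab settings may differ.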

Figure 3.2 shows different responses of the IF with and without quantization of the 96 co-
efficients. In simulations, the IF coefficients are first obtained from the “windowed” impulse
response with full-precision. In an actual implementation, these coefficients are rounded-
off due to the fixed-length multipliers. For a large digital system, multipliers occupy large
area and slow down the operating speed. However, if the coefficients are quantized using
canonic sign digit (CSD) representation where they are represented as sums or differences
of power-of-2, only adders and subtractors are required. This eliminates the need for digital
multipliers and ultimately results in an effective and robust digital implementation as long
as the discrepancies are acceptable. In this work, the quantization algorithm allocates one
CSD term to coefficients with magnitude below 0.1 and two CSD terms to coefficients with
magnitude above 0.1. In-depth details on the multiplierless IF hardware implementation will
be discussed in chapter 4.
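
The allocation rule can be sketched in Python as follows. The exact quantization algorithm used in this work is not spelled out here, so the greedy search below is only an assumed stand-in: it picks the nearest signed power of two, subtracts it, and repeats. Unlike a true CSD encoder it does not enforce non-adjacent digits, and the names `signed_pow2_terms`, `quantize_coefficient` and the exponent floor `min_exp` are hypothetical.

```python
import math

def signed_pow2_terms(value, max_terms, min_exp=-12):
    """Greedily approximate value with at most max_terms terms of the form +/-2**e."""
    terms = []
    residual = value
    for _ in range(max_terms):
        if abs(residual) < 2.0 ** (min_exp - 1):
            break                                   # nothing left worth representing
        sign = 1 if residual > 0 else -1
        mag = abs(residual)
        low = math.floor(math.log2(mag))
        # Pick whichever neighbouring power of two is closer on a linear scale.
        exp = low if mag - 2.0 ** low <= 2.0 ** (low + 1) - mag else low + 1
        exp = max(exp, min_exp)
        terms.append((sign, exp))
        residual -= sign * 2.0 ** exp
    return terms

def quantize_coefficient(value, threshold=0.1):
    """One term for magnitudes below the threshold, two terms otherwise."""
    n_terms = 1 if abs(value) < threshold else 2
    terms = signed_pow2_terms(value, n_terms)
    return sum(s * 2.0 ** e for s, e in terms), terms

print(quantize_coefficient(0.875))   # 0.875 is exactly 2**0 - 2**-3
print(quantize_coefficient(0.05))    # a small coefficient gets a single term
```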

In these responses, there is little discrepancy between the ideal and quantized IF in the
passband. Outside the passband, especially after 2 · fB , the attenuation of the quantized
(a) Frequency Response (b) Passband Ripple

Figure 3.2: A 95th -order FIR interpolation filter with and without coefficient quantization

IF degrades significantly. However, this is acceptable since within the critical band (fB to
2 · fB ), the attenuation is still around 40dB as intended. After this band, the truncation
noise becomes dominant, so a reduction in stopband attenuation does little harm.
Nevertheless, the quantized IF attenuates all images by at least 28dB over the
entire stopband, which still helps relax the analog filter’s attenuation requirement.
From figure 3.2(b), the -3dB bandwidth is around 235MHz. The passband ripple is
approximately 0.2dB, which is quite acceptable. Table 3.1 summarizes the IF design method
as well as its performance.

Table 3.1: Interpolation Filter Characteristics

                 Parameter                      Description

Design           Window                         Kaiser
                 Polyphase length (pl)          96
                 Filter order (l)               95
                 Stopband attenuation (αs)      40 dB
Performance      -3dB Bandwidth (BW−3dB)        235 MHz
(Quantized IF)   Passband ripple                0.2 dB
                 Stopband attenuation           ≥ 28 dB
Thus, the IF is a 95th order FIR filter, G(z), of the following form:
$$G(z) = \sum_{k=0}^{l=95} z^{-k} g(k) = g(0) + g(1)z^{-1} + g(2)z^{-2} + \cdots + g(l)z^{-l} \tag{3.1}$$

where g(k) are the filter coefficients.


From section 2.3, the 8-component polyphase decomposition of G(z) is expressed as:
$$G(z) = \sum_{k=0}^{7} z^{-k} I_k(z^8) = I_0(z^8) + z^{-1} I_1(z^8) + \cdots + z^{-7} I_7(z^8) \tag{3.2}$$

where the polyphase components Ik (z) are defined as:


$$I_k(z) = \sum_{i=0}^{\frac{l+1}{8}-1} g(8i + k)\, z^{-(8i)}, \qquad 0 \le k \le 7 \tag{3.3}$$

That is:

$$\begin{aligned}
I_0(z) &= g(0) + g(8)z^{-8} + g(16)z^{-16} + \cdots + g(88)z^{-88} \\
I_1(z) &= g(1) + g(9)z^{-8} + g(17)z^{-16} + \cdots + g(89)z^{-88} \\
&\;\;\vdots \\
I_7(z) &= g(7) + g(15)z^{-8} + g(23)z^{-16} + \cdots + g(95)z^{-88}
\end{aligned}$$
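
The decomposition can be checked numerically. The Python sketch below is an illustration only: it uses the standard interpolator identity (zero-stuffing by 8 followed by G(z) equals eight low-rate subfilters whose outputs are interleaved in natural order), which is not the reverse-ordered TIM structure derived next. The helper names and the integer test coefficients are hypothetical.

```python
def polyphase_split(g, m=8):
    """Return the m subfilters [g(k), g(k+m), g(k+2m), ...] for k = 0..m-1."""
    return [g[k::m] for k in range(m)]

def fir(x, h):
    """Direct-form FIR: y[n] = sum_i h[i] * x[n - i], with zero initial state."""
    return [sum(h[i] * x[n - i] for i in range(len(h)) if n - i >= 0)
            for n in range(len(x))]

def interpolate_direct(x, g, m=8):
    """Zero-stuff by m, then filter with the full-length filter at the high rate."""
    stuffed = []
    for sample in x:
        stuffed.append(sample)
        stuffed.extend([0] * (m - 1))
    return fir(stuffed, g)

def interpolate_polyphase(x, g, m=8):
    """Filter at the low rate with each subfilter, then interleave the outputs."""
    branches = [fir(x, sub) for sub in polyphase_split(g, m)]
    out = []
    for n in range(len(x)):
        out.extend(branches[k][n] for k in range(m))
    return out

g = list(range(1, 97))                      # 96 placeholder coefficients, not the real IF
x = [3, -1, 4, 1, -5, 9, 2, -6, 5, 3, 5, -8, 9, 7]
print(interpolate_direct(x, g) == interpolate_polyphase(x, g))
```

Each subfilter holds 12 of the 96 coefficients and runs at the input rate, which is the property the time-interleaved architecture exploits.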

Based on $I_k(z)$ and equation 2.14 from section 2.4, G(z) of a TIM-IF is given by:

$$G(z) = \begin{bmatrix} I_0(z) \\ z^{-1} I_7(z) \\ z^{-1} I_6(z) \\ \vdots \\ z^{-1} I_1(z) \end{bmatrix} \tag{3.4}$$

Figure 3.3 shows the realization of a TIM-IF based on the conventional IF for this work.
Notice that aside from the first path (U1 ), all subsequent paths (U2 − U8 ) are in reverse order
of the polyphase components (I7 (z) − I1 (z)).
(a) Conventional IF

(b) Time-interleaved-by-8 IF

Figure 3.3: A 95th -order FIR interpolation filter block diagram

3.3. Time-interleaved ∆Σ Modulator

3.3.1. DSM Architecture

As mentioned in section 2.2.1, the main parameters that control the SNR performance are:
the OSR, modulator order (m) and number of truncator bits (k ). Based on these parameters,
the maximum SNR can be estimated according to the following calculations.

Let the input signal be a sinusoidal wave. Its full-swing amplitude, A, is defined as
$2^k(\Delta/2)$ where ∆ is the unit quantization step size (or 1 LSB - least significant bit) and k is
the number of truncator bits. Hence, the signal power, $P_s$, is given by ([37], Ch.14):

$$P_s = \frac{A^2}{2} = \left(\frac{2^k \Delta}{2\sqrt{2}}\right)^2 = \frac{2^{2k}\Delta^2}{8} \tag{3.5}$$

The noise power, $P_e$, for an mth-order DSM is given by:

$$P_e = \left(\frac{\Delta^2}{12}\right)\left(\frac{\pi^{2m}}{2m+1}\right)\left(\frac{1}{OSR^{2m+1}}\right) \tag{3.6}$$

Thus, the maximum SNR (in dB) is given by:

$$SNR_{max} = 10\log\left(\frac{P_s}{P_e}\right) = 10\log\left(\frac{3(2m+1)2^{2k-1}}{\pi^{2m}}(OSR)^{2m+1}\right) \tag{3.7}$$

According to table 1.1, the OSR is 8. This leaves only m and k to be determined. From
the stability analysis of an EFB-DSM in section 2.2, the modulator order should be at least
1 less than the number of truncator bits (i.e.: m ≤ k − 1). Based on equation 3.7, choosing
m=3 and k=4 results in an SNR of 68dB which allows sufficient design margin beyond the
target of 56dB.
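
The 68dB figure can be reproduced directly from equation 3.7. The short Python check below is only a convenience sketch (the function name is hypothetical); it evaluates the closed-form expression and prints it for a few (m, k) pairs at OSR = 8.

```python
import math

def snr_max_db(osr, m, k):
    """Peak SNR in dB from equation 3.7 for an m-th order DSM with a k-bit truncator."""
    ratio = 3 * (2 * m + 1) * 2 ** (2 * k - 1) * osr ** (2 * m + 1) / math.pi ** (2 * m)
    return 10 * math.log10(ratio)

for m, k in [(2, 3), (2, 4), (3, 4)]:
    print(m, k, round(snr_max_db(8, m, k), 1))
```

For m = 3 and k = 4 the expression evaluates to roughly 67.7dB, consistent with the rounded value quoted above.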

3.3.2. NTF Optimization


For a 3rd-order DSM, the conventional noise transfer function (NTF) is:

$$NTF_{conv}(z) = (1 - z^{-1})^3 \tag{3.8}$$

in which, all zeros and poles are located at z=1 and z=0, respectively.
According to ([1], Ch. 4), significant improvement in SNR can be obtained by optimizing
the NTF zero locations. By spreading the zeros along the z-domain unit circle, the total
inband noise power can be reduced. The optimal NTF zeros can be found by equating the
partial derivatives of the noise power to zero. The mathematical derivations are not discussed
here and the optimization is done using a built-in function in Richard Schreier’s Delta-Sigma
Toolbox [38].
Although moving the poles closer to the zeros reduces the out-of-band gain (OBG) and
thereby improves stability, this was not done here. As discussed in section 2.2, using the
stability criterion in [33], the stability of an EFB system can be maintained while tolerating
a much higher OBG than an OFB system. Thus, there was no need for NTF pole optimization.

Figure 3.4(a) shows the pole-zero plot of the optimized NTF for a 3rd-order DSM. Optimizing
the NTF zeros results in a notch at DC and another one at $\sqrt{3/5}\cdot f_B$ or
$\frac{\sqrt{3/5}\cdot f_S}{2\cdot OSR}$ ([1], Ch. 4). This improves the SNR by 8dB compared
to the case where all zeros are at DC.
Similar to the TIM-IF, the NTF coefficients must be quantized using CSD representation
for digital realization. Since there are only 3 taps for a 3rd -order DSM, large discrepancies
between quantized and non-quantized coefficients degrade the SNR performance. Hence, the
quantized NTF coefficients should be close to the optimized NTF value by utilizing more
CSD terms. Here, they are represented by 3 CSD terms as given below.

Thus, the optimized NTF becomes:

$$NTF_{opt}(z) = (1 - z^{-1})(1 - 1.908z^{-1} + z^{-2}) \tag{3.9}$$

and the quantized optimized NTF is:

$$NTF_{quan}(z) = (1 - z^{-1})(1 - 1.875z^{-1} + z^{-2}) \tag{3.10}$$

For this NTF(z), the feedback loop filter, H(z), is:

$$H(z) = 1 - NTF_{quan}(z) = az^{-1} - az^{-2} + z^{-3} \tag{3.11}$$

where $a = 2.875 = 2^1 + 2^0 - 2^{-3}$.
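
The coefficients in equations 3.9 to 3.11 can be verified with a few lines of Python. This is a sketch under the assumption stated above that the complex zero pair sits at $\sqrt{3/5}\cdot f_B$ with $f_B = f_S/(2\cdot OSR)$; the helper names are hypothetical and the Delta-Sigma Toolbox is not used.

```python
import math

def poly_mul(p, q):
    """Multiply two polynomials in z^-1 given as coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def optimized_ntf(osr):
    """3rd-order NTF: one zero at DC, a conjugate pair at sqrt(3/5) of the band edge."""
    omega = math.sqrt(3.0 / 5.0) * math.pi / osr     # notch angle in rad/sample
    return poly_mul([1.0, -1.0], [1.0, -2.0 * math.cos(omega), 1.0])

def loop_filter(ntf):
    """H(z) = 1 - NTF(z); returns the coefficients of z^-1, z^-2, ..."""
    return [-c for c in ntf[1:]]

print(round(2.0 * math.cos(math.sqrt(0.6) * math.pi / 8), 3))      # middle coefficient
print(loop_filter(poly_mul([1.0, -1.0], [1.0, -1.875, 1.0])))      # quantized H(z)
```

With OSR = 8 the middle coefficient of the second-order section evaluates to about 1.908, and expanding the quantized NTF gives the loop-filter taps 2.875, −2.875 and 1.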

Figure 3.4(b) overlays the response of all NTF versions. The quantization results in a
slight degradation of inband noise shaping and a shift in notch location closer to the band
edge. However, these have a small impact on the SNR performance which will be quantified
in a later section.

Figure 3.5 shows the Matlab model of a 3rd -order ∆Σ-DAC with conventional IF and
conventional DSM. Here, the 10-bit quantizer at the input generates a 10-bit digital stream
while the 4-bit quantizer near the DAC represents the 4-bit truncator with digital limiter.
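
A minimal behavioural sketch of the error-feedback loop, written in Python rather than Matlab, is given below. It is an assumption-laden simplification: the truncator is modelled as a floor to integer LSBs, the digital limiter and the interpolation filter are omitted, and the input is expressed directly in LSB units. The function name `efb_dsm` is hypothetical.

```python
import math

def efb_dsm(u, a=2.875):
    """Third-order error-feedback modulator with H(z) = a z^-1 - a z^-2 + z^-3."""
    e1 = e2 = e3 = 0.0                    # truncation errors delayed by 1, 2, 3 samples
    v, errors = [], []
    for sample in u:
        y = sample + a * e1 - a * e2 + e3  # input plus filtered past errors
        q = math.floor(y)                  # truncator: keep the integer part only
        e = y - q                          # discarded fraction, 0 <= e < 1
        v.append(q)
        errors.append(e)
        e1, e2, e3 = e, e1, e2
    return v, errors

u = [7.5 + 4.0 * math.sin(2 * math.pi * n / 64) for n in range(256)]
v, e = efb_dsm(u)
print(min(v), max(v))
```

With this sign convention the output satisfies v = u − NTF·e sample by sample, so the truncation error is shaped by the quantized NTF of equation 3.10.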
(a) Optimized NTF Pole-Zero Plot (b) NTF Frequency Response

Figure 3.4: ∆Σ modulator noise transfer function optimization

Figure 3.5: Conventional 3rd -order error-feedback ∆Σ modulator architecture

3.3.3. Time-interleaved DSM

Using the steps from section 2.3, a conventional DSM can be transformed into a TIM-DSM.
Similar to the TIM-IF, the 8-component polyphase decomposition of H(z) from 3.11 is:

$$H(z) = \sum_{k=0}^{7} z^{-k} E_k(z^8) = E_0(z^8) + z^{-1} E_1(z^8) + \cdots + z^{-7} E_7(z^8) \tag{3.12}$$

where the polyphase components $E_k(z)$ are defined as:

$$\begin{aligned}
E_0(z) &= 0 & E_4(z) &= 0 \\
E_1(z) &= a & E_5(z) &= 0 \\
E_2(z) &= -a & E_6(z) &= 0 \\
E_3(z) &= 1 & E_7(z) &= 0
\end{aligned} \tag{3.13}$$

Next, substitute the above polyphase components $E_k(z)$ into equation (2.12) to get:

$$H(z) = \begin{bmatrix}
E_0(z) & E_1(z) & E_2(z) & \cdots & E_7(z) \\
z^{-1}E_7(z) & E_0(z) & E_1(z) & \cdots & E_6(z) \\
z^{-1}E_6(z) & z^{-1}E_7(z) & E_0(z) & \cdots & E_5(z) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
z^{-1}E_1(z) & z^{-1}E_2(z) & z^{-1}E_3(z) & \cdots & E_0(z)
\end{bmatrix}
= \begin{bmatrix}
0 & a & -a & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & a & -a & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & a & -a & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & a & -a & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & a & -a & 1 \\
z^{-1} & 0 & 0 & 0 & 0 & 0 & a & -a \\
-az^{-1} & z^{-1} & 0 & 0 & 0 & 0 & 0 & a \\
az^{-1} & -az^{-1} & z^{-1} & 0 & 0 & 0 & 0 & 0
\end{bmatrix} \tag{3.14}$$

Lastly, using the relation $H_{ij}$, which corresponds to the contribution of the jth input to
the ith output, the architecture of a TIM-DSM for this work can be realized as depicted in
figure 3.6.
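
The structure of equation 3.14 follows a simple index rule, which the Python sketch below makes explicit. It assumes, as is the case here, that every polyphase component $E_k$ is a constant, so each matrix entry can be stored as a (coefficient, number of low-rate delays) pair. The function name and the tuple encoding are hypothetical.

```python
def tim_loop_matrix(e_coeffs):
    """Build the M x M block matrix; entry (i, j) is (coefficient, low-rate delays)."""
    m = len(e_coeffs)
    matrix = []
    for i in range(m):
        row = []
        for j in range(m):
            if j >= i:
                coeff, delay = e_coeffs[j - i], 0        # E_{j-i}(z) on/above the diagonal
            else:
                coeff, delay = e_coeffs[m + j - i], 1    # z^-1 E_{M+j-i}(z) below it
            row.append((coeff, delay) if coeff != 0 else (0, 0))
        matrix.append(row)
    return matrix

a = 2.875
for row in tim_loop_matrix([0, a, -a, 1, 0, 0, 0, 0]):
    print(row)
```

Only the three lower-left corner rows carry a delay, which is why only the last three error terms of one block feed the first three paths of the next block in figure 3.6.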
Figure 3.6: Time-interleaved-by-8 3rd -order error feedback ∆Σ modulator


3.3.4. TIM-DSM Performance


This section presents the architectural simulation results for the digital front-end which
contains both TIM-IF and TIM-DSM. While the TIM-IF corresponds to a 95th -order FIR
interpolation filter, the TIM-DSM corresponds to a 3rd -order, 4-bit ∆Σ modulator. The
results are obtained at the output of the 8-to-1 multiplexer (i.e.: V(z) in figure 3.1) and
assumed to be filtered by a “brick wall” lowpass filter. The coefficients of both TIM-IF and
TIM-DSM are quantized using CSD representation as discussed earlier. For simplicity, the
term “TIM-IF-DSM” refers to the integration of the TIM-IF and TIM-DSM, excluding the
DAC. Recall that since the OSR equals 8, this corresponds to $f_B = \frac{1}{16} f_S = 250$MHz.
Figure 3.7 shows the response of a time-interleaved DSM (in figure 3.6) versus a con-
ventional DSM (in figure 3.5). Figure 3.7(a) shows the SNR versus input amplitude for a
tone at 0.25fB . Figure 3.7(b) shows the SNR versus input frequency (normalized to fB ) at
0dBFS amplitude. These figures show identical responses, implying that the time-interleaved
system is indeed equivalent to the conventional one.

(a) SNR vs. Input amplitude (b) SNR vs. Input frequency

Figure 3.7: TIM-IF-DSM versus conventional DSM response (Matlab simulations)

Figure 3.8 shows the TIM-IF-DSM output SNR and SNDR versus input amplitude
for a single tone at 0.25fB for non-optimized (N T F ), optimized (N T Fopt ) and quantized
(N T Fquan ) optimized NTF. In these simulations, the input is quantized to 10 bits, but in-
ternal computations are performed with full precision even when the TIM-IF and TIM-DSM
coefficients are quantized. As discussed earlier, this figure shows an SNR improvement of
8dB between N T F and N T Fopt . Compared to the N T Fopt , the N T Fquan shows a 2dB
degradation in SNR but less than 1dB degradation in SNDR. This implies that the N T Fquan
is an acceptable design.

(a) SNR vs. Input amplitude (b) SNDR vs. Input amplitude

Figure 3.8: TIM-IF-DSM performance for a single tone at 0.25fB for non-optimized, opti-
mized and quantized optimized NTF (Matlab simulations)

Figure 3.9 shows the TIM-IF-DSM output spectrum for different input frequencies, rang-
ing from 0.13fB to 0.93fB . For input frequencies below 0.33fB , the odd harmonics, caused
by truncation error, show up as inband tones while above this frequency, the odd harmonics
are out of band. Although the harmonics are less of a concern for high-frequency inputs, the
output amplitude is attenuated due to the band limitation of both practical digital IF and
analog filter. In general, while the low-frequency degradation is dominated by the inband
harmonics, the high-frequency degradation is dominated by the IF filter bandwidth.
Figure 3.10(a) shows the TIM-IF-DSM performance versus input amplitude for an input
tone at 0.25fB . The SNDR degrades by approximately 4dB with respect to that of SNR
strictly due to inband harmonics. On the other hand, figure 3.10(b) shows the TIM-IF-DSM
(a) (b)

(c) (d)

Figure 3.9: TIM-IF-DSM output spectrum for Matlab simulations with 0dBFS input ampli-
tude at different frequencies a) 0.13fB b) 0.25fB c) 0.50fB d) 0.93fB
(a) SNR and SNDR vs. Input amplitude (b) SNR and SNDR vs. Input frequency

Figure 3.10: TIM-IF-DSM response (Matlab simulations)

performance versus input frequency for an input amplitude of 0dBFS. It shows that the
SNDR degradation compared to SNR is only prominent for input frequencies below 0.33fB
(where the 3rd harmonic falls inband). For higher frequencies, the SNDR is identical to
SNR which remains around 60dB up to 0.8fB ; after which, it starts to degrade due to the
dominance of the TIM-IF’s frequency response (in figure 3.2(b)). Also, figure 3.10(b) shows a
performance of at least 8.8 bits for the entire input frequency band, which is quite acceptable
for the targeted applications of this work.

3.4. Digital-to-Analog Converter Model


The multiplexed output of the TIM-IF-DSM is fed into a DAC for digital-to-analog conver-
sion. For a high-speed application, a current-steering DAC (CS-DAC) is a popular choice
where each unit cell switches a current to either output or ground. The switches are controlled
by thermometer codes generated by passing the TIM-IF-DSM output through a binary-to-
thermometer (B2T) converter. A thermometer-based CS-DAC has many advantages over
its binary counterpart, such as low non-linearity errors, guaranteed monotonicity, and low
glitching noise ([37], Ch.12).
A thermometer-based CS-DAC consists of $2^k - 1$ unit cells. Non-linearities, which arise
due to mismatches between unit cells, generate inband harmonics and increase the noise
floor due to the folding of high-frequency truncation noise into the signal band ([1], Ch.6).
Consequently, SNR, SNDR and ENOB are all degraded. Figure B.4 in appendix B shows
these degradations for different DAC element mismatches (i.e.: 1% − 4%).
An effective strategy to eliminate spurious harmonics and lower the noise floor is to use
mismatch error shaping. There are many techniques to achieve this but the most common
ones are: DWA (data-weighted averaging), ILA (individual level averaging), vector-based
mismatch shaping, butterfly shuffler, and tree-structure element selection. Theoretical anal-
yses and system architectures for some of these techniques are well presented in [1, 39].
Among these multi-bit DAC linearization techniques, DWA and current calibration are
the most common ones. There are several different DWA schemes, all of which basically rely
on a high OSR to rotate the DAC unit elements so that their average long-term usage is the
same. However, the effectiveness of DWA degrades dramatically at low OSRs. Furthermore,
while most DWA schemes are conceptually simple, their implementations are quite complex,
especially for a high number (> 8) of DAC levels [24]. Also, in this work, the large DWA
circuitry would have to operate at 4GHz, which may not be feasible and would consume a
large amount of power.
On the other hand, current calibration linearizes the multi-bit DAC by dynamically
matching its unit current elements. This technique has been used to realize very well-matched
current sources (e.g.: up to 16-bit accuracy in [40]). In addition, the calibration circuitry
is more straightforward and suitable for high-speed operation. Thus, current calibration is
chosen to alleviate the degradations caused by multi-bit DAC element mismatches.
Unlike most linearization techniques which can be modeled accurately in Matlab, it is
quite challenging to model and quantify a current calibration system. Hence, for simplicity,
the DAC model at this architectural level is assumed to be ideal and mismatch-free. The
circuit-level performance of this multi-bit CS-DAC with current calibration will be discussed
in chapter 4.
3.5. Analog Reconstruction Filter


The last block of a TIM ∆Σ-DAC is an analog low-pass filter (LPF). The purpose of this
filter is to suppress the out-of-band truncation noise, leaving only the signal spectrum within
the band of interest. The analog filter’s implementation is out of the scope of this work and
will not be integrated in the final fabrication. However, one possible design is discussed in
Appendix B as an example.

3.6. Summary
In summary, this chapter presents the design details of each sub-block in a TIM ∆Σ-DAC.
Figure 3.11 shows a complete Matlab model of this DAC. Note that the parallel paths must
be multiplexed in reverse order to generate a correct output. Architectural-level simulation
results are presented together with design trade-offs and decisions. All sub-blocks except
the analog filter (i.e.: the TIM-IF, TIM-DSM, MUX, and CS-DAC) will be integrated in the
STMicroelectronics 90nm CMOS process.
Figure 3.11: Time-interleaved-by-8 ∆Σ-DAC architecture


Chapter 4
Time-interleaved ∆Σ-DAC Implementation

This chapter discusses the physical implementation of a TIM ∆Σ-DAC fabricated using
STMicroelectronics 90nm CMOS process. It consists of 3 parts as depicted in figure 4.1(b): a
digital baseband front-end, a high-speed digital interface, and a high-speed analog back-end.
Unlike the conventional ∆Σ-DAC in figure 4.1(a) which operates entirely at fS = OSR · fN ,
only the interface and analog section of the TIM ∆Σ-DAC operate at this speed, while the
main digital portion operates at fN .

Figure 4.1: a) Conventional ∆Σ-DAC b) Time-interleaved-by-8 ∆Σ-DAC


4.1. Digital Baseband Front-End


The digital baseband front-end is the largest block of the TIM ∆Σ-DAC. It consists of two
sub-blocks, a TIM-by-8 IF and a TIM-by-8 DSM, that operate at the same speed as the
input sampling frequency (i.e.: 500MS/s). Before discussing the implementation of these
sub-blocks, three different optimization techniques are presented for hardware, accuracy,
and speed. They reduce the hardware complexity, finite word-length inaccuracies, and prop-
agation delay of this design, respectively.

4.1.1. Hardware Optimization

Multiplierless Implementation

In a practical implementation of a fixed-point digital filter, its coefficients must first be
quantized using a power-of-2 representation (e.g.: 2’s complement) with a fixed word length.
The filter requires one multiplier for each of its coefficients. For a high order filter (i.e.:
95th -order TIM-IF), this results in a huge amount of hardware and power consumption, and
a long critical path which limits the operating speed. A general purpose multiplier assumes
all bits in both the multiplier and multiplicand may change during operation. However, since
the coefficients’ binary representations are known prior to implementation, multiplication by
a constant is equivalent to a combination of binary shifts and additions of only the active
bits.
For example, A × B where $B = 0.1110_2$ can be implemented using only 3 shifters and
2 adders, instead of a 4-bit multiplier, as: (A >> 1) + (A >> 2) + (A >> 3) (where >>
denotes a right shift).
Implementing each filter coefficient using this technique reduces the amount of hardware
significantly, thus resulting in lower power consumption and higher operating speed.

Canonic Sign Digit Representation

The multiplierless technique makes hardware complexity proportional to the number of non-
zero bits (i.e.: logic 1’s) in the filter coefficients. For a further optimization, a canonic sign
digit (CSD) representation can be used where the constant coefficients are represented using
the fewest possible number of non-zero bits. It is a signed power-of-2 representation, in which
each bit is in the set $\{0, 1, \bar{1}\}$ (where $\bar{1} = -1$) [41]. Here, the coefficients are
represented as sums or differences of the fewest possible power-of-2 terms.
For the above example, by converting $B = 0.1110_2$ to $B = 1.00\bar{1}0_2$, A × B can be
implemented using only 1 shifter and 1 adder as: A − (A >> 3).
Compared to a binary representation, CSD results in further hardware reduction due to
a fewer number of shifters and adders required.
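
The two constant-multiplication forms for B = 0.875 can be compared with a small Python sketch (the function names are hypothetical). Integer right shifts discard the shifted-out bits, as a hardware shifter would, so the two forms agree exactly only when A is a multiple of 8; otherwise they can differ by one LSB.

```python
def times_0p875_binary(a):
    """A x 0.1110b using three right shifts and two adders."""
    return (a >> 1) + (a >> 2) + (a >> 3)

def times_0p875_csd(a):
    """A x (1 - 2**-3) using one right shift and one subtractor."""
    return a - (a >> 3)

print(times_0p875_binary(64), times_0p875_csd(64))    # both give 56
print(times_0p875_binary(100), times_0p875_csd(100))  # 87 and 88; the exact value is 87.5
```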

4.1.2. Accuracy Optimization


The accuracy of a digital filter is limited by the finite word length arithmetic operations.
Three sources of error due to the finite word length are [42]:

1. the quantization of the input signal,

2. the quantization of the filter coefficients, and

3. the accumulation of roundoff errors during arithmetic operations.

Since the input to the TIM ∆Σ-DAC already has a fixed word-length (10 bits), the input
quantization error is not applicable in this work. The remaining two sources of errors will
be considered in this section.

Optimized CSD Representation of the Filter Coefficients

In this work, one of the major challenges is the physical implementation of the 95th -order IF.
Due to its high order, coefficient quantization can have a significant effect on its stopband
attenuation. Fortunately, since the IF in this work is integrated with a “noise-shaping”
DSM, some degradation in the IF’s stopband attenuation can be tolerated. The out-of-band
noise will be dominated by a large amount of shaped truncation noise introduced after the
TIM-DSM. Hence, the IF implementation is acceptable as long as it preserves the passband
response while providing a reasonable amount of attenuation in the stopband, as shown
previously in figure 3.2.
To reduce the coefficients’ quantization errors while maintaining a practical implementation,
a rule of thumb proposed in [43] is used as the basis to optimize their CSD representations:

• One nonzero digit in the CSD code is typically required for each 20dB of stopband
attenuation in the filter specifications.

Recall from the IF design in chapter 3, the filter’s stopband attenuation was 40dB. Thus,
two nonzero CSD digits are generally used to represent each filter coefficient.

Roundoff Errors Reduction Scheme

Roundoff errors are inevitable in fixed-length digital operations. There has been much re-
search to reduce these deterministic errors. In [44], an adaptive carry generation circuitry,
based on an exhaustive simulation or statistical analysis, is used to approximate the roundoff
errors being compensated. Inspired by this idea of carry compensation, the rounding scheme
in this work uses both an exact and an approximate carry as shown in figure 4.2.

Figure 4.2: Error reduction rounding scheme

Specifically, to obtain a y-bit output from the sum of x-bit inputs (where x > y), all
computations are done using (y + 1) bits, where the extra bit represents the exact carry. To
account for the truncated (x − y − 1) bits, the MSB of this portion is added to the (y + 1)-bit
sum; this MSB represents the approximate carry. Finally, the (y + 1)-bit sum is truncated to
a y-bit output at the last stage. In the example below, three 8-bit numbers are to be added
then truncated to form a 4-bit sum.
The correct result is approximately 10.8 in a decimal representation, where the 4-bit
truncation is included by multiplying by $2^{-4}$. The first truncation method, which does not
include any rounding, results in the largest error (∆=1.8). The second truncation method,
which includes a 1-bit approximate carry, results in a moderate error (∆=0.8). Lastly, the
proposed truncation method, which includes a 1-bit exact and 1-bit approximate carry,
results in the smallest error (∆=0.2).
Note that with more exact-carry bits, even higher accuracy can be achieved. However,
this would degrade the speed and increase the area for a small improvement in accuracy. For
this design, a 1-bit exact carry and 1-bit approximate carry give a good trade-off between
these design considerations.
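
One possible reading of the three truncation methods is sketched below in Python. This is an interpretation, not the circuit of figure 4.2: it assumes the approximate carry is taken per operand as the MSB of that operand's discarded bits, and the operands (55, 55, 63) are hypothetical values chosen for illustration, not the numbers of the original example.

```python
def sum_truncate(values, drop):
    """Drop the low bits of every operand, then add (no rounding)."""
    return sum(v >> drop for v in values)

def sum_approx_carry(values, drop):
    """Truncate each operand and add the MSB of its discarded bits as a carry."""
    return sum((v >> drop) + ((v >> (drop - 1)) & 1) for v in values)

def sum_exact_and_approx_carry(values, drop):
    """Keep one extra bit (exact carry), add the MSB of the remaining discarded
    bits (approximate carry), and drop the extra bit only at the very end."""
    keep = drop - 1
    wide = sum((v >> keep) + ((v >> (keep - 1)) & 1) for v in values)
    return wide >> 1

operands = [55, 55, 63]                  # hypothetical 8-bit inputs
exact = sum(operands) / 2 ** 4           # 10.8125 when scaled by 2^-4
for method in (sum_truncate, sum_approx_carry, sum_exact_and_approx_carry):
    result = method(operands, 4)
    print(method.__name__, result, abs(exact - result))
```

For these operands the three methods return 9, 10 and 11 against an exact value of 10.8125, so the error shrinks from about 1.8 to 0.8 to 0.2, the same ordering as described above.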

4.1.3. Speed Optimization

Parallel Adder Architecture

As mentioned in chapter 2, pipelined or parallel adders are required in this design to minimize
the critical path delay. Many different adder architectures are possible; the best choice
depends on the specific design. For example, a ripple carry adder (RCA) has the smallest
area and lowest power but also slowest speed. On the other hand, a carry look-ahead adder
(CLA) has the fastest speed but its power consumption is relatively high. A carry select
adder (CSA) is a compromise between the high-speed operation of the CLAs and the low-
power consumption of the RCAs ([45], Ch. 7). Thus, the CSA architecture is used in this
work.
Figure 4.3 shows the architecture of a CSA, which consists of two full adders (FAs) for
each bit’s addition: one FA assumes the carry-in (Cin ) is ’1’ while the other assumes the Cin
is ‘0’. The FAs are grouped into “stages”, each of which is an RCA. At each stage, the Cin is
obtained from the previous stage, except for the first stage where Cin is an input. This Cin
selects one of the two sums, and one of the two carries, through simple 2-to-1 multiplexers.
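
A behavioural Python model of this selection mechanism is sketched below. It is an illustration only: the function names are hypothetical, the staging is assumed to be listed from the LSB stage upward, and each stage's ripple-carry adder is modelled with integer arithmetic rather than individual full adders.

```python
def ripple_add(a, b, cin, width):
    """Ripple-carry addition of two width-bit values; returns (sum, carry_out)."""
    total = a + b + cin
    return total & ((1 << width) - 1), total >> width

def carry_select_add(a, b, staging, cin=0):
    """Add a and b with a carry-select adder; staging lists bits per stage, LSB first."""
    result, shift, carry = 0, 0, cin
    for width in staging:
        mask = (1 << width) - 1
        a_part, b_part = (a >> shift) & mask, (b >> shift) & mask
        sum0, carry0 = ripple_add(a_part, b_part, 0, width)   # branch assuming Cin = 0
        sum1, carry1 = ripple_add(a_part, b_part, 1, width)   # branch assuming Cin = 1
        # The 2-to-1 multiplexers pick one precomputed pair once the real carry is known.
        stage_sum, carry = (sum1, carry1) if carry else (sum0, carry0)
        result |= stage_sum << shift
        shift += width
    return result, carry

print(carry_select_add(200, 100, (1, 1, 1, 2, 3)))   # 8-bit sum and carry-out
```

Both branches of every stage are computed in parallel in hardware, so only the multiplexer chain, not the full ripple path, waits on the incoming carry.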

Figure 4.3: Example of an 8-bit CSA with 1-1-1-2-3 staging

CSA Staging Optimization

The critical path, and hence the maximum operating speed of the CSA depends to a great
extent on the number of bits allocated to each stage. For example, a staging of (4-4-4-4-4-4-
4-4) for a 32-bit adder does not result in the maximum speed due to the multiplexing delay
of the carry path ([45], Ch. 7). The optimal CSA staging depends on the specific technology
and adder word-length. Table 4.1 shows the CSA staging that results in the shortest
critical path for different CSA lengths. It also shows the estimated and synthesized delay
using 90nm CMOS standard-Vt digital libraries (CORE90GPSVT and CORX90GPSVT). A
combination of RCAs for low-bit adders (4-5 bits) and CSAs for medium to high-bit adders
can be used for further speed optimization. However, for simplicity, CSAs are used for
adders of all word-lengths. The CSA delay can be estimated as:

$$t_{CSA} = (\#\text{ of stages} - 1) \times t_{MUX} + (\text{max } \#\text{ of FAs per stage}) \times t_{FA} \tag{4.1}$$

For example, for an 8-bit CSA, which has 1-1-1-2-3 staging, $t_{CSA} = 4(t_{MUX}) + 3(t_{FA})$.

Table 4.1: CSA Staging Optimization

CSA Bits   Staging          # of Stages   Estimated tCSA          Synthesized tCSA (ns)

4          1-1-2            3             2(tMUX) + 2(tFA)        0.20
5          1-1-1-2          4             3(tMUX) + 2(tFA)        0.22
6          1-1-1-1-2        5             4(tMUX) + 2(tFA)        0.26
7          1-1-1-2-2        5             4(tMUX) + 2(tFA)        0.29
8          1-1-1-2-3        5             4(tMUX) + 3(tFA)        0.30
9          1-1-1-2-2-2      6             5(tMUX) + 2(tFA)        0.28
10         1-1-1-2-2-3      6             5(tMUX) + 3(tFA)        0.30
11         1-1-1-2-2-2-2    7             6(tMUX) + 2(tFA)        0.31
12         1-1-1-2-2-2-3    7             6(tMUX) + 3(tFA)        0.35
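
The "Estimated tCSA" column follows mechanically from equation 4.1, as the Python sketch below shows. The function names are hypothetical, and the multiplexer and full-adder delays passed to `csa_delay` are placeholder values, since the library cell delays are not listed here.

```python
def csa_delay_terms(staging):
    """Return (number of MUX delays, number of FA delays) for a staging like '1-1-2'."""
    stages = [int(bits) for bits in staging.split("-")]
    return len(stages) - 1, max(stages)

def csa_delay(staging, t_mux, t_fa):
    """Estimated critical-path delay from equation 4.1."""
    n_mux, n_fa = csa_delay_terms(staging)
    return n_mux * t_mux + n_fa * t_fa

for staging in ["1-1-2", "1-1-1-2-3", "1-1-1-2-2-2-3"]:
    print(staging, csa_delay_terms(staging))

# Placeholder cell delays in ns, for illustration only.
print(csa_delay("1-1-1-2-3", t_mux=0.03, t_fa=0.05))
```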

4.1.4. Time-interleaved Interpolation Filter


This section presents the physical implementation of the TIM-IF which was designed in
section 3.2. For a 95th -order IF with a time-interleaving factor of 8, this corresponds to 8
parallel paths with 12 coefficients/path. For simplicity, the coefficient notations in each local
path are referred to as g′(n), instead of the global g(n) as shown in figure 3.3. Table C.1 in
appendix C lists the coefficient values for all TIM-IF paths in both original and quantized
CSD representation.
Figure 4.4 shows the physical implementation of the TIM-IF. Here, the D-flipflop (DFF)
represents a delay while the “sum tree” represents the summation of all coefficients for each
path. Recall from chapter 3 that all TIM-IF paths, except path 1 (U1 ), are in reverse order;
Figure 4.4: TIM-IF Physical Implementation

for example, sum tree 8 actually belongs to path 2 and so on. For an N-bit input, the word-
length at the output of the TIM-IF should be (N+1) bits to account for digital arithmetic
overflow. This overflow bit is also shared with the TIM-DSM. Thus, it is not necessary to
provide another overflow bit for the TIM-DSM, which never overflows thanks to its
digital limiter. In this work, the input and output are 10 and 11 bits, respectively.

For each path, there are 12 coefficients with each being represented by 1 to 3 CSD terms.
Thus, there are many CSD terms and summing operations involved for each path. This
makes the fixed word-length output (11 bits) challenging to achieve. To meet the fixed
word-length requirement and also to minimize the delay, a custom “sum tree” is created for each
TIM-IF path. Notice that the coefficients for paths 2 & 8 are the same but in reverse order,
thus the same sum tree design can be used. The same applies to paths 3 & 7 and paths 4 &
6. Figure 4.5 shows an example of a sum tree for path 2 & 8; the sum trees for the remaining
paths can be found in appendix C.

All sum trees use the CSA described in section 4.1.3. Binary sign extension is used
Figure 4.5: TIM-IF sum tree for path 2 and 8

whenever needed to ensure that both inputs to each CSA have the same word-length. The
shortest word-length terms are summed up first, then the longest terms last. Also, the
proposed rounding scheme is applied to each sum tree: the approximate carry-ins (e.g.: R1,
R2, etc) are fed into all available CSAs, while the exact carry-ins are part of the CSD terms
from the beginning. All intermediate sums are computed with one extra bit, except the last
summation where the final output is rounded off to the desired length. Overall, the sum
tree ensures that a carry is accounted for at every CSA and maintains a final output at a
fixed word-length of 11 bits.
Synthesized timing simulation results for this TIM-IF can be found in Appendix C.
4.1.5. Time-interleaved ∆Σ Modulator


Unlike the 95th -order TIM-IF, the 3rd -order TIM-DSM contains much fewer CSD terms. A
great advantage in implementing the TIM-DSM is that although the inputs to the summers
(see figure 3.6) are different, all 8 paths have identical arithmetic operations. Thus, they all
have the same sum tree structure.
Recall that the feedback loop filter, H(z), is:
$$\begin{aligned}
H(z) &= az^{-1} - az^{-2} + z^{-3} \\
     &= 2.875z^{-1} - 2.875z^{-2} + z^{-3} \\
     &= (2^1 + 2^0 - 2^{-3})z^{-1} - (2^1 + 2^0 - 2^{-3})z^{-2} + z^{-3}
\end{aligned}$$

Based on H(z) and figure 3.6, the output of each TIM-DSM path is:

$$P_x(z) = U_x(z) + (2^1 + 2^0 - 2^{-3})E_{x+1}(z) - (2^1 + 2^0 - 2^{-3})E_{x+2}(z) + E_{x+3}(z) \tag{4.2}$$

where Ex (z) is the truncation error from the xth path of the TIM-DSM, and Ux (z) is the
output of the xth path of the TIM-IF.
The sum tree for a TIM-DSM path is shown in figure 4.6. The same summation and
rounding scheme used in the TIM-IF is used for the TIM-DSM to maintain an 11-bit
word-length output. Recall from chapter 2 that a digital limiter was integrated with the DSM.
This ensures that the modulator operates with a fixed word-length of 11 bits and saturates
to the largest digital value in case of an overflow. Simulations show that an overflow
occurs only when the input amplitude to the TIM-IF-DSM (TIM-IF + TIM-DSM) is close to the
full-scale value, namely > −0.5dBFS. In these cases, even though the TIM-DSM saturates,
the full system simulation still indicates good performance. Therefore, it is not necessary to
assign another overflow bit to the TIM-DSM. In fact, an overflow bit for the TIM-DSM
would deteriorate performance, since this 12th bit would usually be ’0’ and hence would
carry no information. After bit truncation, only 3 out of 4 bits would then contain
meaningful data, resulting in a loss of output amplitude.
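
The limiter and the effect of an extra overflow bit can be sketched in a few lines of Python. This
is an illustrative model only: the function names are hypothetical, symmetric saturation at the
negative limit is an assumption (the text only mentions saturation to the largest value), and
truncation is modelled as keeping the top 4 bits of the word.

```python
def saturate(value: int, word_bits: int = 11) -> int:
    """Clamp a value to the range of a two's-complement word of word_bits bits."""
    max_val = (1 << (word_bits - 1)) - 1   # +1023 for 11 bits
    min_val = -(1 << (word_bits - 1))      # -1024 for 11 bits
    return max(min_val, min(max_val, value))


def top_bits(word: int, word_bits: int = 11, out_bits: int = 4) -> int:
    """Truncate a two's-complement word to its out_bits most significant bits."""
    return word >> (word_bits - out_bits)


if __name__ == "__main__":
    # With an 11-bit word, the 4-bit output spans the full [-8, 7] range.
    print(top_bits(saturate(1500)), top_bits(saturate(-2000)))
    # With a 12th (overflow) bit, the same signal only reaches [-4, 3]:
    # one of the four output bits is spent on a rarely used sign extension.
    print(top_bits(1023, word_bits=12), top_bits(-1024, word_bits=12))
```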
Synthesized timing results for this TIM-DSM can be found in Appendix C. Since the
digital front-end contains both the TIM-IF and TIM-DSM (i.e., the TIM-IF-DSM), the
behavioural simulations and physical design cover both blocks unless mentioned otherwise.
4.1. Digital Baseband Front-End 57

Figure 4.6: TIM-DSM sum tree



4.1.6. Digital Integrated Circuits Design Flow


The digital implementation of the TIM-IF-DSM front-end consists of two design phases, soft
design and physical design, as depicted in figure 4.7. The soft design is done using Matlab,
VHDL, and Synopsys, whereas the physical design is done using SOC First Encounter.
The soft design phase requires several iterations before the design is finalized. Initially,
the architectural-level design is translated into the register-transfer level (RTL) using VHDL. If
the VHDL behavioural simulations do not meet the required specification, the architecture
needs to be modified or re-designed. Similarly, if the design fails to meet the required speed
after synthesis, the timing constraints need to be modified. In a linear system, the RTL
behavioural simulations can be verified using a self-checking test bench or a scan chain test.
However, due to the non-linear nature of ∆Σ modulation, the RTL behavioural simulations
were verified by performing an FFT on the outputs to obtain their spectra and SNR performance.
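
The FFT-based check described above can be sketched as follows. This is not the Matlab
post-processing used in this work: it is a small, dependency-free Python illustration in which the
function names, the record length, and the test signal (a coherently sampled tone quantized to 4
bits without noise shaping) are all assumptions. In-band noise is summed up to the band edge,
which mimics the “brick wall” filtering assumption.

```python
import cmath
import math


def dft_power(x):
    """Return |X[k]|^2 for k = 0..N/2 using a direct DFT (O(N^2), fine for small N)."""
    n = len(x)
    power = []
    for k in range(n // 2 + 1):
        acc = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        power.append(abs(acc) ** 2)
    return power


def snr_db(x, signal_bin, band_edge_bin):
    """In-band SNR of a coherently sampled tone.

    Signal power is taken from signal_bin; noise power is the sum of all other
    bins from 1 (DC excluded) up to and including band_edge_bin.
    """
    power = dft_power(x)
    signal = power[signal_bin]
    noise = sum(p for k, p in enumerate(power[1:band_edge_bin + 1], start=1)
                if k != signal_bin)
    return 10 * math.log10(signal / noise)


N = 256          # record length (assumed)
TONE_BIN = 5     # coherent tone: an integer number of cycles per record
OSR = 8          # oversampling ratio used in this work

samples = [round(7 * math.sin(2 * math.pi * TONE_BIN * i / N)) for i in range(N)]
snr = snr_db(samples, TONE_BIN, N // (2 * OSR))
print(f"in-band SNR = {snr:.1f} dB")
```

Coherent sampling avoids spectral leakage, so no window is needed in this sketch; a practical
flow with non-coherent tones would apply a window before the transform.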
In the second phase, the synthesized design is imported into SOC First Encounter for
physical placement and routing. The initialization step involves floor planning and power
planning. The floor planning specifies the chip’s dimensions and its core utilization. The
higher the core utilization, the smaller the area; however, this makes signal routing a difficult
task. In this design, a core utilization of 50% is used for 90nm CMOS. The power planning
step specifies the appropriate VDD/GND rings and grids for uniform power distribution.
Placement is one of the critical steps in physical design. It includes I/Os, standard cell,
and clock tree placement. Placement is done based on timing-driven criteria which require
multiple iterations. Even for the 90nm CMOS process, it is non-trivial to achieve a 500MHz
clock rate in this highly dense design through the use of digital standard cells. The timing
margins (shown in Appendix C) are tight even with exhaustive optimizations. The estimated
clock skew was about 16ps after placement, which is acceptable for a 500MHz clock.
Auto routing involves power and signal routing. Power routing distributes power to all
standard cells. Signal routing ensures all physical geometry rules (e.g.: metal width, spacing,
density, etc) are met while minimizing the propagation delay to meet the timing constraints.
Lastly, design verifications including DRC and LVS are done before the digital front-end
layout is integrated with the custom layout high-speed blocks in Cadence.

Figure 4.7: Digital design flow



4.1.7. Digital Front-end Simulation Results


This section presents the behavioural simulation results for the digital TIM-IF-DSM
front-end. The behavioural results are first obtained from a VHDL simulator, then multiplexed
and post-processed in Matlab under a “brick wall” filtering assumption.
Figure 4.8 shows the TIM-IF-DSM VHDL behavioural output spectrum for a 0dBFS input
amplitude at different frequencies. For comparison, figure 3.9 shows the same output
spectrum simulated in Matlab with floating-point precision.
Figures 4.9(a) and 4.9(b) show the SNR and SNDR versus input amplitude, respectively,
for the TIM-IF-DSM simulated in VHDL (fixed-point) and Matlab (floating-point) with
a single tone at 0.25fB. The VHDL behavioural SNR degrades by about 3dB on average
compared to that of the Matlab simulation. On the other hand, the VHDL behavioural
SNDR fluctuates by about ±2dB.
Figures 4.9(c) and 4.9(d) show the SNR and SNDR versus input frequency, respectively,
for the TIM-IF-DSM simulated in VHDL and Matlab with a 0dBFS input amplitude. Similar to
the previous analysis, there is an average degradation of 3dB in the VHDL results compared
to the Matlab simulations.
An input sampling frequency of 500MS/s corresponds to a clock period (tCLK) of 2ns.
Tables C.2 and C.3 in appendix C show the worst-case timing margins from Synopsys for the
TIM-IF and TIM-DSM, respectively. These tables show the synthesized timing for each sum
tree alone, then for a full path which contains a sum tree plus D-flipflop (DFF) and buffers.
The synthesized power consumption, which is obtained from the digital standard cell library,
is around 51mW based on a 1V supply.
The positive timing margins imply that both the TIM-IF and TIM-DSM can operate at
500MS/s. Since Synopsys obtains the worst-case timing data from the 90nm CMOS standard
cell library and also accounts for the wiring interconnection, positive timing margins
indicate that the physical design should function properly.
Figure 4.8: TIM-IF-DSM output spectrum for VHDL behavioural simulations with 0dBFS
input amplitude at different frequencies: a) 0.13fB b) 0.25fB c) 0.50fB d) 0.93fB

Figure 4.9: TIM-IF-DSM VHDL behavioural vs. Matlab response: a) SNR vs. input amplitude
b) SNDR vs. input amplitude c) SNR vs. input frequency d) SNDR vs. input frequency



4.2. High-Speed Digital Interface


The high-speed digital interface bridges the gap between the digital front-end and the analog
back-end. It consists of three sub-blocks which all operate at the oversampling rate of 4GS/s:
an 8-to-1 multiplexer, a binary-to-thermometer converter, and the switch drivers. From this
point onward, all design and simulations will be done at the transistor level using Cadence.

4.2.1. Multiplexer
Compared to a conventional ∆Σ-DAC, the only additional block in a time-interleaved
∆Σ-DAC is the multiplexer, as shown in figure 4.1. The purpose of the multiplexer is to serialize
the parallel time-interleaved paths down to a single path. The multiplexing factor is
identical to the time-interleaving factor, namely 8. Traditionally, an 8-to-1
multiplexer that achieves a 4GS/s data rate would require three different clock rates: 500MHz,
1GHz, and 2GHz. This work proposes an 8-to-1 “ring” multiplexer which only requires a
single clock rate of 4GHz.
The 8-to-1 ring multiplexer consists of three parts, as depicted in figure 4.10: a ring shift
register, a switch shift register, and a data multiplexer. The ring shift register consists of 8
cascaded DFFs clocked at 4GHz (CLKa). It also contains 2 transmission gates that set the
ring’s initial state to a known value (logic 1) at power-up. This ring creates a pulse signal,
S0, which has a period of 2ns and a pulse width equal to the CLKa period, namely 250ps. S0
has two purposes: it is used as the 500MHz clock pulse (CLKsw) for the DFFs in the data
multiplexer, and it is used to generate the 8 switch signals (S1 − S8) in the switch shift register.
Instead of taking S1 − S8 directly from the ring shift register, a separate series of 8 DFFs
(i.e., the switch shift register) is needed because S1 − S8 are not consecutively shifted (by 1
clock cycle) versions of S0, but rather its delayed versions. These signals activate the switches
of the data paths in reverse, as shown in figure 4.10.
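
The behaviour of the ring shift register can be checked with a short behavioural model. The
Python sketch below is illustrative only: the function name is hypothetical, the choice of tap
for S0 (the last DFF) is an assumption, and the transmission gates are modelled simply as a
one-hot initial state.

```python
def ring_shift_register(n_stages: int = 8, n_cycles: int = 16) -> list:
    """Simulate a one-hot ring of DFFs and return the last stage's output per clock."""
    state = [1] + [0] * (n_stages - 1)       # known power-up state: a single logic 1
    s0 = []
    for _ in range(n_cycles):
        s0.append(state[-1])                 # sample the tap used as S0
        state = [state[-1]] + state[:-1]     # each DFF takes its predecessor's value
    return s0


CLKA_PERIOD_PS = 250                         # 4GHz clock (CLKa)

s0 = ring_shift_register()
pulse_cycles = [i for i, bit in enumerate(s0) if bit]
period_ps = (pulse_cycles[1] - pulse_cycles[0]) * CLKA_PERIOD_PS
print(f"S0 period = {period_ps} ps, pulse width = {CLKA_PERIOD_PS} ps")
```

The single logic 1 circulates once every 8 CLKa cycles, which gives the 2ns period and the
one-cycle (250ps) pulse width stated above.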
Figure 4.11 shows the timing diagram for an 8-to-1 ring multiplexer. Before S0 is used as
CLKsw, it has to go through a clock tree, which consists of 5 stages of branching fanout-of-2
buffers to drive 32 DFFs (since there are 8 parallel paths and 4 bits/path). Thus, CLKsw is
delayed by $t_{clk\_tree}$ from S0.

Figure 4.10: An 8-to-1 ring multiplexer



Figure 4.11: Timing diagram for an 8-to-1 ring multiplexer

To meet the correct timing, the switch signals need to be aligned with the data paths
so that they output valid data. Since the same DFF is used everywhere, if the switch shift
register’s clock is aligned with the data multiplexer’s clock (CLKsw), then the switch signals
are guaranteed to line up with the data paths.

The switch shift register’s DFFs must be clocked at the same rate as CLKa. However,
this clock must be delayed (CLKa_dly) such that it is edge-aligned with CLKsw.
According to figure 4.11, a simple solution is to delay CLKa by approximately one DFF delay plus
the clock tree’s propagation delay: $t_{clk\_dly} = t_{dff} + t_{clk\_tree}$.

A more accurate but more complicated solution is to use a PLL to align the phases of CLKa_dly
and CLKsw. For this design, a PLL is not necessary for exact alignment, since a
skew of 20ps can be tolerated as long as all 8 switch pulses are within one data period. To
ensure proper timing alignment, the multiplexed output data is re-timed by another
DFF at the switch driver before entering the analog back-end.

Figures 4.12(a) and 4.12(b) show the Cadence simulation results (in TT corner, 27°C) for
the 8-to-1 ring multiplexer (MUX8) operating with a 4GHz and a 2GHz clock, respectively. Here,
the two clocks (CLKsw and CLKdly) are aligned within 15ps and all eight switch pulses
(S1 − S8) are contained within one data period. Figure 4.12(b) shows that the MUX8 still
operates properly at a frequency lower than the one it was designed for. Simulations over
different process corners and temperatures (i.e., SS, 105°C and FF, −40°C) also confirmed the
MUX8’s functionality.

Figure 4.12: An 8-to-1 ring multiplexer transient response (TT corner) a) 4GHz b) 2GHz

4.2.2. Binary-to-Thermometer Converter and Switch Drivers


A binary-to-thermometer (B2T) converter is a standard digital block for designs that
operate in the kHz to MHz range. However, the B2T converter in this work is required to
operate at 4GHz, which rules out the use of digital standard cell libraries. To meet the
high-speed timing, its propagation delay is required to be within 1/2 of the CLKa period
(i.e., 125ps). This level of performance can only be achieved with custom CMOS logic and
high-speed layout.
Table 4.2 shows the 4-bit B2T conversion and gate logic. Since the TIM-IF-DSM’s
outputs are in 2’s complement while the DAC requires thermometer code inputs, the
4-bit data must be converted from 2’s complement to unsigned binary representation,
and then to a 15-bit thermometer code.

Table 4.2: Binary-to-thermometer conversion and gate logic

The decimal range for a 4-bit binary word is [-8,7] in 2’s complement and [0,15] in
unsigned binary. Thus, the conversion from 2’s complement to unsigned binary
only requires an addition of 8. This is accomplished by inverting the most significant bit
while the remaining bits (V<2:0> and A<2:0>) are identical in both representations.
The B2T converter’s area is reduced through gate re-use, and the propagation delays of its
output codes are matched as much as possible.
The thermometer codes are used to switch the DAC’s current-steering cells. Before
entering the analog back-end, these codes need to be re-timed and driven by the switch
drivers. The details on switch drivers are discussed in appendix C.
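
The two conversion steps described in this section (2’s complement to unsigned binary by
inverting the MSB, then unsigned binary to a 15-bit thermometer code) can be modelled
behaviourally as shown below. This Python sketch only reproduces the input/output mapping; it
does not represent the custom gate logic of table 4.2, and the function names are hypothetical.

```python
def twos_complement_to_unsigned(code: int, bits: int = 4) -> int:
    """Invert the MSB of a two's-complement bit pattern (equivalent to adding 2^(bits-1))."""
    pattern = code & ((1 << bits) - 1)       # raw bit pattern of the signed value
    return pattern ^ (1 << (bits - 1))       # flip the most significant bit


def binary_to_thermometer(value: int, bits: int = 4) -> list:
    """Return 2^bits - 1 thermometer bits in which the lowest `value` bits are 1."""
    return [1 if i < value else 0 for i in range((1 << bits) - 1)]


if __name__ == "__main__":
    for code in (-8, -1, 0, 7):
        unsigned = twos_complement_to_unsigned(code)
        thermo = binary_to_thermometer(unsigned)
        print(code, unsigned, "".join(map(str, thermo)))
```

The number of 1s in the thermometer code equals the unsigned value, which is the number of
current-steering cells switched to the positive output.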

4.2.3. High-Speed Digital Interface Simulation Results


This section presents the transient simulation results (in TT corner, 27°C) for the
high-speed digital interface, which contains the 8-to-1 ring multiplexer, binary-to-thermometer
converter, and switch drivers. As an example, figures 4.13 and 4.14 present the expected
theoretical and Cadence simulation results, respectively. The TIM-IF-DSM outputs are first
multiplexed by MUX8, converted to thermometer codes by the B2T converter, then driven
to the analog back-end by the switch drivers. The two figures agree, implying that this interface
works properly at 4GHz.

Figure 4.13: High-speed digital interface theoretical response

Figure 4.14: High-speed digital interface Cadence transient response (TT corner)

4.3. High Speed Analog Back-End


The analog back-end, which operates at 4GS/s, consists of two sub-blocks: a current
calibration circuit and a current-steering DAC. The current calibration technique is used to achieve
the required resolution, and the current-steering DAC is used to achieve high-speed operation.

4.3.1. Current Calibration Circuitry


As mentioned in chapter 3, a current calibration technique is used to linearize the
multi-bit DAC by dynamically matching its unit current elements. Some calibration techniques
require a dedicated calibration period during which the DAC is temporarily disabled.
In this work, a self-calibration scheme is used which does not require the DAC to be
disabled, thus allowing it to operate continuously. The details of current calibration principles are
discussed in appendix C. A calibration period Tc of 160ns, i.e., the time needed to cycle through
all 16 current cells at 10ns per cell, is sufficient for this design.

Practical Considerations

A major challenge in current calibration is matching the output current Iout between the cells.
The main mismatches occur at the calibration switches and the MOS transconductance. The
switch mismatches are due to their sizes, which are required to be small to keep Ileak minimal.
Thus, a mismatch in the charge-injection for each cell is expected [40]. To reduce this effect,
two additional transmission-gate (TG) switches (T2 and T3 ) are added to the main switch
(T1 ) to cancel the charge-injection occurring at the gate of M1 , as depicted in figure 4.15.
To minimize the effect of mismatches between copies of M1 , the transconductance gm can
be made small, thus reducing the drain current’s sensitivity to Vgs variations. To achieve
this task, a secondary current source, I2 , is added in parallel with M1 to sink about 90% of
Iref [37]. Since M1 only sinks the remaining 10% of Iref , its gm can be relatively small.
To achieve a small gm, the W/L aspect ratio should be made as small as possible. In addition,
a transistor with a large W and an especially large L has a larger Cgs, which improves the
matching of the current cells. Charge-injection and leakage current effects are therefore
reduced in accordance with equation C.2. However, the supply headroom of the CMOS process
limits how small the W/L aspect ratio can be. For the STMicroelectronics 90nm CMOS process
with a 1V supply and a 250mV threshold voltage, the maximum value for Vgs is approximately
450mV. Consequently, the W/L ratio is around 10/1.

Figure 4.15: Current calibration implementation

Continuous Current Calibration

To make the calibration continuous, the cell that is being calibrated needs to be invisible to,
or taken off-line from, the DAC’s output. In place of this cell, an identical “dummy” cell needs
to fill in the gap. Thus, instead of having 2^N − 1 cells for an N-bit DAC, the calibration
network requires 2^N cells, with the extra one being a dummy. For a 4-bit DAC, there are 16
calibration cells while there are only 15 current-steering cells, as shown in figure 4.16.
The dummy current cell is identical in design to a regular current cell, except that its
output is dynamically connected to different cells at different times. The dummy cell
is calibrated first so that it is available to fill in for whichever regular cell is in calibration. Each
regular cell is selected one at a time by a 16-stage ring counter operating at 1/Tc. While
this cell is being calibrated for Tc/16 seconds, the calibration switch disconnects
its output from the DAC and switches over to the dummy cell. The dummy cell’s Iout now
becomes the current source for the DAC’s current-steering cell. Upon completion, the switch
returns to the original state and the next cell is calibrated. The dummy cell’s Iout is now
available for the next cell.
Figure 4.16 shows an example in which cell 1 is under calibration for a 4-bit DAC. This
technique ensures that there are always 15 equal currents available at the output terminal,
hence allowing the DAC to operate uninterrupted.

Figure 4.16: Continuous current calibration system for 4-bit DAC
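
The rotation of the dummy cell can be expressed as a simple schedule. The Python sketch below
is a behavioural illustration under stated assumptions: cell index 0 is used for the dummy cell
and indices 1 to 15 for the regular cells, slot 0 of the 16-stage ring counter calibrates the
dummy, and the function name is hypothetical. It checks that 15 distinct calibrated sources
always feed the DAC.

```python
N_REGULAR = 15                 # current-steering cells of the 4-bit DAC
DUMMY = 0                      # index chosen here for the extra calibration cell
T_C_NS = 160                   # calibration period Tc
SLOT_NS = T_C_NS / (N_REGULAR + 1)   # time per cell: 10 ns


def active_sources(slot: int) -> list:
    """Return the calibration cell feeding each current-steering cell in a given slot.

    In slot 0 the dummy cell itself is calibrated; in slot k (1..15) regular
    cell k is calibrated and the dummy cell takes its place.
    """
    under_calibration = slot % (N_REGULAR + 1)
    sources = []
    for cs_cell in range(1, N_REGULAR + 1):
        sources.append(DUMMY if cs_cell == under_calibration else cs_cell)
    return sources


if __name__ == "__main__":
    for slot in range(3):
        print(f"t = {slot * SLOT_NS:5.1f} ns:", active_sources(slot))
```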

The calibration circuitry, which only consists of a charge-storage MOS transistor and
switches, requires no external components. Thus, it can be integrated together with the
DAC current-steering cells. The calibration simulation results will be shown together with
the current-steering DAC in the next section.

4.3.2. Current-Steering Digital-to-Analog Converter


A current-steering DAC (CS-DAC) is a common choice for high-speed data conversion where
each unit current cell switches a current to either output or ground. For a differential CS-
DAC configuration, which has a high immunity to common mode noise, the current-steering
cell switches the current to either output or its complement.

Current-Steering Cell Configuration

Figure 4.17(a) shows the bias current mirror circuitry, which replicates an off-chip current
source to generate Iref and a current array, Ic<15:0>, to bias the secondary calibration
source I2. Here, simple current mirrors are used instead of cascode current mirrors due to
the headroom limitation of a 1V analog supply voltage (VDDa). Figure 4.17(b) shows the
dummy calibration cell schematic, which supplies Idummy to whichever current-steering cell
is being calibrated.

Figure 4.17: a) Bias current mirror b) Dummy calibration cell schematic

Figure 4.18 shows the current-steering cell with self-calibration circuitry. Here, the TG
switches (T1 − T3) and the MOS transistors (M1 − M3) belong to the calibration cell, whereas
the other three TG switches (T4 − T6) belong to the calibration switch network. The remaining
MOS transistors, M4 − M7, belong to the current-steering cell, in which M6 and M7 are the
current-steering (CS) switches.

Figure 4.18: Current-steering cell with self-calibration circuitry

In a conventional configuration, the current-steering switches are connected directly to
the current source. Specifically, M6 and M7 would be NMOS transistors connected
to node A, and the output loads would be connected between their open-drain outputs and VDDa.
In such a case, there is no need for the M4 and M5 current mirror. However, in 90nm CMOS
technology where VDDa is only 1V, the limited headroom does not allow the stacking of M1,
T5, M6, and the output load. Therefore, a simple current mirror consisting of M4 and M5 is
introduced to fold the current over to a new branch that has fewer stacked transistors,
thus allowing a higher output swing. Although a cascode current mirror would give better
output resistance and supply noise rejection, the problem of limited headroom and
reduced output swing arises again. An alternative way to compensate for a simple current mirror
is to use longer gate-length transistors; however, this results in high glitching noise. Overall, the
current-steering cell in figure 4.18 provides a compromise.

Output Swing and Noise Estimation

Once the current-steering schematic is chosen, the next task is to determine the appropriate
output swing, such that it not only ensures the current mirror’s functionality but also meets
the SNR requirement. For an analog supply voltage of 1V, there is little available headroom
to start with.
To maintain the current mirror’s functionality, namely keeping M5 in saturation, the
drain-source voltage of M5 requires at least 300mV (i.e.: Vds5 ≈ 300mV). Since M6 and M7
operate as full-swing switches, there is about 100mV drop across each of them. This leaves
at most 600mV per side for the output swing as shown in figure 4.19(a).
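
The headroom budget described above can be written out explicitly. The symbols $V_{sw}$ (the
drop across the conducting switch) and $V_{swing,max}$ are introduced here only for
illustration, and the numbers are the approximate values quoted in the text:

$$
V_{swing,max} \approx V_{DDa} - V_{ds5} - V_{sw} \approx 1\,\text{V} - 300\,\text{mV} - 100\,\text{mV} = 600\,\text{mV per side}
$$

Only one switch drop appears in the sum because M6 and M7 sit in separate output branches, so
each side sees a single switch in series with M5.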
To meet the required SNR of 56dB, the output swing has to be large enough to sufficiently
overcome the output noise. Assuming that thermal noise dominates and neglecting
flicker (1/f) noise at low frequency, the output noise can be modelled
as illustrated in figure 4.19(b).

Figure 4.19: a) Output swing b) Output noise model c) Simplified output noise model

Here, the current-steering switch is represented by its ON resistance, $R_{sw}$, and the off-chip
load is represented by a passive load resistance, $R_L$. The thermal noise of a long-channel MOS
operating in saturation can be represented as a current source, $I_{n,M5}^2$, connected between its
drain and source terminals. The thermal noise of a resistor is a current source, $I_{n,R}^2$, connected
in parallel [46]. The simple representations for these noise sources are:

$$
I_{n,M5}^2 = 4kT\gamma g_m \quad \text{and} \quad I_{n,R}^2 = 4kT/R \tag{4.3}
$$

where γ has a value of 2/3 for long-channel transistors but is higher for deep sub-micron
transistors. Its exact value varies depending on the CMOS process and is still under research.
For example, in [47], γ has a value of around 1.6 and 1.8 for PMOS and NMOS, respectively.
Using superposition and assuming that $r_{ds5} \gg (R_{sw} + R_L)$:

$$
V_{n,out(r_{ds5})}^2 = \left( \frac{r_{ds5} R_L}{r_{ds5} + R_{sw} + R_L} \right)^2 I_{n,M5}^2 \approx I_{n,M5}^2 R_L^2 \quad (V_{rms}^2/\text{Hz}) \tag{4.4}
$$

$$
V_{n,out(R_{sw})}^2 = \left( \frac{R_{sw} R_L}{r_{ds5} + R_{sw} + R_L} \right)^2 I_{n,R_{sw}}^2 \approx 0 \quad (V_{rms}^2/\text{Hz}) \tag{4.5}
$$

$$
V_{n,out(R_L)}^2 = \left( R_L \parallel (r_{ds5} + R_{sw}) \right)^2 I_{n,R_L}^2 \approx I_{n,R_L}^2 R_L^2 \quad (V_{rms}^2/\text{Hz}) \tag{4.6}
$$

Thus, $V_{n,out}^2$ can be simplified as depicted in figure 4.19(c) and:

$$
V_{n,out}^2 \approx (I_{n,M5}^2 + I_{n,R_L}^2) R_L^2 = 4kT R_L (\gamma g_m R_L + 1) \quad (V_{rms}^2/\text{Hz}) \tag{4.7}
$$

Equation 4.7 suggests a small value for RL to minimize the output thermal noise, but this
would also decrease the output swing. In this work, the current-steering cells are designed
for RL =50Ω to ease the impedance-matching with the 50Ω test equipment.
Based on the value RL =50Ω and simulations under nominal conditions, an output swing
around 500mV per side or 1V differential would sufficiently yield an SNR of 56dB.
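
Equation 4.7 is easy to evaluate numerically. The Python sketch below is an illustration only:
the function name is hypothetical, the temperature of 300K and the transconductance value are
placeholders that do not come from this design, and γ = 1.6 is simply the PMOS example value
quoted from [47].

```python
import math

K_BOLTZMANN = 1.380649e-23     # Boltzmann constant in J/K


def output_noise_density(r_load: float, gm: float, gamma: float,
                         temperature: float = 300.0) -> float:
    """Evaluate equation 4.7 and return the output noise density in V/sqrt(Hz)."""
    v2 = 4 * K_BOLTZMANN * temperature * r_load * (gamma * gm * r_load + 1)
    return math.sqrt(v2)


if __name__ == "__main__":
    R_L = 50.0                 # load resistance used in this work
    GM_PLACEHOLDER = 5e-3      # hypothetical transconductance in S, for illustration
    resistor_only = output_noise_density(R_L, 0.0, 1.6)
    with_m5 = output_noise_density(R_L, GM_PLACEHOLDER, 1.6)
    print(f"load resistor alone: {resistor_only * 1e9:.2f} nV/sqrt(Hz)")
    print(f"with M5 contribution: {with_m5 * 1e9:.2f} nV/sqrt(Hz)")
```

Setting gm to zero isolates the contribution of the 50Ω load, which makes it easy to see how much
of the noise floor is added by the current-source transistor for a given gm.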

Output Load Configurations

Since the CS-DAC outputs differential currents, the loads at the open-drain outputs can
be either passive or active, as depicted in figure 4.20. Both contain a resistor which converts
differential currents into a differential voltage:

$$
V_{od} = V_{out+} - V_{out-} = (I_{out+} - I_{out-}) R_L \tag{4.8}
$$

For a passive load in figure 4.20(a), the resistors are connected between the CS-DAC
output and ground. The advantage of a passive load is its high bandwidth, due
to its simple open-loop configuration. However, the downside is that the output swing is
limited by the available voltage headroom (i.e., 600mV max) to keep the current source
(M5) in saturation. In addition, the CS-DAC’s output resistance per side, $R_{out}$, varies
slightly depending on the number of active current cells in use. $R_{out}$ can be approximated
as:

$$
R_{out} \approx R_L \parallel \frac{r_{o5} + R_{sw}}{N_{cs}} \tag{4.9}
$$

where $r_{o5}$ is the output resistance of M5, $R_{sw}$ is the on resistance of switch M6, and $N_{cs}$ is
the number of active current cells in use. The variations in $R_{out}$ directly correspond to
variations in the output LSB step size, which is a highly undesirable effect that degrades the
CS-DAC’s linearity performance.
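
The code dependence of the output resistance in equation 4.9 can be illustrated numerically.
In the Python sketch below, the values of the M5 output resistance and the switch on resistance
are placeholders chosen only to show the trend; they are not taken from this design, and the
function names are hypothetical.

```python
def parallel(a: float, b: float) -> float:
    """Resistance of two resistors in parallel."""
    return a * b / (a + b)


def r_out(n_active: int, r_load: float = 50.0,
          r_o5: float = 20e3, r_sw: float = 100.0) -> float:
    """Evaluate equation 4.9 for n_active current cells switched to one side.

    r_o5 and r_sw are placeholder values, not figures from the thesis.
    """
    if n_active == 0:
        return r_load                        # no cell connected: only the load remains
    return parallel(r_load, (r_o5 + r_sw) / n_active)


if __name__ == "__main__":
    for n in (0, 1, 8, 15):
        print(f"N_cs = {n:2d}: R_out = {r_out(n):.2f} ohm")
```

Because the resistance seen at the output drops as more cells are connected, each additional cell
produces a slightly smaller voltage step, which is the step-size variation described above.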

Figure 4.20: Current-steering DAC output load options: a) passive load b) active load

For an active load in figure 4.20(b), the resistor is connected in feedback around a
differential opamp. Since the opamp’s input impedance is much higher than RL, all current
flows into RL. The active load offers a higher swing than the passive load since the
output currents are now connected to the opamp’s inputs, which act as an AC virtual ground.
Also, since Rout looking into Vout is constant, the active load does not suffer from Vout variations
as in the case of a passive load. The disadvantages of an active load are limited bandwidth
due to the closed-loop (feedback) configuration and non-idealities from the opamp’s design
(e.g., finite gain, offset, bandwidth, etc.). An opamp gain of 60dB is sufficient for this design.
Figure 4.21 compares the output staircase for an active versus a passive load. It shows
the increase in Vout variations for the passive load as more current is switched to either side.
Therefore, an active load is more favourable in this design.

Figure 4.21: Active vs. Passive load output (TT corner)

Deep Sub-micron Design Challenges

This design is fabricated using STMicroelectronics 90nm CMOS with 7 metal layers and a 1V
supply for the entire chip. While a low supply voltage lowers the power consumption of
large digital circuitry, it makes the analog design challenging. For instance, this supply
does not allow the stacking of multiple transistors, because the limited headroom makes it
difficult to keep each transistor in its intended operating region.
Another major issue of deep sub-micron technology is the high leakage currents which
include subthreshold leakage, gate oxide tunneling leakage, junction leakage, hot-carrier in-
jection leakage, gate-induced drain leakage, and punch-through leakage currents [48]. As a
result, although the design dissipates minimum dynamic power during switching, its static
leakage power begins to catch up. For instance, the simulated total power consumption for
the digital front end is around 51mW, of which 23mW (45%) is leakage power. There is
a great deal of ongoing research to replace the SiO2 gate dielectric with a high-k dielectric
material to combat increasing leakage currents and to sustain the scaling of CMOS
technology.
Lastly, since STMicroelectronics 90nm CMOS was a rather new technology, its model
parameters were still not accurate or well defined. For example, the gate resistance was not
modelled, hence a gate resistor was added to the transistor with a value according to [49]:

$$
R_g = \frac{R_{gsq} W_f}{3 N_f l_G} + \frac{R_{cont}}{N_{cont} N_f} \tag{4.10}
$$

where $N_f$ and $W_f$ are the number of fingers and finger width in µm, $R_{gsq}$ and $R_{cont}$ are the
gate resistance/square and gate contact resistance, and $N_{cont}$ and $l_G$ are the number of gate
contacts and gate length in µm, respectively. Table 4.3 summarizes the CS-DAC design
including transistor sizes, layout, and drain current.

Table 4.3: Analog Back-end Transistor Properties

CS-DAC          Transistor           Size     Layout               Current
                (NMOS/PMOS)          (W/L)    (Nf × Wf × L)        (mA)

Bias Current    M1                   20/0.1   10 × 2µm × 0.1µm     0.67
                M2                   20/0.1   10 × 2µm × 0.1µm     0.78
                M3                   40/0.2   20 × 2µm × 0.2µm     0.78
                M4                   40/0.2   20 × 2µm × 0.2µm     0.78
                M5 − M20             4/0.2    2 × 2µm × 0.2µm      0.08

Dummy/Regular   M1                   40/4     10 × 4µm × 4µm       0.09
Cell            M2                   14/0.2   7 × 2µm × 0.2µm      0.69
                M3                   2/0.2    1 × 2µm × 0.2µm      0.08
                M4                   40/0.2   10 × 4µm × 0.2µm     0.78
                M5                   40/0.2   10 × 4µm × 0.2µm     0.72
                M6 − M7              8/0.1    2 × 4µm × 0.1µm      0.72
                T1 − T3 (Wp/Lp)      4/0.1    1 × 4µm × 0.1µm      0
                T1 − T3 (Wn/Ln)      2/0.1    1 × 2µm × 0.1µm      0
                T4 − T6 (Wp/Lp)      20/0.1   5 × 4µm × 0.1µm      0.78
                T4 − T6 (Wn/Ln)      10/0.1   5 × 2µm × 0.1µm      0.78

4.3.3. Analog Back-end Simulation Results


Figures 4.22(a) and 4.22(b) show the DNL offsets Monte Carlo analysis for 15 current cells
without and with current calibration, respectively. Without current calibration, the offset
range is almost twice as much as the case with current calibration; this corresponds to an
SNR improvement of almost 6dB.
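
The link between the offset range and the SNR can be made explicit. Assuming, as a
simplification that is not stated in the text, that the mismatch-induced error amplitude scales
directly with the DNL offset range, halving that range corresponds to:

$$
\Delta \text{SNR} \approx 20 \log_{10}(2) \approx 6.02\,\text{dB}
$$

Since the simulated range without calibration is slightly less than twice the calibrated range, the
expected improvement is slightly below this figure.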

Figure 4.22: DNL offset Monte Carlo analysis (TT corner): a) current calibration OFF
b) current calibration ON

Figure 4.23(a) depicts an example of a TIM ∆Σ-DAC output spectrum with and without
current calibration. With calibration on, the inband harmonics are reduced, resulting
in higher linearity. Figure 4.23(b) depicts the SNR performance versus input amplitude
with and without current calibration. It shows that with calibration on, there is an average
SNR improvement of 1dB and 5dB for input amplitudes below and above -10dBFS,
respectively. Unless mentioned otherwise, all subsequent simulation results have the current
calibration on.

Figure 4.23: TIM ∆Σ-DAC performance with and without current calibration (for active
load, TT corner with transistor mismatch): a) output spectrum at 0.25fB b) SNR vs. input
amplitude

Figures 4.24(a) and 4.24(b) depict the TIM ∆Σ-DAC’s accuracy performance versus input
amplitude without and with transistor mismatch, respectively. Similarly, figures 4.25(a) and
4.25(b) depict the TIM ∆Σ-DAC’s accuracy performance versus input frequency without
and with transistor mismatch, respectively. Without transistor mismatch, the peak SNRs
are 57dB (9.2 bits) and 62dB (10 bits) for passive and active load, respectively. However,
with transistor mismatch, the peak SNRs degrade to 50dB (8 bits) and 54dB (8.7 bits), and
the dynamic ranges are 52dB and 56dB for passive and active load, respectively.
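
The bit figures quoted with the peak SNRs are consistent with the standard relation between SNR
and effective number of bits for a full-scale sine wave. That this exact relation was used is an
assumption; the short Python sketch below, with a hypothetical function name, simply shows that
it reproduces the quoted values.

```python
def effective_bits(snr_db: float) -> float:
    """Effective number of bits from SNR, using SNR = 6.02 * N + 1.76 dB."""
    return (snr_db - 1.76) / 6.02


if __name__ == "__main__":
    for snr in (57, 62, 50, 54):
        print(f"{snr} dB -> {effective_bits(snr):.1f} bits")
```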
Lastly, table 4.4 shows the simulated power consumption of the TIM ∆Σ-DAC at 1V sup-
ply. The digital front-end consumes the most power since it contains the most computations
and largest hardware partition.

Table 4.4: TIM ∆Σ-DAC Simulated Power Consumption

Circuit Block            Simulated Power     fsampling
                         (mW)     (%)

Digital Front-end        51       43         500 MS/s
High-speed Interface     38       32         4 GS/s
Analog Back-end          24       20         4 GS/s
I/Os                     7        6          4 GS/s
Total @ 1V Supply        120
Figure 4.24: TIM ∆Σ-DAC’s SNR/SNDR vs. input amplitude for a single-tone input at
0.25fB (TT corner): a) without transistor mismatch b) with transistor mismatch

Figure 4.25: TIM ∆Σ-DAC’s SNR/SNDR vs. input frequency for a single-tone amplitude
of 0dBFS (TT corner): a) without transistor mismatch b) with transistor mismatch

4.4. TIM ∆Σ-DAC Integration


This section presents the TIM ∆Σ-DAC full chip integration. Before the integration, two
separate sub-blocks that have not yet been discussed are introduced: the clock divider and
the I/O drivers.
The clock divider (figure 4.26(a)) divides the analog clock (4GHz) by a factor of 8 to
generate the digital clock (500MHz). The effects of jitter on the analog and digital clocks have
not been fully investigated. For the digital domain, it is assumed that clock jitter does not
cause major problems since sufficient timing margin is available at 500MHz. For the analog
domain, careful layout and floor planning are needed to minimize clock skew and jitter.
The I/O driver in figure 4.26(b) is designed for flexibility during testing. It accommodates
4 bits of high-speed input or output data. When used as an output, it drives the TIM-IF-
DSM’s 4-bit multiplexed digital output off chip. When used as an input, it routes the external
4-bit digital data directly to the analog back-end. Therefore, these I/O drivers allow the
designer to debug the digital and analog sections separately.

Figure 4.26: a) Divide-by-8 clock divider b) I/O driver

The chip is designed to have several power supplies for better power management and
testing flexibility. Specifically, the supplies for the baseband digital, high-speed interface, and
high-speed analog sections are VDDd, VDDhs, and VDDa, respectively. The I/O drivers also have
their own power supply, VDDio. Thus, if necessary, each section of the chip can operate on
a different supply voltage to improve performance. In this work, the entire chip operates on
a 1V supply. Having multiple power supplies also gives the flexibility to test each section of
the chip separately while powering down the rest of the chip.
Figures 4.27 and 4.28 show the floor planning and final layout of the TIM ∆Σ-DAC. The
chip occupies about 1.52mm×1.52mm of silicon area in 90nm CMOS technology. The layout
was pad-limited. The pad frame contains all analog/RF pads without ESD protection to
achieve high speed operation. The core area fits within 1.06mm², of which 0.34mm² contains
the digital standard cells for TIM8-IF and TIM8-DSM.
Both the high-speed digital interface and the analog back-end require a custom layout
optimized for high-speed and low-mismatch operation. To accommodate a high-speed layout,
multi-finger transistors with double-gate connections are used to minimize gate resistance. In
addition, high metal layers (M4-M7) are used for high-speed signal routing to minimize
substrate parasitic capacitance. To reduce mismatch between current cells, the cells are routed
as close together as possible and each contains at least 2 dummy gates on each side.
Furthermore, a common-centroid or finger inter-digitation layout is used to improve matching.
Lastly, each subcircuit (e.g.: current mirror) is surrounded with a ring of substrate
contacts to reduce substrate resistance and crosstalk. Multiple N/P-well rings surround each
digital or analog section, as shown in figure 4.28, to provide as much isolation as possible.

Figure 4.27: TIM ∆Σ-DAC floor planning

Figure 4.28: TIM ∆Σ-DAC final layout


Chapter 5
Time-interleaved ∆Σ-DAC Performance

This chapter presents the experimental results of the TIM ∆Σ-DAC fabricated in
STMicroelectronics 90nm CMOS. Specifically, the accuracy and linearity performance of the
TIM ∆Σ-DAC are measured. The experimental setup and testing issues are also discussed here.

5.1. PCB Design and Test Setup


Figure 5.1(a) shows a die photo of the TIM ∆Σ-DAC chip. The digital front-end (TIM-
IF-DSM) was designed using 90nm CMOS standard cell libraries (CORE90GPSVT and
CORX90GPSVT). The high-speed digital interface and analog back-end were designed with
custom layout as shown in figure 5.1(b).

In order to test the functionalities of the TIM ∆Σ-DAC, the chip must be packaged then
integrated onto a PCB. Since this chip operates at a relatively high speed, it is important
to select a package and bonding material which are capable of high-speed operation. In
this work, the package is a 32-pin ceramic FlatPack (FP32) that uses gold bond wires and
supports an operating frequency up to 7GHz. Figures 5.2(a) and 5.2(b) show the packaged
chip and its integration with the PCB, respectively.

(a) TIM ∆Σ-DAC die photo (b) Custom layout section

Figure 5.1: Die photos of the TIM ∆Σ-DAC chip fabricated in 90nm CMOS

(a) Chip package and bonding (b) TIM ∆Σ-DAC test PCB

Figure 5.2: TIM ∆Σ-DAC prototype a) Packaging and b) Testboard

The PCB supports several testing configurations. Firstly, it permits testing with either a
passive or active load. Switches steer the output currents to either a grounded 50Ω resistor
or a resistor in feedback around a differential opamp. Due to the low voltage supply (1V) and
broad bandwidth (250MHz) of the TIM ∆Σ-DAC, it is hard to find a commercial differential
opamp which will not limit the performance of the device under test (DUT). For example,
a differential opamp from Texas Instruments (THS4508) has sufficient gain and bandwidth;
however, its minimum common-mode output voltage is still higher than 1V. Since the DUT’s
outputs are open-drain PMOS devices, having a drain voltage higher than its supply can
damage the entire chip via the forward-biased diode in the N-well. Thus, a passive load of
50Ω is used for all measurements.
Secondly, to allow higher testing flexibility, the PCB is designed to support testing with
either an Agilent 93000 SOC tester or an Agilent 81250 parallel bit-error-rate (ParBert)
tester as the input source. The full test setup is depicted in figure 5.4. Both the 93K SOC
and ParBert testers have the ability to test the entire chip, the digital front-end or the analog
back-end alone. For full-chip testing, the tester sends a 10-bit digital pattern (generated
from Matlab) to the TIM ∆Σ-DAC. A 2-way 180° power combiner is used to convert the
differential outputs into a single-ended analog output. Lastly, a spectrum analyzer is used
to analyze the analog spectrum and to capture data for the Matlab post-processing. Due to
some design issues in the digital front-end, which will be discussed in the next section, the
full chip was not tested. Thus, only test results from the analog back-end and its interface
(i.e.: B2T converter & switch driver) are reported.
To test the analog back-end alone, a VHDL simulation is used to generate the 4-bit dig-
ital output of the TIM-IF-DSM. This data is then multiplexed in Matlab and transferred to
the ParBert, which in turn sends a 4-bit data pattern to the chip’s I/Os. Figures 5.3 and
5.5 show the analog test flow and its experimental setup, respectively.
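The multiplexing step of this test flow can be sketched in Python, standing in for the Matlab script, which is not reproduced here. The function name, the data layout (one list of 4-bit codes per path) and the default read-out order are assumptions for illustration only; the text states only that eight parallel paths are serialized into one 4-bit stream and that the path order matters.

```python
def multiplex_paths(paths, order=None):
    """Interleave M parallel sample streams into one serial stream.

    paths: list of M equal-length lists; paths[k][n] is the n-th 4-bit code of path k.
    order: read-out order of the path indices within each block
           (assumption: forward order 0..M-1 unless another order is given).
    """
    m = len(paths)
    order = list(range(m)) if order is None else list(order)
    if sorted(order) != list(range(m)):
        raise ValueError("order must be a permutation of the path indices")
    length = len(paths[0])
    if any(len(p) != length for p in paths):
        raise ValueError("all paths must have the same length")
    serial = []
    for n in range(length):      # one block per low-rate clock period
        for k in order:          # M high-rate samples per block
            code = paths[k][n]
            if not 0 <= code <= 15:
                raise ValueError("codes must fit in 4 bits")
            serial.append(code)
    return serial


# Hypothetical data: 8 paths with 2 samples each.
paths = [[k, 15 - k] for k in range(8)]
forward = multiplex_paths(paths)
reverse = multiplex_paths(paths, order=reversed(range(8)))
```

Making the read-out order an explicit argument keeps the path order visible in the test script, so a forward/reverse mix-up shows up as a one-argument change rather than a hidden convention.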

Figure 5.3: Analog back-end test flow



Figure 5.4: Full test setup for Agilent 93K SOC or Agilent ParBert platform

Figure 5.5: Experimental setup for analog back-end



5.2. Digital Design Issues and Solutions


Initial testing of the digital front-end of the fabricated chip revealed four errors in the digital
design. Firstly, the time-interleaved paths were multiplexed in the forward order instead
of reverse order (as shown in figure 4.11). Since the digital front-end and analog back-end
were designed using different CAD tools, there was not a full transistor-level schematic for
LVS purposes. The LVS was done separately for each section before final integration; the
full-chip was verified manually, thus resulting in human errors. Also, the “path-reversal”
detail was not well-defined at the transistor-level and was not discovered until later. This
explains why the “path-reversal” was emphasized throughout chapters 3 and 4.
Secondly, the 2’s complement numbers were not converted into unsigned numbers before
thermometer code conversion (as discussed in section 4.2.2). While the TIM-IF-DSM’s
outputs were 2’s complement, the B2T converter design, which was done in a different CAD
tool, used only unsigned test patterns. A simple solution is to add an extra inverter at the
MSB as proposed in section 4.2.2. This error could have been avoided if the full chip had been
designed using the same CAD tool, thus eliminating manual inspections during full chip integration.
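The MSB-inversion fix can be illustrated with a short Python sketch. It models the behaviour only, not the gate-level B2T converter of section 4.2.2; the function names and the 15-element list representation of the thermometer code are assumptions.

```python
def twos_complement_to_unsigned(code, bits=4):
    """Map a 2's complement bit pattern to offset binary by inverting the MSB."""
    return code ^ (1 << (bits - 1))


def binary_to_thermometer(value, bits=4):
    """Return 2**bits - 1 thermometer bits; the lowest `value` bits are 1."""
    n_cells = (1 << bits) - 1
    return [1 if i < value else 0 for i in range(n_cells)]


def signed_to_thermometer(code, bits=4):
    """2's complement pattern -> unsigned (MSB inverted) -> thermometer code."""
    return binary_to_thermometer(twos_complement_to_unsigned(code, bits), bits)


# Sweep the signed range -8..+7; the number of active cells must rise monotonically.
active_cells = []
for signed in range(-8, 8):
    pattern = signed & 0xF                 # 4-bit 2's complement bit pattern
    active_cells.append(sum(signed_to_thermometer(pattern)))
```

Without the inversion, the pattern for -1 (1111) would switch on all 15 cells while the pattern for 0 (0000) would switch on none, so the transfer curve would be discontinuous at mid-scale.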
In addition, two other design errors, which would not cause faulty results but would limit
the TIM ∆Σ-DAC’s performance, appeared in the fabricated chip:

1. The “roundoff error reduction scheme” (section 4.1.2): The fabricated TIM-IF utilized
the “truncation with no rounding” technique, which caused large roundoff errors and
limited the SNR of the entire TIM ∆Σ-DAC.

2. The “unnecessary overflow bit” (section 4.1.5): The fabricated TIM-DSM was over-
designed to have a final sum of 12 bits because a digital limiter was not yet introduced.
Since the 12th bit was mostly ’0’, this resulted in a loss of output amplitude.

Thus, the digital front-end measurements were omitted since its outputs would not contain
meaningful data. However, the simulations in chapter 4 and the analog back-end’s exper-
imental results in this chapter employ a digital design with all of these errors corrected.
Here, the corrected VHDL digital front-end results are imported into Cadence for a full-chip
mixed-signal simulation; this method detects any system integration or design errors.

5.3. High Speed Analog Measurements


This section presents the experimental results for the analog back-end (i.e.: CS-DAC) with
4-bit post-multiplexed, ∆Σ-modulated inputs. Although the analog test setup is intended
for 4GS/s data rate, the available ParBert can only support a data rate up to 2.66GS/s.
Thus, unless mentioned otherwise, all measurements were taken at a sampling rate, fS, of
2.66GS/s; this corresponds to an analog bandwidth, fB, of 166MHz.
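The quoted bandwidths follow from the OSR definition in equation A.5 with OSR = 8, for the measured and the designed sampling rates respectively:

```latex
f_B = \frac{f_S}{2 \cdot \mathrm{OSR}}
    = \frac{2.66\,\mathrm{GS/s}}{2 \cdot 8} \approx 166\,\mathrm{MHz},
\qquad
\frac{4\,\mathrm{GS/s}}{2 \cdot 8} = 250\,\mathrm{MHz}.
```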

5.3.1. Initial Verifications


A few verifications were carried out before measuring the CS-DAC’s dynamic range perfor-
mance. Firstly, it was important to verify that all current cells are operating by sweeping
through all possible digital codes. Figure 5.6(a) shows a staircase transient response of the
CS-DAC for digital inputs ranging from 1111 to 0000. The 16 different voltage levels in
this figure show that all current cells are functional. The differential output swing (Vout)
is around 600mV, which corresponds to an average step size of 40mV (600mV over 15 steps).
While the simulated differential Vout is 1V with typical device models, that simulation did
not account for post-layout parasitics and PVT variations. Simulations at the slow process
corner and 105°C resulted in only 720mV peak-to-peak output swing, as depicted in figure 5.6(b).

(a) Measured staircase transient (b) Simulated staircase transient (SS, 105°C)

Figure 5.6: Current-steering DAC staircase transient response



Secondly, the current calibration circuitry was verified. For a passive load, aside from
an increase of around 100mV in Vout, there was no improvement in accuracy or linearity
regardless of whether the calibration was on or off. Instead, turning the calibration on
introduced calibration feed-through which mixed with the fundamental signal and generated
inband tones. For example, figures 5.7(a) and 5.7(b) show the calibration feed-through tones
for an input at 0.29fB and CLKcalib at 0.4fB and 0.2fB, respectively. In figure 5.7(a), the
second-order intermodulation product (IM2), fsignal + fcalib, shows up at 0.7fB (marker 4). In
figure 5.7(b), the IM2 shows up at 0.49fB (marker 3), as well as its harmonics at markers 2 and 4.
Thus, the calibration circuitry is switched off for all subsequent measurements.
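The location of the feed-through tones can be checked with a few lines of Python. This is a minimal sketch with frequencies normalized to fB; the function name is hypothetical, and the difference product is included only for completeness since the text discusses the sum product.

```python
def second_order_products(f_signal, f_calib):
    """Return the sum and difference frequencies produced by second-order mixing."""
    return {"sum": f_signal + f_calib, "difference": abs(f_signal - f_calib)}


F_SIGNAL = 0.29  # input tone, normalized to fB
for f_calib in (0.4, 0.2):
    products = second_order_products(F_SIGNAL, f_calib)
    print(f"CLKcalib = {f_calib:.2f} fB -> IM2 (sum) at {products['sum']:.2f} fB")
```

The sum products land at 0.69fB (quoted as 0.7fB above) and 0.49fB, both inside the signal band, which is why the feed-through cannot be removed by the reconstruction filter.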

(a) Calibration feed-through for 0.29fB input and CLKcalib = 0.4fB
(b) Calibration feed-through for 0.29fB input and CLKcalib = 0.2fB

Figure 5.7: Output spectrum with calibration feed-through

Lastly, the clock divide-by-8 circuitry was verified even though it was intended to generate
a clock for the digital front-end. Figures 5.8(a) and 5.8(b) show two examples of the clock
divider operating with 2.66GHz and 2.0GHz clock inputs, respectively. The divided clocks
are 332.9MHz and 249.9MHz, implying the clock divider operates correctly.

(a) Divided-by-8 clock for 2.66GHz input (b) Divided-by-8 clock for 2GHz input

Figure 5.8: Clock Divider Transient Response

5.3.2. Accuracy Measurements


Figures 5.9(a) and 5.9(b) depict the CS-DAC transient response for a delta-sigma modulated
single-tone 0dBFS input at 0.13fB and 0.29fB , respectively. Figures 5.10(a)-5.10(d) depict
their noise shaped and inband spectra, as well as the inband harmonics.

(a) Transient response for 0.13fB input (b) Transient response for 0.29fB input

Figure 5.9: CS-DAC transient response for a single-tone, 0dBFS input amplitude (top -
single ended outputs; bottom - differential output)

(a) Noise shaped spectrum for 0.13fB input (b) Inband spectrum for 0.13fB input

(c) Noise shaped spectrum for 0.29fB input (d) Inband spectrum for 0.29fB input

Figure 5.10: Noise shape and inband spectra for a single-tone, 0dBFS input amplitude at
0.13fB and 0.29fB

Figure 5.11 shows the measured and simulated CS-DAC accuracy performance. The
measured SNR and SNDR are almost identical since the dominant noise source is the noise
floor rather than the harmonic distortions. For the measurements versus amplitude, the
input is a single tone at 0.25fB ; for the measurements versus frequency, the input amplitude
is 0dBFS. Figure 5.11(a) shows a peak measured SNR/SNDR of 46dB, which corresponds
to an accuracy of 7.3 bits. The dynamic range is also around 46dB. Figure 5.11(b) shows
a measured accuracy of at least 44dB (7 bits) up to 0.8fB, and 38dB (6 bits) for the entire
bandwidth. Compared to the simulated results (passive load with transistor mismatch),
there is an average discrepancy of 5dB due to unaccounted parasitics and PVT variations.
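The bit-accuracy figures quoted above follow from equation A.7. A minimal Python sketch of the conversion, applied to the three measured values (the function name and the labels are illustrative):

```python
def enob(snr_db):
    """Effective number of bits for a sine-wave input, per equation A.7."""
    return (snr_db - 1.76) / 6.02


measurements = (("peak", 46.0), ("up to 0.8 fB", 44.0), ("full band", 38.0))
for label, snr_db in measurements:
    print(f"{label}: {snr_db:.0f} dB -> {enob(snr_db):.1f} bits")
```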

(a) SNR and SNDR vs. Input amplitude (b) SNR and SNDR vs. Input frequency

Figure 5.11: CS-DAC accuracy performance with single-tone input and passive load

5.3.3. Linearity Measurements


A two-tone test is used to measure the CS-DAC linearity (SFDR) performance, in which
the input contains two tones at f1 and f2. The third-order intermodulation (IM3) products
are located at 2f1 − f2 and 2f2 − f1. The SFDR is measured as the amplitude difference
between the output tones at f1 and f2 and their IM3s. In this test, the two input tones have
an amplitude of -6dBFS and are separated by 0.004fB or 0.665MHz. Figures 5.12(a) and
5.12(b) show the two-tone test spectra for input tones near 0.25fB and 0.93fB; their SFDR
measurements are 56.3dB and 55.4dB, respectively. Figure 5.12(c) shows the SFDR around
55.4dB up to 0.93fB input frequency; this corresponds to a linearity of 8.9 bits.
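The tone and IM3 positions for this test can be worked out as follows. The sketch assumes the two tones are placed symmetrically around the nominal frequency, which the text does not state explicitly; fB is taken as fS/16 with fS = 2.66GS/s, and the function names are hypothetical.

```python
F_B_MHZ = 2660.0 / 16  # fB = fS / (2 * OSR) with fS = 2.66 GS/s and OSR = 8


def two_tone(center_rel, spacing_rel=0.004, f_b=F_B_MHZ):
    """Two tones placed symmetrically around center_rel * f_b (assumed placement)."""
    f1 = (center_rel - spacing_rel / 2) * f_b
    f2 = (center_rel + spacing_rel / 2) * f_b
    return f1, f2


def im3_products(f1, f2):
    """Third-order intermodulation products at 2*f1 - f2 and 2*f2 - f1."""
    return 2 * f1 - f2, 2 * f2 - f1


f1, f2 = two_tone(0.25)
low, high = im3_products(f1, f2)
print(f"tones: {f1:.3f} MHz and {f2:.3f} MHz (spacing {f2 - f1:.3f} MHz)")
print(f"IM3:   {low:.3f} MHz and {high:.3f} MHz")
```

Each IM3 product sits one tone spacing (0.665MHz) outside the tone pair, so the products stay in band and cannot be filtered out.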

(a) Two-tone spectrum near 0.25fB (b) Two-tone spectrum near 0.93fB

(c) SFDR vs. Input frequency

Figure 5.12: Two-tone spectrum and SFDR measurements



Another linearity measurement is the “missing-tone” test, in which the inband spectrum
contains multiple equally-spaced tones with the middle one left empty. The
intermodulation products of these tones will be concentrated at the empty bin, causing the
“missing-tone” to appear. The amplitude difference between the input signal tones and the
“missing-tone” is called the Multi-tone Power Ratio (MTPR), which reflects the system lin-
earity. This test is particularly relevant for systems employing OFDM since the transmitted
spectrum consists of many sub-channels at equally-spaced frequencies.
This experiment uses 128 tones (sub-channels) based on a UWB standard from [6], in
which the 64th tone is left empty. For a bandwidth of 166MHz, this corresponds to a
sub-channel spacing of 1.3MHz. Figures 5.13(a) and 5.13(b) show the multi-tone noise shaped
spectrum and the MTPR measurement, respectively. The measured MTPR is 38dB.
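A small Python sketch of the tone plan and the MTPR ratio. The tone placement at integer multiples of the spacing and the averaging over tone powers are assumptions; the text only specifies 128 equally-spaced tones across 166MHz with the 64th left empty.

```python
import math

F_B_MHZ = 166.0
N_TONES = 128
MISSING = 64  # 1-based index of the empty tone

spacing = F_B_MHZ / N_TONES
# Assumed placement: tone k sits at k * spacing, for k = 1..128.
active = [k * spacing for k in range(1, N_TONES + 1) if k != MISSING]
missing_freq = MISSING * spacing


def mtpr_db(mean_tone_power, missing_bin_power):
    """Multi-tone power ratio in dB from linear power values."""
    return 10 * math.log10(mean_tone_power / missing_bin_power)


print(f"spacing = {spacing:.2f} MHz, empty bin at {missing_freq:.1f} MHz")
```

On this definition, the measured MTPR of 38dB means the distortion collected in the empty bin is about 6300 times weaker in power than an average signal tone.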

(a) Multi-tone noise shaped spectrum (b) Multi-tone power ratio measurement

Figure 5.13: Multi-tone Test



5.3.4. Power Consumption


Table 5.1 shows the simulated and measured power consumption for the TIM ∆Σ-DAC at
1V supply. In practice, the digital front-end consumed much more power than that predicted
in synthesis, even at half the operating speed. This discrepancy indicates that the power
estimation of the digital CAD tools is overly optimistic and inaccurate. On the other hand,
the analog back-end consumed less power than in simulation; this translates to the loss of
output swing Vout discussed earlier. This could be the result of inaccuracies in the 90nm
CMOS corner models; in particular, the ST 90nm CMOS version 1.0 design kit was still not
well defined when the chip was designed.

Table 5.1: TIM ∆Σ-DAC Power Consumption


Circuit Block              Simulated Power                Measured Power
                           mW     %     fsampling         mW     %     fsampling

Digital Front-end          51     43    500 MS/s          75     70    250 MS/s
High-speed Interface       38     32    4 GS/s            14     13    2.66 GS/s
Analog Back-end            24     20    4 GS/s            15     14    2.66 GS/s
I/Os                       7      6     4 GS/s            3      3     2.66 GS/s

Total (mW) @ 1V Supply     120                            107

(The % columns give the power distribution among the circuit blocks.)

The measured power consumption is 107mW, of which 32mW is due to the analog prototype
(high-speed interface, analog back-end and I/Os) sampled at 2.66GS/s and 75mW is due to
the digital front-end sampled at 250MS/s. The digital front-end was tested at 250MS/s
instead of 333MS/s due to the speed limitation of the Agilent 93K SOC tester. The measured
power consumption was 102mW when the digital front-end was sampled at 250MS/s while the
rest of the chip was sampled at 2GS/s. Overall, the TIM ∆Σ-DAC power distribution shows
that the digital front-end consumes the most power since it contains a large amount of
digital circuitry and computation volume.
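The measured totals and percentage shares in Table 5.1 can be cross-checked with a few lines of Python. The dictionary below simply restates the measured column of the table; the variable names are illustrative.

```python
# Measured power per block in mW, copied from Table 5.1.
measured_mw = {
    "Digital Front-end": 75,
    "High-speed Interface": 14,
    "Analog Back-end": 15,
    "I/Os": 3,
}

total = sum(measured_mw.values())
shares = {block: round(100 * p / total) for block, p in measured_mw.items()}
analog_prototype = total - measured_mw["Digital Front-end"]

for block, share in shares.items():
    print(f"{block}: {measured_mw[block]} mW ({share}%)")
print(f"total = {total} mW, analog prototype = {analog_prototype} mW")
```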

5.3.5. Performance Summary


Table 5.2 summarizes and compares the TIM ∆Σ-DAC simulated versus measured perfor-
mance. In both cases, the digital front-end’s VHDL behavioural simulation results are used
as inputs to the analog back-end. The simulated results are for the active or passive load
in typical corner with and without transistor mismatch (i.e.: Mis and Typ). The measured
results are for the passive load only.

Table 5.2: TIM ∆Σ-DAC Performance Summary


Parameter                   Simulated                            Measured        Units
                            Active load       Passive load       Passive load
                            Typ.    Mis.      Typ.    Mis.

Peak SNR                    62      54        57      50         46              dB
Peak SNDR                   60      52        55      48         46              dB
Dynamic Range               63      56        58      52         46              dB
Peak SFDR                   -       -         -       -          56              dB
MTPR (128 tones)            -       -         -       -          38              dB
Bandwidth (fB)              250                                  166             MHz
Sampling Rate (fS)          4                                    2.66            GS/s
Oversampling Ratio (OSR)    8                                    8               -
Supply Voltage              1                                    1               V
Power                       120                                  107             mW
Area                        1.52mm × 1.52mm
Process Technology          STMicroelectronics 90nm CMOS, 7M2T
Chapter 6
Conclusions

In conclusion, this thesis presents the analysis and design of a time-interleaved delta-sigma
digital-to-analog converter (TIM ∆Σ-DAC). The digital front-end of the TIM ∆Σ-DAC
comprises a 95th-order time-interleaved-by-8 FIR interpolation filter (TIM-IF) and a
3rd-order, 4-bit, time-interleaved-by-8 ∆Σ modulator (TIM-DSM). The analog back-end of the
TIM ∆Σ-DAC comprises a 4-bit current-steering DAC with continuous current calibration.
The high-speed digital interface between these two domains comprises an 8-to-1 ring
multiplexer, a binary-to-thermometer converter, and 15 switch drivers.

The time-interleaved architecture uses parallelism based on block digital filtering to sup-
port a low OSR of 8; this results in a large effective bandwidth for broadband applications.
The TIM-DSM utilizes an error-feedback architecture with optimized NTF zero to improve
SNR performance. The digital front-end (TIM-IF-DSM) implementation uses CSD repre-
sentation with rounding scheme for minimum round-off errors, and parallel CSA adders with
optimized staging for minimum propagation delays.

The eight parallel outputs of the TIM-IF-DSM are serialized into a single 4-bit stream
through an 8-to-1 ring multiplexer. These bits are converted into thermometer codes and then
into an analog signal using 15 current-steering cells. An additional dummy current-steering
cell is used to allow continuous current calibration. The differential analog outputs are
open-drain, which gives the flexibility of having either a passive or an active output load.

The TIM ∆Σ-DAC was designed to operate at 4GS/s with a bandwidth of 250MHz.
The simulation results show a peak SNR of 62dB and 57dB for active and passive load with
no transistor mismatch, respectively; the peak SNRs are 54dB and 50dB, with transistor
mismatch.
The chip was fabricated in STMicroelectronics 90nm CMOS. The analog back-end was
tested with modulated data from VHDL simulation of the digital front-end. It was measured
at 2.66GS/s and achieved a bandwidth of 166MHz, an SNR of 46dB and an SFDR of 56dB.
At 2GS/s, the prototype consumed 102mW from a 1V supply.
Table 6.1 briefly compares the performance of this work with the prior state-of-the-art
designs which utilize parallelism in ∆Σ modulation (either time-division multiplexing, TDM,
or time-interleaving, TIM).

Table 6.1: TIM ∆Σ-DAC Performance Comparisons


Ref                ∆Σ-DAC Architecture          fS      fB      SNR    SFDR   Power   Process / VDD /
                                                (MHz)   (MHz)   (dB)   (dB)   (mW)    Test Results

[25]               TDM2, OFB, OSR = 6,          350     29.16   73.4   76     62      0.13µm CMOS / 1.5V /
Clara              3rd-order DSM, 6-bit DAC                                           Measured

[26]               TIM2, EFB, OSR = 8,          352.8   22.05   -      -      -       FPGA /
Khoini-Poorfard    2nd-order DSM, 1-bit DAC                                           Simulated

[27]               TIM4, MASH, OSR = 6,         640     40      73     87     -       0.18µm CMOS / 1.8V /
Choi               2nd-order DSM, 6-bit DAC                                           Simulated

This Work          TIM8, EFB, OSR = 8,          2660    166     46     56     107     90nm CMOS / 1.0V /
                   3rd-order DSM, 4-bit DAC                                           Simulated & Measured

6.1. Future Work


Further work can be done on this design. First of all, the digital front-end design
issues discussed in chapter 5 must be corrected before re-fabrication. The challenge of
digital round-off errors would still exist; however, its effect on the overall performance can
be minimized through clever and efficient rounding schemes. There is still ongoing research to
minimize the accumulation of round-off errors during digital arithmetic operations.
Secondly, the current-calibration circuitry needs further design to improve the TIM ∆Σ-
DAC’s linearity. Also, the current-steering DAC’s output resistance can be improved to
reduce its sensitivity to the number of active current cells under a passive load. However,
the issues of deep sub-micron CMOS and low power supply may also limit this design choice.
Thirdly, since this design requires a large amount of hardware integration, extensive
post-layout simulations, together with PVT variations and transistor mismatch, will give a
better estimate of the actual performance. Furthermore, a full-chip transistor-level simulation,
which comprises a placed-and-routed digital front-end and a custom-layout analog back-end,
would yield a higher level of design confidence.
Lastly, the idea of parallelism has been widely used in ∆Σ-ADC, yet it is almost forgotten
in ∆Σ-DAC. A time-interleaved ∆Σ-DAC is quite promising for future broadband applica-
tions which demand high bandwidth and high data rate. The potential of TIM ∆Σ-DAC is
certainly an area of research yet to be fully explored.
Appendix A: Conventional ∆Σ Modulator

The function of a digital ∆Σ modulator (DSM) is to reduce the word-length of the input
signal to a few bits without affecting its in-band spectrum. Since the reduction in word-length
introduces a large truncation error, the modulator must push this added noise outside the
band of interest, hence the term “noise shaping”.
The conventional first-order single-bit DSM is shown in figure A.1. It contains three
main components: the digital loop filter H(z) (i.e.: Σ), the bit truncator T, and the feedback
delay & subtractor (i.e.: ∆). Although this system is highly non-linear, a simple linear model
in the z-domain can be used to analyze its operation. Since the main noise component is
generated by the truncator T, its linear model is represented by an additive noise source,
E(z).

Figure A.1: Linear model of first-order ∆Σ modulator

From figure A.1, the input and output of a first-order DSM can be related as follows:

$$V(z) = U(z) + (1 - z^{-1})\,E(z) \tag{A.1}$$

Equation A.1 can be written in the general form,

$$V(z) = STF(z)\,U(z) + NTF(z)\,E(z) \tag{A.2}$$


where the signal transfer function is $STF(z) = 1$ and the noise transfer function is
$NTF(z) = (1 - z^{-1})$. Here, the signal is an exact replica of the input while the truncation
noise is shaped by a high-pass response (which suppresses the noise near DC and amplifies the
out-of-band noise). For an nth-order lowpass DSM, the system transfer function is:

$$V(z) = U(z) + (1 - z^{-1})^{n}\,E(z) \tag{A.3}$$

in which $NTF(z) = (1 - z^{-1})^{n}$.
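Equation A.1 can be checked numerically with a short behavioural model. The sketch below uses an error-feedback form with a 1-bit (+1/-1) truncator rather than the loop-filter structure of figure A.1; both give STF(z) = 1 and NTF(z) = (1 - z^-1), but the structure, function name and test input here are assumptions chosen for brevity.

```python
def first_order_dsm(u):
    """Error-feedback first-order modulator with a 1-bit (+1/-1) truncator.

    Each output satisfies v[n] = u[n] + e[n] - e[n-1],
    i.e. V(z) = U(z) + (1 - z^-1) E(z).
    """
    v, e_prev = [], 0.0
    for sample in u:
        y = sample - e_prev             # subtract the previous truncation error
        out = 1.0 if y >= 0 else -1.0   # 1-bit truncator
        e_prev = out - y                # truncation error e[n] = v[n] - y[n]
        v.append(out)
    return v


N = 4096
DC = 0.3                                # DC input, must stay within [-1, 1]
bits = first_order_dsm([DC] * N)
mean = sum(bits) / N
print(f"input = {DC}, mean of 1-bit output = {mean:.4f}")
```

Because the error terms telescope when the output is summed, the average of the 1-bit stream differs from the input by at most 1/N: the truncation noise has a zero at DC, which is the noise shaping described above.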
If the input signal is a full-scale sine wave with peak amplitude A and the truncation
error is assumed to be uniformly distributed, the signal to noise ratio (SNR) for a 1st-order
DSM can be approximated as [1]:

$$SNR = \frac{9 A^{2} (OSR)^{3}}{2 \pi^{2}} \tag{A.4}$$

In equation A.4, the OSR is the oversampling ratio which defines how fast the system is
oversampled with respect to the Nyquist-rate. It is the ratio between the system sampling
frequency, fS, and twice the signal bandwidth, fB (i.e.: the Nyquist sampling frequency).

$$OSR = \frac{f_S}{2 f_B} \tag{A.5}$$

The resolution of a data converter is often specified by its effective number of bits (ENOB)
which is related to the output SNR (in dB) with a sine-wave input by the following equation:

$$SNR = 6.02\,ENOB + 1.76 \tag{A.6}$$

$$\Rightarrow ENOB = \frac{SNR - 1.76}{6.02} \tag{A.7}$$

In a ∆Σ-DAC, the SNR can be controlled by three main parameters: the OSR, the order
of H(z), and the number of truncator bits. Increasing any of these parameters will increase
the SNR which directly translates to an improvement in ENOB. However, there are always
trade-offs between resolution, speed, power consumption, and design complexity.
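The dependence of SNR on OSR in equations A.4 and A.5 can be made concrete with a short Python sketch. It applies the first-order approximation only, so the absolute values do not describe the 3rd-order, 4-bit modulator of this thesis; the function names are illustrative.

```python
import math


def osr(f_s, f_b):
    """Oversampling ratio, equation A.5."""
    return f_s / (2 * f_b)


def snr_first_order_db(amplitude, oversampling_ratio):
    """First-order DSM SNR approximation of equation A.4, expressed in dB."""
    ratio = 9 * amplitude ** 2 * oversampling_ratio ** 3 / (2 * math.pi ** 2)
    return 10 * math.log10(ratio)


ratio = osr(4e9, 250e6)  # designed sampling rate and bandwidth
gain_per_doubling = snr_first_order_db(1.0, 2 * ratio) - snr_first_order_db(1.0, ratio)
print(f"OSR = {ratio:.0f}, SNR gain per OSR doubling = {gain_per_doubling:.2f} dB")
```

The cubic OSR term means each doubling of the OSR buys about 9dB (1.5 bits) for a first-order modulator, which is the trade-off against speed mentioned above.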
Appendix B: TIM ∆Σ-DAC Matlab Results

B.1. Analog Reconstruction Filter


From figure 3.2, since the -3dB bandwidth is limited to 235MHz by the digital IF, the pass-
band requirement for this analog LPF should also be around this frequency. Furthermore,
from figure 3.9, the out-of-band truncation noise should be attenuated by at least 50dB so
that the final spectrum is at the same level as the noise floor (around -100dBFS).
According to ([37], Ch. 14), the order of this analog filter should be at least one order
higher than that of the ∆Σ modulator (i.e.: ≥ 4). If the analog filter has the same order as
that of the DSM, the slope of the rising truncation noise matches up with the filter’s falling
attenuation, resulting in a constant spectral density all the way up to fS /2. In addition, this
filter must be able to strongly attenuate the high-frequency truncation noise concentrated
around fS /2.
Based on these requirements, an analog LPF can be designed using Matlab. An elliptic
filter is chosen due to its high attenuation rate and low-ripple passband response. Also, an
odd-order elliptic filter has an advantage over an even-order one due to its deep notch at
fS/2, which is desirable in this design. The elliptic filter design details are summarized in
table B.1 and its responses are shown in figure B.1.
Figure B.2 shows the TIM-IF-DSM output spectrum together with the analog LPF at
0dBFS input amplitude and different input frequencies, ranging from 0.13fB to 0.93fB, where
$f_B = \frac{1}{16} f_S = 250$MHz.

Figure B.3 shows the TIM-IF-DSM response for the system with an ideal “brick-wall”
filter and for the system with an analog LPF. Compared to the ideal filter, the analog LPF
results in about 2.3dB and 1.2dB degradation in SNR and SNDR, respectively, as depicted
in figure B.3(a). This degradation is quite acceptable since the full TIM ∆Σ-DAC, including
the analog filter, still yields about 9 bits accuracy up to 0.93fB, as depicted in figure B.3(b).

Table B.1: Analog Low-pass Filter Characteristics

              Parameter               Description

Design        Filter Type             Elliptic
              Filter Order            7
              Passband Frequency      230 MHz
              Stopband Frequency      240 MHz
              Passband Ripple         0.5 dB
              Stopband Attenuation    55 dB

Performance   -3dB Bandwidth          235 MHz
              Passband Ripple         0.2 dB
              Stopband Attenuation    ≥ 55 dB

(a) Frequency Response (b) Passband Ripple

Figure B.1: A 7th-order elliptic analog filter response

(a) (b)

(c) (d)

Figure B.2: TIM ∆Σ-DAC output spectrum with analog LPF for Matlab simulations with
0dBFS input amplitude at different input frequencies a) 0.13fB b) 0.25fB c) 0.50fB d) 0.93fB

(a) SNR and SNDR vs. Input amplitude (b) SNR and SNDR vs. Input frequency

Figure B.3: TIM ∆Σ-DAC response with an ideal vs. analog filter

B.2. TIM-IF-DSM Output Spectrum with DAC Mismatches

(a) 1% DAC element mismatches (b) 2% DAC element mismatches

(c) 3% DAC element mismatches (d) 4% DAC element mismatches

Figure B.4: TIM-IF-DSM output spectrum with thermometer DAC element mismatches
Appendix C: TIM ∆Σ-DAC Implementation

C.1. TIM-IF Coefficients

Table C.1: A 95th -order Time-interleaved-by-8 Interpolation Filter Coefficients


C.2. TIM-IF Sum Trees

Figure C.1: TIM-IF sum tree for path 3 and 7



Figure C.2: TIM-IF sum tree for path 4 and 6

Figure C.3: TIM-IF sum tree for path 5



C.3. TIM-IF and TIM-DSM Timing Synthesis

Table C.2: TIM-IF Synthesized Performance


Description          Propagation Delay (ns)    Timing Margin (ns)

Sum tree 2 or 8      1.66                      0.34
Sum tree 3 or 7      1.69                      0.31
Sum tree 4 or 6      1.63                      0.37
Sum tree 5           1.54                      0.46

Path 1               0.44                      1.56
Path 2 or 8          1.91                      0.09
Path 3 or 7          1.91                      0.09
Path 4 or 6          1.91                      0.09
Path 5               1.89                      0.11

Table C.3: TIM-DSM Synthesized Performance


Description          Propagation Delay (ns)    Timing Margin (ns)

Sum Tree Only        1.61                      0.39

Path 1               1.87                      0.13
Path 2               1.74                      0.26
Path 3               1.74                      0.26
Path 4               1.75                      0.25
Path 5               1.74                      0.26
Path 6               1.74                      0.26
Path 7               1.74                      0.26
Path 8               1.73                      0.27

C.4. Binary-to-Thermometer Converter and Switch Drivers


Figure C.4 depicts the binary-to-thermometer converter schematic with gate re-use and
signed-to-unsigned number conversion (by adding an extra inverter for bit V<3>).

Figure C.4: Binary-to-thermometer schematic

Figure C.5 depicts the schematic of a switch driver and a DFF. The DFF’s purpose is
to sample/re-time the thermometer codes at 4GS/s to ensure their proper timing alignment.
The latch between the output data path (Do) and its complement ($\overline{Do}$) is used to
align their edge intersections to half-swings. Lastly, the additional transmission gate on the
Do path is used for propagation delay matching.

Figure C.5: Switch driver schematic

C.5. Current Calibration Principles


The calibration technique works based on charge storage on the gate-source capacitance
(Cgs ) of CMOS transistors. It uses the same reference current (Iref ) to calibrate all current
cells. The current value of each cell does not need to be the same as Iref but needs to
accurately match the other cells [37]. Figure C.6 shows the calibration principle for one of
the current cells [40].

Figure C.6: Calibration principle a) Calibration b) Operation



The switches S1 and S2 are in the states depicted in figures C.6(a) and C.6(b) for
the calibration and operation phases, respectively. During calibration, S1 puts the MOS
transistor M1 into saturation due to its diode connection while S2 allows Iref to flow into
M1. This forces the gate-source voltage (Vgs) and the charge on the parasitic capacitance
Cgs of M1 to whatever value is required so that its drain current, Ids, equals Iref. During the
operation phase, although S1 is opened, Vgs is theoretically unchanged since the charge on
Cgs is preserved. This allows S2 to source approximately the same current, Iref, from the
output.
In a practical implementation, S1 and S2 are made of MOS transistors. Whenever S1
switches off, its channel charge is partly dumped on to the gate of M1 (called “charge-
injection”), causing the charge on Cgs to decrease by the same amount. This results in a
sudden decrease of Vgs . In addition, another effect causes Vgs to decrease. Although S1 is
off, the reverse-biased diode between its source and substrate is still present, causing Vgs to
decrease gradually due to leakage current [40].
The reduction in Vgs , due to charge-injection (∆q) and leakage current (Ileak ), causes Ids
to decrease as a function of time according to the following calculations [40]:

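$$I_{ds}(t) = I_{ref} - g_m \frac{\Delta q}{C_{gs}} - g_m \frac{I_{leak}}{C_{gs}}\,t \tag{C.1}$$

where $C_{gs} = \frac{2}{3} W L C_{ox}$ and $g_m = \sqrt{2 \mu C_{ox} \frac{W}{L} I_{ds}}$.

Thus, equation C.1 can be rewritten as:

$$I_{ds}(t) = I_{ref} - \frac{g_m}{C_{gs}} (\Delta q + I_{leak} t) = I_{ref} - \frac{3}{2L} \sqrt{\frac{2 \mu I_{ds}}{C_{ox} W L}} \cdot (\Delta q + I_{leak} t) \tag{C.2}$$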

Equation C.2 indicates that after a certain time Tc , the cell needs to be re-calibrated to
maintain its output current with a specified accuracy.
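The re-calibration interval Tc can be estimated by solving equation C.1 for the time at which the current droop reaches a given tolerance. The Python sketch below treats gm as constant over the small droop, and every numerical value in it is hypothetical, chosen only to exercise the formula rather than taken from this design.

```python
def ids_at(t, i_ref, gm, c_gs, delta_q, i_leak):
    """Drain current after time t in the operation phase, per equation C.1."""
    return i_ref - (gm / c_gs) * (delta_q + i_leak * t)


def recalibration_time(i_ref, gm, c_gs, delta_q, i_leak, tolerance):
    """Time until Ids has dropped by tolerance * Iref (gm assumed constant).

    Returns 0.0 if charge injection alone already uses up the error budget.
    """
    allowed_charge = tolerance * i_ref * c_gs / gm  # gate charge loss within budget
    remaining = allowed_charge - delta_q            # part left for leakage
    return max(remaining, 0.0) / i_leak


# Hypothetical values (not from the thesis): 100 uA cell, 0.5% tolerance.
params = dict(i_ref=100e-6, gm=1e-3, c_gs=100e-15, delta_q=0.01e-15, i_leak=1e-12)
t_c = recalibration_time(tolerance=0.005, **params)
print(f"Tc = {t_c * 1e6:.1f} us, Ids(Tc) = {ids_at(t_c, **params) * 1e6:.2f} uA")
```

The split of the error budget shows the design lever: charge injection sets a fixed initial error, and only the remaining budget is available to leakage, so reducing either term lengthens the interval between calibrations.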
References

[1] Richard Schreier and Gabor C. Temes, Understanding Delta-Sigma Data Converters.
Hoboken, New Jersey, USA: John Wiley & Sons, Inc, 2005.

[2] R. Khoini-Poorfard, Analysis methods and time-interleaved architectures for
oversampling modulators. PhD thesis, University of Toronto, Edward S. Rogers Sr. Department
of Electrical and Computer Engineering, 1995.

[3] Danijela Cabric, Mike S.W. Chen, David A. Sobel, Jing Yang and Robert W. Broder-
sen, “Future wireless systems: UWB, 60GHz, and Cognitive radios,” in IEEE Custom
Integrated Circuits Conference, CICC, pp. 793–796, September 2005.

[4] M. de Courville, S. Zeisberg, M. Muck, and J. Schonthier, “BroadWay - the Way to
Broadband access at 60GHz,” in International Conference on Telecommunication, June
2002.

[5] D. Dardari and V. Tralli, “High-speed indoor wireless communications at 60 GHz with
coded OFDM,” IEEE Transactions on Communications, vol. 47, no. 11, pp. 1709–1721,
November 1999.

[6] B. Razavi, T. Aytur, C. Lam, F. Yang, K. Li, R. Yan, and H. Kang, “A UWB CMOS
transceiver,” IEEE Journal of Solid-State Circuits, vol. 40, no. 12, pp. 2555–2562, De-
cember 2005.

[7] J. Balakrishnan, A. Batra, and A. Dabak, “A multi-band OFDM system for UWB
communication,” in IEEE Conference on Ultra Wideband Systems and Technologies,
pp. 354–358, 2003.

[8] P. Smulders, “60 GHz radio: prospects and future directions,” in Proceedings of IEEE
10th Symposium on Communications and Vehicular Technology, pp. 1–8, November
2003.

[9] T. C. Chen, “Where CMOS is going: trendy hype vs. real technology,” in IEEE Inter-
national Solid-State Circuits Conference, ISSCC, pp. 1–18, February 2006.

[10] Fu-Liang Yang, Jiunn-Ren Hwang, and Yiming Li, “Electrical characteristic fluctuations
in sub-45nm CMOS devices,” in IEEE Custom Integrated Circuits Conference, CICC,
vol. 1, pp. 691–694, 2006.

[11] Jing Cao, Haiqing Lin, Yihai Xiang, Chungpao Kao, and Ken Dyer, “A 10-bit
1GSample/s DAC in 90nm CMOS for embedded applications,” in IEEE Custom Integrated
Circuits Conference, CICC, vol. 1, pp. 165–168, 2006.

[12] K. Doris, J. Briaire, D. Leenaerts, M. Vertregt, and A. van Roermund, “A 12b 500MS/s
DAC with >70dB SFDR up to 120MHz in 0.18µm CMOS,” in IEEE International Solid-
State Circuits Conference, ISSCC, pp. 116–117, 588, February 2005.

[13] Anne Van den Bosch, Marc A. F. Borremans, Michel S. J. Steyaert, and Willy Sansen,
“A 10-bit 1-GSample/s Nyquist current-steering CMOS D/A converter,” IEEE Journal
of Solid-State Circuits, vol. 36, no. 3, pp. 315–324, March 2001.

[14] Chi-Hung Lin and Klaas Bult, “A 10-b, 500-Msample/s CMOS DAC in 0.6 mm²,” IEEE
Journal of Solid-State Circuits, vol. 33, no. 12, pp. 1948–1958, December 1998.

[15] David B. Barkin, Andrew C.Y. Lin, David K. Su, and Bruce A. Wooley, “A CMOS
oversampling bandpass cascaded D/A Converter with digital FIR and current-mode
semi-digital filtering,” IEEE Journal of Solid-State Circuits, vol. 39, no. 4, pp. 585–593,
April 2004.

[16] Todd S. Kaplan, Joseph F. Jensen, Charles H. Fields, and M. Frank Chang, “A 2-Gs/s
3-bit ∆Σ-Modulated DAC with tunable bandpass mismatch shaping,” IEEE Journal of
Solid-State Circuits, vol. 40, no. 3, pp. 603–610, March 2005.

[17] Susan Luschas, Richard Schreier, and Hae-Seung Lee, “Radio frequency digital-to-
analog converter,” IEEE Journal of Solid-State Circuits, vol. 39, no. 9, pp. 1462–1467,
September 2004.

[18] Fred Harris, and Pranesh Sinha, “On synthesizing high speed sigma-delta DACs by
combining the outputs of multiple low speed sigma-delta DACs,” in IEEE Conference
on Signals, Systems and Computers, vol. 2, pp. 1050–1054, November 2002.

[19] Richard Schreier, “Quadrature mismatch-shaping,” in IEEE International Symposium
on Circuits and Systems, ISCAS, pp. 675–678, May 2002.

[20] R. Cormier Jr, T. Sculley, and R. Bamberger, “Combining sub-band decomposition
and sigma-delta modulation for wide-band A/D conversion,” in IEEE International
Symposium on Circuits and Systems, ISCAS, pp. 357–360, June 1994.

[21] Mucahit Kozak, and Izzet Kale, “Novel topologies for time-interleaved Delta-Sigma
modulators,” IEEE Transactions on Circuits and Systems-II: Analog and Digital Signal
Processing, vol. 47, no. 7, pp. 639–654, July 2000.

[22] Ian Galton, and Henrik T. Jensen, “Oversampling parallel Delta-Sigma modulator A/D
conversion,” IEEE Transactions on Circuits and Systems - II: Analog and Digital Signal
Processing, vol. 43, no. 12, pp. 801–810, December 1996.

[23] Ramin Khoini-Poorfard, Lysander B. Lim, and David A. Johns, “Time-interleaved over-
sampling A/D converters: Theory and Practice,” IEEE Transactions on Circuits and
Systems-II: Analog and Digital Signal Processing, vol. 44, no. 8, pp. 634–645, August
1997.

[24] Katayoun Falakshahi, Chih-Kong Ken Yang, and Bruce A. Wooley, “A 14-bit, 10-
Msamples/s D/A converter using multibit ∆Σ modulation,” IEEE Journal of Solid-
State Circuits, vol. 34, no. 5, pp. 607–615, May 1999.

[25] Martin Clara, Wolfgang Klatzer, Andreas Wiesbauer, and Dietmar Straeussnigg, “A
350MHz low-OSR ∆Σ current-steering DAC with active termination in 0.13 µm
CMOS,” in IEEE International Solid-State Circuits Conference, ISSCC, pp. 118–119,
588, 2005.

[26] Ramin Khoini-Poorfard, and David A. Johns, “Mismatch effects in time-interleaved
oversampling converters,” in IEEE International Symposium on Circuits and Systems,
ISCAS, pp. 429–432, 1994.

[27] Yunyoung Choi, and Franco Maloberti, “Design of oversampling current steering DAC
with 640MHz equivalent clock frequency,” in IEEE International Symposium on Circuits
and Systems, ISCAS, vol. 1, pp. 109–112, May 2002.

[28] Tao Shui, R. Schreier, and F. Hudson, “Mismatch shaping for a current-mode multibit
Delta-Sigma DAC,” IEEE Journal of Solid-State Circuits, vol. 34, no. 3, pp. 331–338,
March 1999.

[29] I. Fujimori, A. Nogi, and T. Sugimoto, “A multibit Delta-Sigma audio DAC with 120-
dB dynamic range,” IEEE Journal of Solid-State Circuits, vol. 35, no. 8, pp. 1066–1073,
August 2000.

[30] M. Annovazzi, V. Colonna, G. Gandolfi, F. Stefani, and A. Baschirotto, “A low-power
98-dB multibit audio DAC in a standard 3.3-V 0.35-µm CMOS technology,” IEEE
Journal of Solid-State Circuits, vol. 37, no. 7, pp. 825–834, July 2002.

[31] T. Hamasaki, Y. Shinohara, H. Terasawa, K. Ochiai, M. Hiraoka, and H. Kanayama,
“A 3-V, 22-mW multibit current-mode Σ∆ DAC with 100dB dynamic range,” IEEE
Journal of Solid-State Circuits, vol. 31, no. 12, pp. 1888–1894, December 1996.

[32] P. Naus, E. Dijkmans, E. Stikvoort, A. McKnight, D. Holland, and W. Brandinal,
“A CMOS stereo 16-bit D/A converter for digital audio,” IEEE Journal of Solid-State
Circuits, vol. 22, no. 3, pp. 390–395, June 1987.

[33] Peter Kiss, Jesus Arias, Dandan Li, and Vito Boccuzzi, “Stable high-order Delta-Sigma
digital-to-analog converters,” IEEE Transactions on Circuits and Systems-I: Regular
Papers, vol. 51, no. 1, pp. 200–205, January 2004.

[34] P. P. Vaidyanathan, Multirate Systems and Filter Banks. Englewood Cliffs, New Jersey,
USA: PTR Prentice Hall, Inc, 1993.

[35] Mucahit Kozak, Mustafa Karaman, and Izzet Kale, “Efficient architectures for time-
interleaved oversampling Delta-Sigma converters,” IEEE Transactions on Circuits and
Systems-II: Analog and Digital Signal Processing, vol. 47, no. 8, pp. 802–810, August
2000.

[36] D. P. Scholnik, “A parallel digital architecture for delta-sigma modulation,” in IEEE
International Midwest Symposium on Circuits and Systems, vol. 1, pp. 352–355, August
2002.

[37] David A. Johns and Ken Martin, Analog Integrated Circuit Design. Toronto, Canada:
John Wiley & Sons, Inc, 1997.

[38] R. Schreier, “Delta-Sigma Toolbox Version 6.0,” 2003.

[39] Jared Welz, and Ian Galton, “Necessary and sufficient conditions for mismatch shaping
in a general class of multibit DACs,” IEEE Transactions on Circuits and Systems-II:
Analog and Digital Signal Processing, vol. 49, no. 12, pp. 748–759, December 2002.

[40] D. Wouter J. Groeneveld, Hans J. Schouwenaars, Henk A. H. Termeer, and Cornelis A.
A. Bastiaansen, “A self-calibration technique for monolithic high-resolution D/A
converters,” IEEE Journal of Solid-State Circuits, vol. 24, no. 6, pp. 1517–1522, December
1989.

[41] D.A. Parker and K.K. Parhi, “Area-efficient parallel FIR digital filter implementations,”
in IEEE International Conference on Application-Specific Systems, Architectures and
Processors, pp. 93–111, August 1996.

[42] Bede Liu, “Effect of finite word length on the accuracy of digital filters - A Review,”
IEEE Transactions on Circuits Theory, vol. 18, no. 6, pp. 670–677, November 1971.

[43] Henry Samueli, “An improved search algorithm for the design of multiplierless FIR
filters with powers-of-two coefficients,” IEEE Transactions on Circuits and Systems,
vol. 36, no. 7, pp. 1044–1047, July 1989.

[44] Kyung-Ju Cho, Kwang-Chul Lee, Jin-Gyun Chung, and Keshab K. Parhi, “Design of
low-error fixed-width modified Booth multiplier,” IEEE Transactions on Very Large
Scale Integration Systems, vol. 12, no. 5, pp. 522–531, May 2004.

[45] Abdellatif Bellaouar, and Mohamed I. Elmasry, Low-Power Digital VLSI Design: Cir-
cuits and Systems. Norwell, Massachusetts, USA: Kluwer Academic Publishers, 2000.

[46] Behzad Razavi, Design of Analog CMOS Integrated Circuits. New York, NY, USA:
McGraw-Hill Companies, Inc, 2001.

[47] Tae-young Oh, Christoph Jungemann, and Robert W. Dutton, “Hydrodynamic sim-
ulation of RF noise in deep-submicron MOSFETs,” in International Conference on
Simulation of Semiconductor Processes and Devices, SISPAD, pp. 87–90, September
2003.

[48] K. Roy, S. Mukhopadhyay, and H. Mahmoodi-Meimand, “Leakage current mechanisms
and leakage reduction techniques in deep-submicrometer CMOS circuits,” in Proceedings
of the IEEE, vol. 91, pp. 305–327, February 2003.

[49] Timothy O. Dickson, Rudy Beerkens, and Sorin P. Voinigescu, “A 2.5-V, 40-Gb/s
decision circuit using SiGe BiCMOS logic,” in Proceedings of the IEEE, pp. 206–209, June
2004.
