4 Multi-Channel Modulation
Figure 4.2 shows the simplest multitone system to understand. $\bar N$ QAM-like modulators, along with possibly one DC/baseband PAM modulator, transmit $\bar N + 1$ subsymbol components $X_n$, $n = 0, 1, \ldots, \bar N$.[5] ($\bar N = N/2$, and $N$ is assumed to be even in this chapter.) This system is essentially the same as the general modulator of Chapter 1. $X_0$ and $X_{\bar N}$ are real one-dimensional subsymbols, while $X_n$, $n = 1, \ldots, \bar N - 1$, can be two-dimensional (complex) subsymbols. Each subsymbol represents one of $2^{b_n}$ messages that can be transmitted on subchannel $n$. The carrier frequencies for the corresponding subchannels are $f_n = n/T$, where $T$ is again the symbol period. The baseband-equivalent basis functions are $\varphi_n(t) = \frac{1}{\sqrt{T}}\,\mathrm{sinc}\!\left(\frac{t}{T}\right)$ for all $n$.[6] The entire transmitted signal can be viewed as $\bar N + 1$ independent transmission subchannels, as indicated by the frequency bands of Figure 4.3.
The multitone modulated signal is transmitted over an ISI/AWGN channel, with the corresponding demodulator structure also shown in Figure 4.2. Each subchannel is separately demodulated by first quadrature-decoupling with a phase splitter and then baseband-demodulating with a matched-filter/sampler combination. With this particular ideal choice of basis functions, the channel-output basis functions $\varphi_{p,n}(t)$ are an orthogonal set. There is thus no interference between the subchannels. Each subchannel may have ISI, but as $N \to \infty$, this ISI vanishes. Thus, (sub)symbol-by-(sub)symbol detection independently applied to each subchannel implements an overall ML detector. No equalizer is then necessary (for large $N$) to implement the maximum-likelihood detector. Maximum-likelihood detection is thus more easily achieved with multitone modulation on an ISI channel than with a single QAM or PAM signal. Again, equalization is unnecessary if the bandwidth of each tone is sufficiently narrow to make the ISI on that subchannel negligible. Thus a high-speed modulator and equalizer are replaced by $\bar N + 1$ low-speed modulators and demodulators in parallel.

[5] Capital letters are used for symbol component elements rather than the usual lower-case letters because these elements are frequency-domain elements, and frequency-domain quantities in this book are capitalized.

[6] The passband basis functions are actually the real and imaginary parts of $\sqrt{2/T}\,\mathrm{sinc}(t/T)\,e^{j2\pi f_n t}$, with receiver scaling tacitly moved to the transmitter as in Chapter 2.

Figure 4.3: Illustration of frequency bands for multitone transmission system.
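The claim that the subchannels do not interfere rests on the orthogonality of the sinc-based basis functions at tone spacing $1/T$, which can be checked numerically. A minimal sketch, assuming numpy is available and taking $T = 1$ for convenience:

```python
import numpy as np

# Baseband-equivalent multitone basis (T = 1 for illustration):
#   phi_n(t) = sinc(t/T) * exp(j*2*pi*n*t/T), up to the 1/sqrt(T) scale
t = np.arange(-200.0, 200.0, 1e-3)  # long window: sinc tails decay only like 1/t
dt = 1e-3

def phi(n):
    return np.sinc(t) * np.exp(2j * np.pi * n * t)  # np.sinc is sin(pi x)/(pi x)

def inner(n, m):
    # numerical approximation of the integral of phi_n(t) * conj(phi_m(t)) dt
    return np.sum(phi(n) * np.conj(phi(m))) * dt

same = abs(inner(1, 1))   # ~1: unit energy
cross = abs(inner(1, 2))  # ~0: adjacent tones do not interfere
far = abs(inner(0, 3))    # ~0: nor do distant ones
print(same, cross, far)
```

The small residuals come only from truncating the infinite integral; widening the window shrinks them further.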
Multitone modulation typically uses a value for $\bar N$ that ensures that the pulse response of the ISI channel appears almost constant at $H(n/T) = H_n \approx H(f)$ for $|f - n/T| < 1/(2T)$. In practice, this means that the symbol period $T$ greatly exceeds the length of the channel pulse response. The (scaled) matched filters then simply become the bandpass filters
$$\varphi_{p,n}(t) = \varphi_n(t) = \frac{1}{\sqrt{T}}\,\mathrm{sinc}\!\left(\frac{t}{T}\right) e^{j(2\pi/T)nt}$$
and the sampled outputs become
$$Y_n \approx H_n \cdot X_n + N_n . \qquad (4.1)$$
The accuracy of this approximation becomes increasingly exact as $N \to \infty$. Figure 4.3 illustrates the scaling of $H_n$ at the channel output on each of the subchannels. Each subchannel scales the input $X_n$ by the pulse-response gain $H_n$. Each subchannel has an AWGN with power spectral density $\sigma_n^2$. Thus, each subchannel has $\mathrm{SNR}_n = \mathcal{E}_n |H_n|^2 / \sigma_n^2$.
Each subchannel in the multitone system carries $b_n$ bits per symbol, or $\bar b_n$ bits per dimension (note both $n = 0$ and $n = \bar N$ are by definition one-dimensional subchannels, with subchannel $\bar N$ viewed as single-sideband modulation).[7] The total number of bits carried by the multitone system is then
$$b = \sum_{n=0}^{\bar N} b_n \qquad (4.2)$$
and the corresponding data rate is then
$$R = \frac{b}{T} = \sum_{n=0}^{\bar N} R_n , \qquad (4.3)$$
where $R_n = b_n / T$. Thus the aggregate data rate $R$ is divided, possibly unequally, among the subchannels.
With sufficiently large $N$, an optimum maximum-likelihood detector is easily implemented as $\bar N + 1$ simple symbol-by-symbol detectors. This detector need not search all combinations of $M = 2^b$ possible transmit symbols. Each subchannel is optimally symbol-by-symbol detected. The reason this maximum-likelihood detector is so easily constructed is the choice of the basis functions: multitone basis functions are generically well-suited to transmission over ISI channels. Broader-bandwidth basis functions, as might be found in a wider-bandwidth single-carrier (for instance QAM) system, are simply not well suited to transmission on this channel, thus inevitably requiring suboptimum detectors for reasons of complexity. Chapter 5 investigates the concept of canonical transmission and finds a means for generalizing QAM transmission so that it is effectively just as good, using a device known as the Generalized Decision Feedback Equalizer or GDFE, which can in some trivial cases look like the usual DFE of Chapter 3, but in most cases significantly differs in its generalized implementation, both at transmitter and receiver.

This chapter later shows that multitone modulation, when combined with a code of gap $\Gamma$ on each AWGN subchannel, comes as close to its theoretical capacity as that same code would come to capacity on an AWGN. That is, for sufficiently large $N$, multitone basis functions are indeed optimum for transmission on an ISI channel.
EXAMPLE 4.1.1 ($1 + .9D^{-1}$) This book has repeatedly used the example of a channel with (sampled) pulse response $1 + .9D^{-1}$ to compare the performance of various ISI-mitigation structures. Multitone modulation on this example, as well as on any linear ISI channel, achieves the highest performance. In order to have a continuous channel for multitone, we write $H(f) = 1 + .9e^{-j2\pi f}$ ($T = N$). This channel increasingly attenuates higher frequencies, so multitone modulation will be used with subchannel $n = 4$ silenced ($X_4 = 0$).

[7] Typically, subchannel 0 and subchannel $\bar N$ are not used in actual systems.
$1 + .9D^{-1}$ gains and SNRs:

 n   $\mathcal{E}_n$   $H_n$                                          $\mathrm{SNR}_n$    $b_n$     arg Q-func (dB)
 0   8/7               $1.9$                                          22.8 (13.6 dB)      1.6       9.2
 1   16/7              $1 + .9e^{-j\pi/4} = 1.76\angle{-21.3^\circ}$   19.5 (12.9 dB)      2(1.5)    9.2
 2   16/7              $1 + .9e^{-j\pi/2} = 1.35\angle{-42.0^\circ}$   11.4 (10.6 dB)      2(1.2)    9.1
 3   16/7              $1 + .9e^{-j3\pi/4} = .733\angle{-60.25^\circ}$ 3.4 (5.3 dB)        2(.5)     10

Table 4.1: Subchannel SNRs for the $1 + .9D$ channel with gap at 4.4 dB.
$N = 8$ is not sufficiently large for the approximation in (4.1) to hold, but larger $N$ would make this example intractable for the reader. So, we assume $N = 8$ is sufficiently large for easy illustration of the concept.

Table 4.1 summarizes the subchannel gains and SNRs for each of the subchannels in this example. For multitone, the phase choice of the channel is irrelevant to achievable performance, and the example would be the same for $1 + .9D^{-1}$ as for $1 + .9D$.

The SNRs in the table are computed based on the assumption that $N_0/2 = .181$, that is, $\mathrm{SNR}_{MFB} = 10$ dB if binary modulation is used (as in previous studies with this example). Of course, $\mathrm{SNR}_{MFB}$ is a function of the modulated signals, and in this example it has meaning only in the context of performance comparisons against previous invocations of this example.
The energy per subchannel has been allocated so that equal energy per dimension is used on the 7 available dimensions. With $T = N = 8$, this means that the transmit power of 1 for previous invocations of this example (that is, one unit of energy per dimension and symbol period 1) is maintained here, because the transmit power is $\sum_n \mathcal{E}_n / T = 8/8 = 1$. For this simple example, 4 symbol-by-symbol detectors operating in parallel exceed the performance of a MMSE-DFE, coming very close to the MFB for binary PAM transmission. The data rate remains constant at $\bar b = 1$, and the power also is constant at $\bar{\mathcal{E}}_x = 1$, so comparison against the DFE of Chapter 3 on this example is fair. The above table's Q-function arguments are found by using the QAM formula $d_{\min}/(2\sigma) = \sqrt{\frac{3}{M-1}\,\mathrm{SNR}}$ for $n = 1, 2, 3$ and the PAM formula $d_{\min}/(2\sigma) = \sqrt{\frac{3}{M^2-1}\,\mathrm{SNR}}$ with $M = 2^{b_n}$; this approximation can be made very accurate by careful design of the signal sets for each subchannel.

For this example, $N = 8$ is not sufficiently high for all approximations to hold, and a more detailed analysis of each subchannel's exact SNR might lead to a slightly less optimistic performance projection. On the other hand, the energy per subchannel and the bit distribution in this example are not optimized, but rather simply guessed by the author. One finds for a large number of subchannels that performance settles at 8.8 dB when the tones are sufficiently narrow for all approximations to hold accurately. An infinite-length MMSE-DFE, at best, achieves 8.4 dB in Chapter 3 with Tomlinson precoding; the actual performance at the same data rate of $\bar b = 1$ was 1.3 dB lower, or 7.1 dB. Thus a MMSE-DFE would perform about 1.7 dB worse than multitone on this channel. (The GDFE of Chapter 5 allows most of this 1.7 dB to be recovered, but it also uses several subchannels.)
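The Q-function arguments of Table 4.1 can be reproduced from these two formulas. A sketch, with SNRs and bit assignments taken from the table; small differences from the table's rounded entries are expected:

```python
import math

# (SNR_n, bits per subsymbol b_n, is_PAM) for n = 0..3, from Table 4.1
subchannels = [(22.8, 1.6, True),   # n = 0: one-dimensional PAM
               (19.5, 3.0, False),  # n = 1: QAM, b_1 = 2(1.5)
               (11.4, 2.4, False),  # n = 2: QAM, b_2 = 2(1.2)
               (3.4, 1.0, False)]   # n = 3: QAM, b_3 = 2(.5)

def q_arg_db(snr, b, pam):
    M = 2 ** b
    # (d_min / 2 sigma)^2 = 3*SNR/(M^2 - 1) for PAM, 3*SNR/(M - 1) for QAM
    arg2 = 3 * snr / (M * M - 1) if pam else 3 * snr / (M - 1)
    return 10 * math.log10(arg2)

q_args = [round(q_arg_db(*s), 1) for s in subchannels]
print(q_args)  # close to the table's 9.2, 9.2, 9.1, 10
```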
This example illustrates that there is something to be gained from multitone. An interesting aspect is also the parallelism inherent in the multitone receiver and transmitter, allowing for the potential of a very high-speed implementation on a number of parallel processing devices. The $1 + .9D^{-1}$ channel has mild ISI (but is easy to work with mathematically without computers); the gain of 1.7 dB can be much larger on channels with severe ISI.
4.2 Parallel Channels

The multitone transmission system of Section 4.1 inherently exhibits $\bar N + 1$ independent subchannels. Generally, multichannel systems can be construed as having $N$ subchannels (where in the previous multitone system, several pairs of these subchannels have identical gains and can be considered as two-dimensional subchannels). This section studies the general case of a communication channel that has been decomposed into an equivalent set of $N$ real parallel subchannels. Of particular importance is performance analysis and optimization of performance for the entire set of subchannels, which this section presents.

The concept of the SNR gap for a single channel is crucial to some of the analysis, so Subsection 4.2.1 reviews a single channel, and then Subsections 4.2.2 and 4.2.3 use these single-channel results to establish results for parallel channels.
4.2.1 The single-channel gap analysis

The so-called gap analysis of Chapter 1 is reviewed here briefly in preparation for the study of parallel channels and loading algorithms.

On an AWGN channel with gain $P_n$ and noise power spectral density $\sigma_n^2$, the $\mathrm{SNR}_n$ is $\mathcal{E}_n |P_n|^2 / \sigma_n^2$. This channel has maximum data rate, or capacity,[8] of
$$\bar c_n = \frac{1}{2} \log_2(1 + \mathrm{SNR}_n) \qquad (4.4)$$
bits/dimension. More generally,
$$\bar b_n = \frac{1}{2} \log_2\left(1 + \frac{\mathrm{SNR}_n}{\Gamma}\right) . \qquad (4.5)$$
Any reliable and implementable system must transmit at a data rate below capacity. The gap $\Gamma$ is a convenient mechanism for analyzing systems that transmit with $\bar b < \bar c$. Most practical signal constellations in use (PAM, QAM) have a constant gap for all $\bar b \ge .5$. For $\bar b_n < .5$, one can construct systems of codes that exhibit the same constant gap as for $\bar b \ge .5$.[9] For any given coding scheme and a given target probability of symbol error $P_e$, an SNR gap is defined according to
$$\Gamma = \frac{2^{2\bar c} - 1}{2^{2\bar b} - 1} = \frac{\mathrm{SNR}}{2^{2\bar b} - 1} . \qquad (4.6)$$
To illustrate such a constant gap, Table 4.2 lists achievable $\bar b$ for uncoded QAM schemes using square constellations. For $\bar b < 1$, this gap concept allows mild additional hard coding in the uncoded system to obtain small additional coding gain when $\bar b = .5$.
For uncoded QAM or PAM and $P_e = 10^{-6}$, the SNR gap is constant at 8.8 dB (this assumes use of special constellations or coding for $\bar b \le .5$). With uncoded QAM or PAM and $P_e = 10^{-7}$, the gap would be fixed instead at about 9.5 dB. The use of codes, say trellis or turbo coding and/or forward error correction (see Chapters 10 and 11), reduces the gap. A very well-coded system may have a gap as low as .5 dB at $P_e \le 10^{-6}$. Figure 4.4 plots $\bar b$ versus SNR for various gaps. A smaller gap indicates stronger coding. The curve for $\Gamma = 9$ dB approximates uncoded PAM or QAM transmission at probability of symbol error $P_e \approx 10^{-6}$. More powerful, but implementable, codes have smaller gaps, as small as 1 dB. A gap of 0 dB means the theoretical maximum bit rate has been achieved, which requires infinite complexity and delay (see Chapter 8). For the remainder of this section and most of these notes, the same constant gap is assumed for the coding of each and every subchannel.
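This constant-gap behavior is easy to check numerically from (4.5). A minimal sketch, with SNR values taken from Table 4.2:

```python
import math

def bits_per_dim(snr_db, gap_db):
    """Achievable bits/dimension from the gap approximation (4.5)."""
    snr = 10 ** (snr_db / 10)
    gap = 10 ** (gap_db / 10)
    return 0.5 * math.log2(1 + snr / gap)

# SNRs that uncoded QAM/PAM needs at Pe = 1e-6 for b = 1, 2, 3 bits/dim,
# all with the same 8.8 dB gap
for snr_db, b_target in [(13.5, 1), (20.5, 2), (26.8, 3)]:
    b = bits_per_dim(snr_db, 8.8)
    print(snr_db, round(b, 2))  # b comes out very close to b_target
```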
For a given coding scheme, practical transmission designs often mandate a specific value for $b$, or equivalently a fixed data rate. In this case, the design is not for $\bar b_{\max} = \frac12 \log_2(1 + \mathrm{SNR}/\Gamma)$, but rather for a given $\bar b$. The margin measures the excess SNR for that given bit rate.

[8] Capacity in Volume I is simply defined as the data rate with a code achieving a gap of $\Gamma = 1$ (0 dB). This is the lowest possible gap theoretically and can be approached by sophisticated codes, as in Volume II. Codes exist for which $P_e \to 0$ if $\bar b \le \bar c$.

[9] Some of the algorithms in Section 4.3 do not require that the gap be constant on each tone, and instead require only tabular knowledge of the exact $b_n$ versus required $\mathcal{E}_n$ relationship for each subchannel.
 $\bar b$                          .5     1      2      3      4      5
 SNR for $P_e = 10^{-6}$ (dB)      8.8    13.5   20.5   26.8   32.9   38.9
 $2^{2\bar b} - 1$ (dB)            0      4.7    11.7   18.0   24.1   30.1
 $\Gamma$ (dB)                     8.8    8.8    8.8    8.8    8.8    8.8

Table 4.2: Table of SNR gaps for $P_e = 10^{-6}$.
Figure 4.4: Achievable bit rate (bits/dimension) versus SNR (dB) for various gaps (curves for $\Gamma$ = 0, 3, 6, and 9 dB).
(repeated from Section 1.5)

Definition 4.2.1 (Margin) The margin, $\gamma_m$, for transmission on an AWGN (sub)channel with a given SNR, a given number of bits per dimension $\bar b$, and a given coding scheme and target $P_e$ with gap $\Gamma$, is the amount by which the SNR can be reduced (increased for negative margin in dB) and still maintain a probability of error at or below the target $P_e$.
Margin is accurately approximated through the use of the gap formula by
$$\gamma_m = \frac{2^{2\bar b_{\max}} - 1}{2^{2\bar b} - 1} = \frac{\mathrm{SNR}/\Gamma}{2^{2\bar b} - 1} . \qquad (4.7)$$
The margin is the amount by which the SNR on the channel may be lowered before performance degrades to a probability of error greater than the target $P_e$ used in defining the gap. A negative margin in dB means that the SNR must improve by the magnitude of the margin before the target $P_e$ is achieved. The margin relation can also be written as
$$\bar b = .5 \log_2\left(1 + \frac{\mathrm{SNR}}{\Gamma \cdot \gamma_m}\right) . \qquad (4.8)$$
EXAMPLE 4.2.1 (AWGN with SNR = 20.5 dB) An AWGN channel has an SNR of 20.5 dB. The capacity of this channel is then
$$\bar c = .5 \log_2(1 + \mathrm{SNR}) = 3.5 \text{ bits/dim} . \qquad (4.9)$$
With $P_e = 10^{-6}$ and 2B1Q,
$$\bar b = .5 \log_2\left(1 + \frac{\mathrm{SNR}}{10^{.88}}\right) = 2 \text{ bits/dim} . \qquad (4.10)$$
With concatenated trellis and forward error correction (or with turbo codes; see Chapters 10 and 11), giving a coding gain of 7 dB at $P_e = 10^{-6}$, the achievable data rate becomes
$$\bar b = .5 \log_2\left(1 + \frac{\mathrm{SNR}}{10^{.18}}\right) = 3 \text{ bits/dim} . \qquad (4.11)$$
Suppose a transmission application requires $\bar b = 2.5$ bits/dimension; then the margin for the coded system is
$$\gamma_m = \frac{2^{2 \cdot 3} - 1}{2^{2 \cdot 2.5} - 1} = \frac{63}{31} \approx 3 \text{ dB} . \qquad (4.12)$$
This means the noise power would have to be increased (or the transmit power reduced) by more than 3 dB before the target probability of error of $10^{-6}$ is no longer met. Alternatively, suppose a design transmits 4-QAM over this channel and no code is used; then the margin is
$$\gamma_m = \frac{2^{2 \cdot 2} - 1}{2^{2 \cdot 1} - 1} = \frac{15}{3} \approx 7 \text{ dB} . \qquad (4.13)$$
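The margin arithmetic of (4.12) and (4.13) can be reproduced in a few lines. A sketch:

```python
import math

def margin_db(b_max, b):
    """Margin via (4.7): gamma_m = (2^{2 b_max} - 1) / (2^{2 b} - 1), in dB."""
    return 10 * math.log10((2 ** (2 * b_max) - 1) / (2 ** (2 * b) - 1))

coded = margin_db(3.0, 2.5)    # coded design supports 3 bits/dim, needs only 2.5
uncoded = margin_db(2.0, 1.0)  # uncoded QAM supports 2 bits/dim, sends 4-QAM (1)
print(round(coded, 2), round(uncoded, 2))  # ~3 dB and ~7 dB
```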
4.2.2 A single performance measure for parallel channels - geometric SNR

In a multi-channel transmission system, it is usually desirable to have all subchannels with the same $P_e$. Otherwise, if one subchannel had significantly higher $P_e$ than the others, then it would dominate the bit error rate. Constant $P_e$ can occur when all subchannels use the same class of codes with constant gap $\Gamma$. In this case, a single performance measure characterizes a multi-channel transmission system. This measure is a geometric SNR that can be compared to the detection SNR of equalized transmission systems or to the $\mathrm{SNR}_{MFB}$.
For a set of $N$ (one-dimensional real) parallel channels, the aggregate number of bits per dimension is
$$\bar b = \frac{1}{N} \sum_{n=1}^{N} \bar b_n = \frac{1}{N} \sum_{n=1}^{N} \frac12 \log_2\left(1 + \frac{\mathrm{SNR}_n}{\Gamma}\right) \qquad (4.14)$$
$$= \frac{1}{2N} \log_2\left(\prod_{n=1}^{N} \left(1 + \frac{\mathrm{SNR}_n}{\Gamma}\right)\right) \qquad (4.15)$$
$$= \frac12 \log_2\left(1 + \frac{\mathrm{SNR}_{m,u}}{\Gamma}\right) . \qquad (4.16)$$

Definition 4.2.2 (Multi-channel SNR) The multi-channel SNR for a set of parallel channels is defined by
$$\mathrm{SNR}_{m,u} = \Gamma \left(\prod_{n=1}^{N} \left(1 + \frac{\mathrm{SNR}_n}{\Gamma}\right)^{1/N} - 1\right) . \qquad (4.17)$$
The multi-channel SNR is a single SNR measure that characterizes the set of subchannels by an equivalent single AWGN that achieves the same data rate, as in Figure 4.5. The multichannel SNR in (4.17) can be directly compared with the detection SNR of an equalized single-channel system. The bit rate is given in (4.16) as if the aggregate set of parallel channels were a single AWGN with $\mathrm{SNR}_{m,u}$.

When the terms involving "+1" can be ignored, the multichannel SNR is approximately the geometric mean of the SNRs on each of the subchannels:
$$\mathrm{SNR}_{m,u} \approx \mathrm{SNR}_{\mathrm{geo}} = \left(\prod_n \mathrm{SNR}_n\right)^{1/N} . \qquad (4.18)$$
Returning to Example 4.1.1, the multi-channel SNR (with gap 0 dB) is
$$\mathrm{SNR}_{m,u} = \left[(22.8 + 1)(19.5 + 1)^2 (11.4 + 1)^2 (3.4 + 1)^2\right]^{.125} - 1 = 8.8 \text{ dB} . \qquad (4.19)$$
This multichannel SNR with $\Gamma = 1$ is very close to the argument that was used for all the Q-functions. This value is actually the best that can be achieved.[10] $\mathrm{SNR}_{m,u} = 9.7$ dB for $\Gamma = 8.8$ dB, very close to the value in the sample design of Example 4.1.1. This value is artificially high only because of the approximation accruing to using too small a value of $N$ for illustration purposes; if $N$ were large, $\mathrm{SNR}_{m,u}$ would be close to 8.8 dB even with $\Gamma > 1$, and also the gap approximation loses accuracy when $\bar b < 1$.
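The geometric-mean computation in (4.19) is easily reproduced. A short sketch using the subchannel SNRs of Example 4.1.1, where each complex subchannel counts for two dimensions and the silenced subchannel $n = 4$ contributes a factor of 1:

```python
import math

# Subchannel SNRs and their dimension counts from Example 4.1.1
snrs = [(22.8, 1), (19.5, 2), (11.4, 2), (3.4, 2), (0.0, 1)]  # n = 4 silenced
N = sum(dims for _, dims in snrs)  # 8 real dimensions

prod = 1.0
for snr, dims in snrs:
    prod *= (1 + snr) ** dims
snr_mu = prod ** (1 / N) - 1          # gap Gamma = 1 (0 dB), per (4.17)
snr_mu_db = 10 * math.log10(snr_mu)
print(round(snr_mu_db, 1))  # ~8.8 dB, matching (4.19)
```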
4.2.3 The Water-Filling Optimization

Maximizing the data rate $R = b/T$ for a set of parallel subchannels when the symbol rate $1/T$ is fixed requires maximization of the achievable $b = \sum_n b_n$ over the $b_n$ and $\mathcal{E}_n$. The largest number of bits that can be transmitted over a parallel set of subchannels must maximize the sum
$$b = \frac12 \sum_{n=1}^{N} \log_2\left(1 + \frac{\mathcal{E}_n g_n}{\Gamma}\right) , \qquad (4.20)$$
where $g_n$ represents the subchannel signal-to-noise ratio when the transmitter applies unit energy to that subchannel. (For multitone, $g_n = |H_n|^2 / \sigma_n^2$.) $g_n$ is a fixed function of the channel, but $\mathcal{E}_n$ can be varied to maximize $b$, subject to the energy constraint
$$\sum_{n=1}^{N} \mathcal{E}_n = N \bar{\mathcal{E}}_x . \qquad (4.21)$$

[10] Somewhat by coincidence, because the continuous optimum bandwidth from Chapter 8 is .88. This example fortuitously can approximate .88 very accurately using only 7/8 subchannels to get .875. In other cases, a much larger number of subchannels may need to be used to approximate the optimum bandwidth.
Figure 4.5: Equivalent single channel characterized by the multi-channel SNR.
Using Lagrange multipliers, the cost function to maximize (4.20) subject to the constraint in (4.21) becomes
$$\frac{1}{2\ln(2)} \sum_n \ln\left(1 + \frac{\mathcal{E}_n g_n}{\Gamma}\right) + \lambda \left(\sum_n \mathcal{E}_n - N \bar{\mathcal{E}}_x\right) . \qquad (4.22)$$
Differentiating with respect to $\mathcal{E}_n$ produces
$$\frac{1}{2\ln(2)} \cdot \frac{1}{\Gamma/g_n + \mathcal{E}_n} = -\lambda . \qquad (4.23)$$
Thus, (4.20) is maximized, subject to (4.21), when
$$\mathcal{E}_n + \frac{\Gamma}{g_n} = \text{constant} . \qquad (4.24)$$
When $\Gamma = 1$ (0 dB), the maximum data rate, or capacity, of the parallel set of channels is achieved. The solution is called the water-filling solution because one can construe the solution graphically by thinking of the curve of inverted channel signal-to-noise ratios as being filled with energy (water) to a constant line, as in Figure 4.6. (Figure 4.6 illustrates the discrete equivalent of water-filling for a set of 6 subchannels, which is subsequently described further.) When $\Gamma \ne 1$, the form of the water-fill optimization remains the same (as long as $\Gamma$ is constant over all subchannels). The scaling by $\Gamma$ makes the inverse channel-SNR curve, $\Gamma/g_n$ versus $n$, appear steeper with $n$, thus leading to a narrower (fewer used subchannels) optimized transmit band than when capacity ($\Gamma = 1$) is achieved. The number of bits on each subchannel is then
$$b_n = .5 \log_2\left(1 + \frac{\mathcal{E}_n g_n}{\Gamma}\right) . \qquad (4.25)$$
For the example of multitone, the optimum water-filling transmit energies then satisfy
$$\mathcal{E}_n + \Gamma \frac{\sigma_n^2}{|H_n|^2} = \text{constant} . \qquad (4.26)$$
Figure 4.6 depicts water-filling for a transmission system with 6 subchannels with $g_n = |H_n|^2 / \sigma_n^2$. Equation (4.26) is solved for the constant when $\sum_n \mathcal{E}_n = N \bar{\mathcal{E}}_x$ and $\mathcal{E}_n \ge 0$. Note in Figure 4.6 that 4 of the six subchannels have positive energies, while 2 were eliminated for having negative energy, or equivalently having a (normalized) noise power that exceeded the constant line of water-filling. The 4 used subchannels have energy that makes the sum of normalized noise and transmit energy constant for all. For methods of computing the water-fill solution, see the next section. The term "water-filling" arises from the analogy of the curve of $\Gamma/g_n$ being a bowl into which water (energy) is poured, filling the bowl until there is no more energy to use. The water rises to a constant flat level in the bowl. The amount of water/energy in any subchannel is the depth of the water at the corresponding point in the bowl.
The water-filling solution is unique because the rate function being maximized is concave (equivalently, its negative, being minimized, is convex), so there is a unique optimum energy distribution (and corresponding set of subchannel data rates) for each ISI channel with multi-channel modulation.
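This optimality can be sanity-checked numerically: shifting energy between two used subchannels of a water-filling distribution can only reduce the total bit rate. A small sketch with hypothetical gains $g_n$ (not taken from the text's examples):

```python
import math

g = [20.0, 17.0, 10.0, 3.0]   # hypothetical unit-energy subchannel SNRs
gamma = 1.0                   # gap of 0 dB (capacity)
total_e = 4.0                 # N * average energy per dimension

# Water-fill over all subchannels: E_n = K - gamma/g_n (all positive here)
K = (total_e + gamma * sum(1 / gn for gn in g)) / len(g)
E = [K - gamma / gn for gn in g]

def rate(energies):
    return sum(0.5 * math.log2(1 + e * gn / gamma) for e, gn in zip(energies, g))

b_star = rate(E)
# Move some energy from the best subchannel to the worst: rate must drop
E_pert = [E[0] - 0.2, E[1], E[2], E[3] + 0.2]
assert rate(E_pert) < b_star
print(round(b_star, 3))
```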
4.2.4 Margin Maximization (or Power Minimization)

For many transmission systems, a variable data rate is not desirable. In this case, the best design will maximize the performance margin at a given fixed data rate. To maximize fixed-rate margin, the designer equivalently minimizes the total energy (this is also called fixed-margin loading)
$$\sum_{n=1}^{N} \mathcal{E}_n \qquad (4.27)$$
Figure 4.6:  Illustration of discrete water-lling for 6 subchannels.
subject to the data rate and margin being fixed according to
$$\sum_{n=1}^{N} \frac12 \log_2\left(1 + \frac{\mathcal{E}_n g_n}{\Gamma}\right) = b . \qquad (4.28)$$
The best margin will then be
$$\gamma_m = \frac{N \bar{\mathcal{E}}_x}{\sum_{n=1}^{N} \mathcal{E}_n} . \qquad (4.29)$$
The solution to the energy-minimization problem after correct differentiation is again the water-filling solution given by
$$\mathcal{E}_n + \frac{\Gamma}{g_n} = \text{constant} . \qquad (4.30)$$
However, in this case water/energy is poured only until the number of bits per symbol (which is computed at each subchannel according to (4.25) and then summed over all subchannels) is equal to the given fixed rate in (4.28). Then the margin is computed according to (4.29). Each subchannel has its energy increased uniformly, relative to the distribution that minimized the sum of energies, by the factor of the maximum margin. Sometimes in multi-user channels (see Chapter 12), the solution with minimum energy for a given data rate and given margin is of interest, in what is called "iterative water-filling." This energy-minimizing form of the loading problem is known in mathematics as the dual form of the rate-adaptive formulation.

The energy minimization is equivalent to margin maximization because any other bit distribution would have required more energy to reach the same given data rate. That greater energy would then clearly leave less energy to be distributed (uniformly) among the remaining subchannels if the new distribution used the same or a larger number of subchannels. If the new distribution used fewer subchannels than the water-fill, the amount by which the new distribution's total energy increased still exceeds the improvement caused by the extra energy being distributed over the smaller number of subchannels; for if this were not the case, then $b$ could have been achieved more efficiently with the new distribution (a contradiction).
4.3 Loading Algorithms

Loading algorithms compute values for $b_n$ and $\mathcal{E}_n$ for each and every subchannel in a parallel set of subchannels. One example of a loading algorithm is the optimum water-filling algorithm of Section 4.2, which solves a set of linear equations with boundary constraints, as described further in this section. The solution of these water-filling equations for large $N$ may produce $b_n$ that have fractional parts or are very small. Such small or fractional $b_n$ can complicate encoder and decoder implementation. Alternative suboptimal loading algorithms approximate the water-fill solution, but constrain $b_n$ to integer values, as also described later in this section.

There are two types of loading algorithms: those that try to maximize data rate, and those that try to maximize performance at a given fixed data rate. Both are studied in this section.
Definition 4.3.1 (Rate-Adaptive (RA) loading criterion) A rate-adaptive loading procedure maximizes (or approximately maximizes) the number of bits per symbol subject to a fixed energy constraint:
$$\max_{\mathcal{E}_n}\; b = \sum_{n=1}^{N} \frac12 \log_2\left(1 + \frac{\mathcal{E}_n g_n}{\Gamma}\right) \qquad (4.31)$$
$$\text{subject to: } N \bar{\mathcal{E}}_x = \sum_{n=1}^{N} \mathcal{E}_n \qquad (4.32)$$

Definition 4.3.2 (Margin-Adaptive (MA) loading criterion) A margin-adaptive loading procedure minimizes (or approximately minimizes) the energy subject to a fixed bits/symbol constraint:
$$\min_{\mathcal{E}_n}\; \mathcal{E}_x = \sum_{n=1}^{N} \mathcal{E}_n \qquad (4.33)$$
$$\text{subject to: } b = \sum_{n=1}^{N} \frac12 \log_2\left(1 + \frac{\mathcal{E}_n g_n}{\Gamma}\right) \qquad (4.34)$$
The maximum margin is then
$$\gamma_{\max} = \frac{N \bar{\mathcal{E}}_x}{\mathcal{E}} . \qquad (4.35)$$
4.3.1 Computing Water-Filling for RA loading

The set of linear equations that has the water-fill distribution as its solution is
$$\mathcal{E}_1 + \Gamma/g_1 = K \qquad (4.36)$$
$$\mathcal{E}_2 + \Gamma/g_2 = K \qquad (4.37)$$
$$\vdots \qquad (4.38)$$
$$\mathcal{E}_N + \Gamma/g_N = K \qquad (4.39)$$
$$\mathcal{E}_1 + \cdots + \mathcal{E}_N = N \bar{\mathcal{E}}_x . \qquad (4.40)$$
There are a maximum of $N + 1$ equations in $N + 1$ unknowns. The unknowns are the energies $\mathcal{E}_n$, $n = 1, \ldots, N$, and the constant $K$. The solution can produce negative energies. If it does, the equation with the smallest $g_n$ should be eliminated and the corresponding $\mathcal{E}_n$ zeroed. The sets of equations are solved recursively by eliminating the smallest $g_n$ and zeroing $\mathcal{E}_n$, until the first solution with no negative energies occurs.
In matrix form, the equations become
$$\begin{bmatrix} 1 & 0 & \cdots & 0 & -1 \\ 0 & 1 & \cdots & 0 & -1 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & -1 \\ 1 & 1 & \cdots & 1 & 0 \end{bmatrix} \begin{bmatrix} \mathcal{E}_1 \\ \mathcal{E}_2 \\ \vdots \\ \mathcal{E}_N \\ K \end{bmatrix} = \begin{bmatrix} -\Gamma/g_1 \\ -\Gamma/g_2 \\ \vdots \\ -\Gamma/g_N \\ N \bar{\mathcal{E}}_x \end{bmatrix} , \qquad (4.41)$$
which is readily solved by matrix inversion. Matrix inversions of size $N + 1$ down to 1 may have to be performed iteratively until nonnegative energies are found for all subchannels.
An alternative solution sums the first $N$ equations to obtain
$$K = \frac{1}{N}\left(N \bar{\mathcal{E}}_x + \Gamma \sum_{n=1}^{N} \frac{1}{g_n}\right) , \qquad (4.42)$$
and
$$\mathcal{E}_n = K - \Gamma/g_n \quad \forall\; n = 1, \ldots, N . \qquad (4.43)$$
If one or more of the $\mathcal{E}_n < 0$, then the most negative is eliminated, and (4.42) and (4.43) are solved again with $N \to N - 1$ and the corresponding $g_n$ term eliminated. The equations preferably are preordered in terms of signal-to-noise ratio, with $n = 1$ corresponding to the largest-SNR subchannel, $g_1 = \max_n g_n$, and $n = N$ corresponding to the smallest-SNR subchannel, $g_N = \min_n g_n$. Equation (4.42) then becomes, at the $i$th step of the iteration ($i = 0, \ldots, N$),
$$K = \frac{1}{N - i}\left(N \bar{\mathcal{E}}_x + \Gamma \sum_{n=1}^{N-i} \frac{1}{g_n}\right) , \qquad (4.44)$$
which culminates with $N^* = N - i$ for the first value of $i$ that does not cause a negative energy, leaving the energies as
$$\mathcal{E}_n = K - \Gamma/g_n \quad \forall\; n = 1, \ldots, N^* = N - i . \qquad (4.45)$$
The water-filling algorithm is illustrated by the flow chart of Figure 4.7.
The sum in the expression for $K$ in (4.42) is always over the used subchannels. The following simplified formulae, apparently first observed by Aslanis, then determine the number of bits per subchannel and the number of bits per dimension as
$$b_n = \frac12 \log_2\left(\frac{K g_n}{\Gamma}\right) = \frac12 \log_2\left(1 + \frac{\mathcal{E}_n g_n}{\Gamma}\right) \qquad (4.46)$$
and
$$\bar b = \frac12 \log_2\left(\left(\frac{K}{\Gamma}\right)^{N^*/N} \left(\prod_{n=1}^{N^*} g_n\right)^{1/N}\right) = \frac{1}{N} \sum_{n=1}^{N^*} b_n \qquad (4.47)$$
$$= \frac12 \log_2\left(1 + \frac{\mathrm{SNR}_{m,u}}{\Gamma}\right) , \qquad (4.48)$$
respectively. The SNR for the set of parallel channels is therefore
$$\mathrm{SNR}_{m,u} = \Gamma \left(\left(\frac{K}{\Gamma}\right)^{N^*/N} \prod_{n=1}^{N^*} g_n^{1/N} - 1\right) . \qquad (4.49)$$
The water-filling constant $K$ becomes
$$K = \Gamma \left(\frac{2^{2b}}{\prod_{n=1}^{N^*} g_n}\right)^{1/N^*} . \qquad (4.50)$$
Figure 4.7: Flow chart of the Rate-Adaptive water-filling algorithm.
EXAMPLE 4.3.1 ($1 + .9D^{-1}$ again) The water-fill solution (for $\Gamma = 1$) can be found on the $1 + .9D^{-1}$ example to compute capacity. First, the subchannels are characterized by
$$g_0 = \frac{1.9^2}{.181} = 19.94 \qquad (4.51)$$
$$g_1 = \frac{1.7558^2}{.181} = 17.03 \qquad (4.52)$$
$$g_2 = \frac{1.3454^2}{.181} = 10.00 \qquad (4.53)$$
$$g_3 = \frac{.7329^2}{.181} = 2.968 \qquad (4.54)$$
$$g_4 = \frac{.1^2}{.181} = .0552 \qquad (4.55)$$
Using (4.42) with all subchannels ($N = 8$, or actually 5 subchannels, since 3 are complex with identical $g$s on real and imaginary dimensions), one computes
$$K = \frac18\left[8 + 1 \cdot \left(\frac{1}{19.94} + \frac{2}{17.03} + \frac{2}{10.0} + \frac{2}{2.968} + \frac{1}{.0552}\right)\right] = 3.3947 , \qquad (4.56)$$
and thus $\mathcal{E}_4 = 3.3947 - 1/.05525 = -14.7 < 0$. So the last subchannel should be eliminated and the equations propagated again. For $N^* = 7$,
$$K = \frac17\left[8 + 1 \cdot \left(\frac{1}{20} + \frac{2}{17} + \frac{2}{9.8} + \frac{2}{3}\right)\right] = 1.292 . \qquad (4.57)$$
The corresponding subchannel energies $\mathcal{E}_n$ on the subchannels are 1.24, 1.23, 1.19, and .96, which are all positive, meaning the solution with positive energy on all subchannels has been found. The corresponding number of bits per each subchannel is computed as $b_n = .5 \log_2(K g_n)$, which has values 2.3, 2.2, 1.83, and .98, respectively. The resulting average is very close to 1.54 bits/dimension, almost exactly the known capacity of this channel. Figure 4.8 illustrates the water-fill energy distribution for this channel.

Note the capacity exceeds 1 bit/dimension, implying that codes exist that allow arbitrarily small probability of error at the rate of 1 bit/dimension selected in earlier uses of this example.
The accuracy of this computation of capacity can be increased somewhat by further increasing $N$; the eventual value is approximately 1.55 bits/dimension. Then, $\mathrm{SNR}_{m,u} = 2^{2(1.54)} - 1 = 8.8$ dB. This is the highest detection-point SNR that any receiver can achieve. This is better than the MMSE-DFE by .4 dB (or by 1.7 dB with error-propagation/precoding effects included). On channels with more severe ISI ($1 + .9D^{-1}$ is easy for computation by hand but is not really a very difficult transmission channel), the difference between multitone and the MMSE-DFE can be very large. Chapter 5 shows how to correct the MMSE-DFE's transmit filter and symbol rate in some situations to improve the MMSE-DFE's detection SNR to the level of multitone, but this is not always possible.
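The iteration (4.42)-(4.45) that produced these numbers is only a few lines of code. A sketch that reproduces Example 4.3.1 over its 8 real dimensions (complex subchannels listed twice):

```python
import math

def ra_waterfill(g, total_energy, gamma=1.0):
    """Rate-adaptive water-filling via (4.42)-(4.45); returns (K, energies)."""
    g = sorted(g, reverse=True)            # largest SNR first
    n_used = len(g)
    while n_used > 0:
        used = g[:n_used]
        K = (total_energy + gamma * sum(1 / gn for gn in used)) / n_used
        E = [K - gamma / gn for gn in used]
        if E[-1] >= 0:                     # smallest-gain channel decides
            return K, E
        n_used -= 1                        # drop weakest subchannel, repeat
    raise ValueError("no energy to allocate")

# Real-dimension gains for the 1 + .9 D^{-1} example (complex tones twice)
g = [19.94, 17.03, 17.03, 10.0, 10.0, 2.968, 2.968, 0.0552]
K, E = ra_waterfill(g, total_energy=8.0, gamma=1.0)
b_bar = sum(0.5 * math.log2(K * gn) for gn in g[:len(E)]) / len(g)
print(round(K, 3), round(b_bar, 2))  # K ~1.292 and b_bar ~1.55, as in the example
```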
4.3.2 Computing Water-Filling for MA Loading

MA water-filling will have a constant K_ma that satisfies

E_n = K_ma − Γ/g_n   (4.58)
for each used subchannel.   The bit-rate constraint then becomes
b = (1/2)·Σ_{n=1}^{N̄} log2( 1 + E_n·g_n/Γ )   (4.59)
Figure 4.8: Water-filling energy distribution for the 1 + .9D^{-1} channel with N = 8.
= (1/2)·Σ_{n=1}^{N̄} log2( K_ma·g_n/Γ )   (4.60)

= (1/2)·log2( Π_{n=1}^{N̄} K_ma·g_n/Γ )   (4.61)

2^{2b} = (K_ma/Γ)^{N̄} · Π_{n=1}^{N̄} g_n   (4.62)
Thus, the constant for MA loading is
K_ma = Γ·( 2^{2b} / Π_{n=1}^{N̄} g_n )^{1/N̄} .   (4.63)
MA water-fill algorithm

1. Order the g_n largest to smallest, n = 1, ..., N̄, and set i = N̄.
2. Compute K_ma = Γ·( 2^{2b} / Π_{n=1}^{i} g_n )^{1/i} (this step requires only one multiply for i ≤ N̄ − 1).
3. Test subchannel energy: is E_i = K_ma − Γ/g_i < 0? If yes, then i ← i − 1 and go to step 2; otherwise continue.
4. Compute the solution with n = 1, ..., N̄ = i:

E_n = K_ma − Γ/g_n   (4.64)

b_n = (1/2)·log2( K_ma·g_n/Γ )   (4.65)
5. Compute the margin γ_max = N·Ē_x / Σ_{n=1}^{N̄} E_n .
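The five steps above can be sketched directly; the gains and dimension counts are the assumed values of the 1 + .9D^{-1} example, each complex subchannel entering the product once per dimension:

```python
import math

# MA water-fill sketch (assumed values from the 1 + .9D^-1 example):
# gains g_n per dimension, dimension counts w_n, target of b bits/symbol.
g = [19.945, 17.032, 10.0, 2.968, 0.05525]   # already ordered largest to smallest
w = [1, 2, 2, 2, 1]
b_total = 8.0                                 # b_bar = 1 bit/dimension, N = 8
gamma = 10 ** 0.88                            # 8.8 dB gap for Pe = 1e-6
N_Ex = 8.0

i = len(g)
while i > 0:
    dims = sum(w[:i])
    # K_ma = Gamma * (2^{2b} / prod g_n)^{1/dims}, product weighted by dimension
    log_prod = sum(w[n] * math.log2(g[n]) for n in range(i))
    K = gamma * 2 ** ((2 * b_total - log_prod) / dims)
    if K - gamma / g[i - 1] < 0:   # energy test on the weakest used subchannel
        i -= 1
    else:
        break

E = [w[n] * (K - gamma / g[n]) for n in range(i)]   # subchannel energies
margin = N_Ex / sum(E)
print(round(K, 4), round(10 * math.log10(margin), 1))
```

Run on these values, the loop eliminates the Nyquist tone on the first pass and settles at K_ma ≈ 4.07 with a margin of about −4.1 dB, the numbers of the example that follows.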
Figure 4.9 illustrates margin-adaptive water-filling. Figure 4.9 recognizes that the numerator and denominator terms in Equation (4.63) could be quite large while the quotient remains reasonable, so logarithms are used to reduce the precision requirements inside the loop.
EXAMPLE 4.3.2 Returning to the 1 + .9D^{-1} example, the MA water-fill iterations for a gap of Γ = 8.8 dB for a P_e = 10^{-6} are:

1. iteration i = 5 subchannels

K_ma = 10^{.88} · ( 2^{16} / (19.945 · 17.032² · 10² · 2.968² · .05525) )^{.125} = 6.3228   (4.66)

E_4 = 6.3228 − 10^{.88}/.0552 = −131 < 0  ⇒  i ← 4   (4.67)
2. iteration i = 4 subchannels

K_ma = 10^{.88} · ( 2^{16} / (19.945 · 17.032² · 10² · 2.968²) )^{1/7} = 4.0727   (4.68)

Ē_3 = 4.0727 − 10^{.88}/2.968 = 1.5169 > 0  ⇒  E_3 = 3.0337   (4.69)

E_2 = 6.6283, E_1 = 7.2547, E_0 = 3.6827.

γ_max = 8 / (3.0337 + 6.6283 + 7.2547 + 3.6827) = 1/2.57 = −4.1 dB   (4.70)
The negative margin indicates what we already knew: this channel needs coding or more energy (at least 4.1 dB more) to achieve a P_e = 10^{-6}, even with multitone. However, gap/capacity arguments assume that a code with nearly 9 dB of gain exists. The amount of additional coding necessary is a minimum with multitone (when N → ∞).
4.3.3 Loading with Discrete Information Units

Water-filling algorithms have bit distributions where b_n can be any real number. Realization of bit distributions with non-integer values can be difficult. Alternative loading algorithms allow the computation of bit distributions that are more amenable to implementation with a finite granularity:

Definition 4.3.3 (Information Granularity) The granularity of a multichannel transmission system is the smallest incremental unit of information that can be transmitted, β. The number of bits on any subchannel is then given by

b_n = B_n · β ,   (4.71)

where B_n ≥ 0 is an integer.
Typically, β takes values such as .25, .5, .75, 1, or 2 bit(s), with fractional-bit constellations being discussed in both Chapters 10 and 11.

There are two approaches to finite granularity. The first computes a bit distribution by rounding of approximate water-fill results, and is called Chow's algorithm. The second is based on greedy methods in mathematics and was first suggested by Hughes-Hartog; however, significant improvements, a complete and mathematically verifiable framework, as well as an approach that circumvents many drawbacks in the original Hughes-Hartog methods, have recently been developed independently by Jorge Campello de Souza^{11} and Howard Levin, and are now known as Levin-Campello (LC) Algorithms. These algorithms solve the finite-β loading problem directly, which is not a water-filling solution. This section first describes Chow's algorithm, and then proceeds to the LC methods.

^{11} 379C students take heart, it was his course project!
Chow's Algorithm

Peter Chow studied a number of transmission media with severe intersymbol interference, in particular telephone-line digital transmission known generally as Digital Subscriber Line (DSL), in his 1993 Stanford dissertation. Chow was able to verify that an on/off energy distribution, as long as it uses the same or nearly the same transmission band as water-filling, exhibits negligible loss with respect to the exact water-filling shape. The reader is cautioned not to misinterpret this result as saying that a flat energy distribution is as good as water-filling, but rather on/off: "on" means a flat energy distribution in the same used bands as water-filling, and "off" means zero energy where water-filling would be zero.

Chow's algorithms take as much or more computation than water-fill methods, raising the issue of their utility even if close to optimum. The reason for the use of on/off energy distributions is that practical systems usually have a constraint on the maximum power spectral density along with a total energy/power constraint. Such power-spectral-density constraints are typically flat over the region of bandwidth considered for transmission. Such on/off energy distribution comforts those concerned with the effect of emissions from a transmission channel on other transmission systems or electronic devices.

Using this observation, Chow was able to devise algorithms that approximate water-filling with finite granularity in the bit loading b_n and equal energy E_n = N·Ē_x/N̄. Both the RA and MA problems can be solved with Chow's algorithms, which in both cases begin with Chow's Primer below:
Chow's on/off Loading Primer:

1. Re-order the g_n so that g_1 is the largest.
2. Set b_temp(N̄ + 1) = 0, i = N̄. (b_temp(i) and i are the tentative total bits and number of used subchannels, respectively.)
3. Let E_n = N·Ē_x/i for n = 1, ..., i, and then SNR_n = g_n·E_n. (When power-spectral-density limits apply, the energy level is set at the PSD limit and i is reset to i + 1, or all subchannels are set at the limit if i = 0, making one last pass through steps 4 and 5.)
4. Compute b_temp(i) = Σ_{n=1}^{i} (1/2)·log2(1 + SNR_n/Γ).
5. If b_temp(i) < b_temp(i + 1), then keep the bit distribution corresponding to b_temp(i + 1); otherwise i ← i − 1 and go to step 3.

The bit distribution produced by Chow's Primer will likely contain many subchannels with non-integer numbers of bits.
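The primer can be sketched directly; the channel SNRs and dimension counts below are the assumed values of the 1 + .9D^{-1} example worked next:

```python
import math

# Chow on/off primer sketch for the 1 + .9D^-1 example (assumed values).
g = [19.9448, 17.032, 10.0, 2.968, 0.05525]   # ordered, g[0] largest
w = [1, 2, 2, 2, 1]                            # dimensions per subchannel
N_Ex, gamma = 8.0, 1.0                         # total energy, 0 dB gap

best, b_prev = None, 0.0
i = len(g)
while i > 0:
    dims = sum(w[:i])
    E = N_Ex / dims                            # flat energy per used dimension
    snr = [g[n] * E for n in range(i)]
    b_tmp = sum(w[n] * 0.5 * math.log2(1 + snr[n] / gamma) for n in range(i))
    if b_tmp < b_prev:
        break                                  # keep the previous (larger) rate
    best, b_prev = list(snr), b_tmp
    i -= 1

print(round(b_prev, 4))   # bits per symbol of the kept distribution
```

On these values the loop computes 11.85, then 12.41, then 11.43 bits/symbol, stopping with the 4-subchannel distribution, exactly the three passes of the example below.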
EXAMPLE 4.3.3 (1 + .9D^{-1}) The algorithm above can be executed for our example with Γ = 0 dB. The 3 passes necessary for solution are:

Pass i = 4:  Ē_n = 1,  n = 0, 1, 2, 3, 4.

SNR_0 = 1.9²/.181 = 19.9448   (4.72)
SNR_1 = 1.7558²/.181 = 17.032   (4.73)
SNR_2 = 1.3454²/.181 = 10.000   (4.74)
SNR_3 = .7330²/.181 = 2.968   (4.75)
SNR_4 = .1²/.181 = .0552   (4.76)

b_temp(4) = .5·log2(1 + 19.9448) + log2(1 + 17.032) + log2(1 + 10) + log2(1 + 2.968) + .5·log2(1.0552) = 11.8533 .   (4.77)
Pass i = 3:  Ē_n = 8/7,  n = 0, 1, 2, 3.

SNR_0 = 8·1.9²/(7·.181) = 22.794   (4.78)
SNR_1 = 8·1.7558²/(7·.181) = 19.4651   (4.79)
SNR_2 = 8·1.3454²/(7·.181) = 11.4286   (4.80)
SNR_3 = 8·.7330²/(7·.181) = 3.3920   (4.81)

b_temp(3) = .5·log2(23.794) + log2(20.4651) + log2(12.4286) + log2(4.3920) = 12.4118 .   (4.82)
Pass i = 2:  Ē_n = 8/5,  n = 0, 1, 2.

SNR_0 = 8·1.9²/(5·.181) = 31.9116   (4.83)
SNR_1 = 8·1.7558²/(5·.181) = 27.2512   (4.84)
SNR_2 = 8·1.3454²/(5·.181) = 16.000   (4.85)

b_temp(2) = .5·log2(32.9116) + log2(28.2512) + log2(17) = 11.428 < 12.4118 .   (4.86)

STOP and use the i = 3 result.
The algorithm has terminated with the computation of b = 12.4118, or b̄ = 1.55 bits/dimension, the known capacity of this channel for Γ = 0 dB to almost 3-digit accuracy, confirming on this example that on/off energy is essentially as good as water-filling energy shaping. The key in both Chow and water-filling methods is that the tone at Nyquist is not used; its use corresponds to about a 10% loss in data rate on this example.

To illustrate, one could instead reorder the tones from smallest to largest in terms of channel SNR; the 5 passes necessary for solution then follow:
Pass 1:  SNR = 160
b_temp(1) = .5·log2(1 + 160/1) = 3.7
b̄_temp = b_temp/8 = .46

Pass 2:  SNR_0 = 53.2, SNR_1 = 45.5
b_temp(2) = 2.9 + 2(2.75) = 8.4
b̄_temp = b_temp/8 = 1.1

Pass 3:  SNR_0 = 31.9, SNR_1 = 27.3, SNR_2 = 16.0
b_temp(3) = 2.5 + 2(2.4) + 2(2.0) = 11.3
b̄_temp = b_temp/8 = 1.42

Pass 4:  SNR_0 = 22.8, SNR_1 = 19.5, SNR_2 = 11.4, SNR_3 = 3.4
b_temp(4) = 2.3 + 2(2.1) + 2(1.8) + 2(1.05) = 12.4
b̄_temp = b_temp/8 = 1.55
Figure 4.10: Illustration of the sawtooth energy distribution characteristic of Chow's Algorithm.
Pass 5:  SNR_0 = 20.0, SNR_1 = 17.1, SNR_2 = 10.0, SNR_3 = 3.0, SNR_4 = .055
b_temp(5) = 2.2 + 2(2.1) + 2(1.7) + 2(1) = 11.8
b̄_temp = b_temp/8 = 1.48
Again, the correct solution was obtained, but with a larger number of steps. Designers sometimes run the algorithm backwards if they expect less than half the subchannels to be used; computation is then saved.
Chow's RA Algorithm  To solve the RA problem, the following 3 steps adjust the result of Chow's Primer (with steps 6 and 7 being executed for all used subchannels):

6. If b_n rounds down to the nearest multiple of β, then reduce E_n so that the new rounded value satisfies b_n = (1/2)·log2(1 + SNR_n/Γ).
7. If b_n rounds up to the nearest multiple of β, then increase E_n so that performance is the same as on the other subchannels, again with b_n = (1/2)·log2(1 + SNR_n/Γ).
8. Compute the total energy E = Σ_n E_n and then scale all subchannels by the factor N·Ē_x/E.
This procedure produces a characteristic "sawtooth" energy distribution that may deviate from a flat distribution by as much as 6/N̄ dB, where N̄ is the dimensionality of the subchannel. Figure 4.10
illustrates the sawtooth energy produced by Chow's algorithm for the SNR(f) curve simultaneously displayed. The discontinuities are points at which the number of bits on a subchannel changes by the smallest information unit β. An increase in b_n by β means more energy is necessary, while a decrease in b_n by β means less energy is necessary. The curve, however, on rough average approximates a flat straight line where energy is on.
EXAMPLE 4.3.4 Returning to the earlier 1 + .9D^{-1} example, rounding the bit distribution with β = 1 bit leads to

2 ← 2.28 = b_0 = .5·log2(23.794)   (4.87)
4 ← 4.355 = b_1 = log2(20.4651)   (4.88)
4 ← 3.6356 = b_2 = log2(12.4286)   (4.89)
2 ← 2.13 = b_3 = log2(4.3920)   (4.90)
The maximum data rate is then 12 bits per symbol, or 1.5 bits/dimension. The new energy distribution is

.7520 ← E_0 = (8/7)·(2^{2·2} − 1)/(2^{2·2.28} − 1)   (4.91)
1.7614 ← E_1 = (16/7)·(2^4 − 1)/(2^{4.355} − 1)   (4.92)
3.000 ← E_2 = (16/7)·(2^4 − 1)/(2^{3.6356} − 1)   (4.93)
2.0215 ← E_3 = (16/7)·(2^2 − 1)/(2^{2.1349} − 1)   (4.94)

The total energy is then E_x = 7.5349, which means that the energies may all be increased by the factor 8/7.5349 so that the total energy budget is maintained. The loss in performance caused by quantization in this example is 10·log10[ (2^3 − 1)/(2^{3.1} − 1) ] + 10·log10(8/7.5349), or −.08 dB, pretty small!
The example's use of β = 1 for the complex subchannels was essentially equivalent to a bits/dimension granularity of 1/2 bit. The granularity thus need not be the same on all subchannels as long as each subchannel is handled consistently within the constraints of the granularity of its specific encoding scheme.
EXAMPLE 4.3.5 (DSL rate adaptation) A number of standards (ANSI T1.413, ITU G.992.1 (G.dmt), and G.992.2 (G.lite)) use a form of multitone described more fully in Section 4.5. These are all generally known as Digital Subscriber Lines (DSLs) and allow high-speed digital transmission on phone lines with one modem in the telephone company central office and the other at the customer. These all have a mode of operation for internet access where rate-adaptive loading is used. The symbol rate is effectively 4000 Hz in these methods, meaning the tone width is effectively 4000 Hz. The granularity in the standards is β = 1 bit per two-dimensional subchannel. The T1.413 and G.992.1 standards use 256 tones in one direction of transmission known as "downstream" (G.992.2 uses 128) and 32-64 tones upstream, the asymmetry being a match to the usually asymmetric file-transfer and acknowledgment packet-length ratios in IP traffic for internet use.

Thus the data rate can be any multiple of 4 kbps with RA loading. Usually in these systems, the gap is intentionally increased by 6 dB, which is equivalent to forcing 6 dB of margin when RA loading completes. With nominally 100-250 tones being used and average numbers of bits per tone of 2-9 bits, data rates from 1 to 9 Mbps are possible. Because of a byte-oriented FEC code, the RA output may be further rounded down on the subchannels otherwise rounded up until the total number of bits is a multiple of 8 (to ensure an integer number of bytes).
Chow's MA Loading algorithm  Chow's MA algorithm first computes a margin, from the data rate given by the Chow Primer, with respect to the desired fixed rate. It then recomputes the bit distribution including that margin, which will make the number of bits exactly equal to the desired number. However, the number of bits will still not be integer multiples of β in general. The resultant bit distribution is then rounded as in the RA algorithm, but the sum of the number of bits is monitored and rounding is adjusted to make the sum exactly equal to the desired number of bits per symbol.

The steps that follow the Chow Primer to complete the MA solution are:
6. Compute

SNR_{m,u} = Γ·[ ( Π_{n=1}^{i} (1 + SNR_n/Γ) )^{1/i} − 1 ] = 2^{2·b̄_max} − 1 ,   (4.95)
7. Compute the tentative margin

γ_{temp,m} = SNR_{m,u} / ( 2^{2b̄} − 1 ) ,   (4.96)
8. Compute the updated bit distribution

B_{temp,n} = (1/2)·log2( 1 + SNR_n/(Γ·γ_{temp,m}) ) ,  n = 1, ..., N̄ ,   (4.97)
9. Round B_{temp,n} to the nearest integer (if more than one β is being used for different subchannels, then use the correct value for each subchannel) and recompute energy as

E_{temp,n} = Γ·( 2^{B_n·β} − 1 ) / g_n ,  n = 1, ..., i ,   (4.98)
10. Compute

B_temp = Σ_{n=1}^{N̄} B_{temp,n} ;  b_temp = B_temp·β   (4.99)

and check to see if b_temp = b, the desired fixed number of bits per symbol. If not, then select subchannels that were rounded (perhaps by choosing those close to one-half information-granularity unit) and round the other way until the data rate is correct, recomputing energies as in step 9.
11. Compute the energy scaling factor

f_e = N·Ē_x / Σ_{n=1}^{N̄} E_n   (4.100)

and scale all subchannel energies to

E_n ← f_e·E_n ,   (4.101)

and the maximum margin becomes

γ_max = γ_{temp,max}·f_e .   (4.102)
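Steps 6-11 can be sketched for the running example. The primer rate b_max = 12.4118 and Γ = 0 dB are assumed below; carrying full precision (rather than the two-digit rounded bit values displayed in the example that follows) gives slightly different recomputed energies, but the same rounded bits and nearly the same final margin:

```python
import math

# Chow MA sketch (assumed values): used subchannels of the 1 + .9D^-1 example,
# flat energy 8/7 per dimension, gap 0 dB, target b = 8 bits/symbol.
g = [19.945, 17.032, 10.0, 2.968]
w = [1, 2, 2, 2]
snr = [gn * 8 / 7 for gn in g]

b_max = 12.4118                                   # bits/symbol from the primer
snr_mu = 2 ** (2 * b_max / 8) - 1                 # step 6: multichannel SNR
margin = snr_mu / (2 ** (2 * 1) - 1)              # step 7: vs b_bar = 1

b = [w[n] * 0.5 * math.log2(1 + snr[n] / margin) for n in range(4)]  # step 8
B = [round(bn) for bn in b]                       # step 9: round (beta = 1)
assert sum(B) == 8                                # step 10: bit total is correct

# step 9 energy recompute: scale flat energy by the constellation-size ratio
E = [(8 / 7) * w[n] * (2 ** (2 * B[n] / w[n]) - 1) / (2 ** (2 * b[n] / w[n]) - 1)
     for n in range(4)]
f_e = 8 / sum(E)                                  # step 11: energy scaling
print(B, round(sum(E), 4), round(10 * math.log10(margin * f_e), 1))
```

The rounded distribution is [2, 3, 2, 1] and the final margin is about 4.5 dB, matching the example below.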
EXAMPLE 4.3.6 (MA loading for the 1 + .9D^{-1} channel) The first step in Chow's MA procedure finds SNR_{m,u} = 8.8 dB, and the margin with respect to transmission at b̄ = 1 is then 10^{.88}/(2^2 − 1) = 4.0 dB. The updated bit distribution computed from step 8 of Chow's MA loading is

b_0 = (1/2)·log2( 1 + (8/7)·19.945/10^{.4} ) = 1.7   (4.103)
b_1 = log2( 1 + (8/7)·17.032/10^{.4} ) = 3.1   (4.104)
b_2 = log2( 1 + (8/7)·10/10^{.4} ) = 2.47   (4.105)
b_3 = log2( 1 + (8/7)·2.968/10^{.4} ) = 1.2 .   (4.106)
For β = 1, this bit distribution rounds to b_0 = 2, b_1 = 3, b_2 = 2, and b_3 = 1. The number of bits sums to 8 exactly, and so the procedure concludes. The energy distribution then changes to
E_0 ← 1.7939 = (8/7)·(2^{2·2} − 1)/(2^{2·1.7} − 1)   (4.107)
E_1 ← 2.1124 = (16/7)·(2^3 − 1)/(2^{3.1} − 1)   (4.108)
E_2 ← 1.5102 = (16/7)·(2^2 − 1)/(2^{2.47} − 1)   (4.109)
E_3 ← 1.7618 = (16/7)·(2^1 − 1)/(2^{1.2} − 1)   (4.110)
Total energy is then 7.1783, and all subchannels' energies can be scaled by the factor 8/7.1783, leading to an additional margin of about .5 dB, so that the total is then 4.5 dB when a system with a powerful code (say, a gap of 0 dB at P_e = 10^{-6}) is used to transmit on the 1 + .9D^{-1} channel at b̄ = 1 and the same P_e as the gap used in loading.
Had β = .5 for these subchannels, then the bit distribution would have been b_0 = 1.5, b_1 = 3, b_2 = 2.5, and b_3 = 1, which still adds to 8. However, had β = .5 only for subchannel one, then the rounding would have produced b = 8.5, and so subchannel 0 would then need to be rounded down to 2 bits anyway.
Optimum Discrete Loading Algorithms

Optimum discrete loading algorithms recognize the discrete loading problem as an instance of what is known as "greedy" optimization in mathematics. The basic concept is that each increment of additional information to be transported by a multi-channel transmission system is placed on the subchannel that would require the least incremental energy for its transport. Such algorithms are optimum for loading when the information granularity β is the same for all subchannels, which is usually the case.^{12}

Several definitions are necessary to formalize discrete loading.
Definition 4.3.4 (Bit distribution vector) The bit distribution vector for a set of parallel subchannels that in aggregate carry b total bits of information is given by

b = [ b_1  b_2  ...  b_{N̄} ] .   (4.111)

The sum of the elements in b is clearly equal to the total number of bits transmitted, b.
Discrete loading algorithms can use any monotonically increasing relation between transmit symbol energy and the number of bits transmitted on any subchannel. This function can be different for each subchannel, and there need not be a constant gap used. In general, the concept of incremental energy is important to discrete loading:
Definition 4.3.5 (Incremental Energy and Symbol Energy) The symbol energy E_n for an integer number of information units b_n = B_n·β on each subchannel can be notationally generalized to the energy function

E_n → E_n(b_n) ,   (4.112)
^{12} The 1 + .9D^{-1} channel example has a PAM subchannel at DC for which this chapter has previously used a granularity of 1 bit per dimension, while the other subchannels were QAM and used a granularity of 1 bit per two dimensions, or equivalently 1/2 bit per dimension. This anomaly rarely occurs in practice because DC is almost never passed in transmission channels, and complex-baseband channels use a two-dimensional QAM subchannel at the baseband-equivalent of DC.
where the symbol energy's dependence on the number of information units transmitted, b_n, is explicitly shown. The incremental energy to transmit b_n information units on a subchannel is the amount of additional energy required to send the B_n-th information unit with respect to the (B_n − 1)-th information unit (that is, one more unit of β). The incremental energy is then

e_n(b_n) = E_n(b_n) − E_n(b_n − β) .   (4.113)
EXAMPLE 4.3.7 (Incremental Energies) As an example, uncoded SQ QAM with β = 1 bit per two dimensions, and minimum input distance between points on subchannel n being d_n, has energy function

E(b_n) = (2^{b_n} − 1)/6 · d_n²    for b_n even,
E(b_n) = (2^{b_n+1} − 1)/12 · d_n²   for b_n odd.   (4.114)

The incremental energy is then

e_n(b_n) = (2^{b_n} − 1)/12 · d_n²   for b_n even,
e_n(b_n) = (2^{b_n} + 1)/12 · d_n²   for b_n odd.   (4.115)
For large numbers of bits, the incremental energy for SQ QAM is approximately twice the amount of energy needed for the previous constellation, or 3 dB per bit, as is well known to the reader of Chapter 1. A coded system with β ≠ 1 might have a more complicated exact expression that perhaps is most easily represented by tabulation in general. For instance, the table for uncoded SQ QAM is

b_n       |  0  |    1    |    2    |    3     |    4     |     5     |     6     |     7     |     8
E_n(b_n)  |  0  | d_n²/4  | d_n²/2  | 5d_n²/4  | 5d_n²/2  | 21d_n²/4  | 21d_n²/2  | 85d_n²/4  | 85d_n²/2
e_n(b_n)  |  0  | d_n²/4  | d_n²/4  | 3d_n²/4  | 5d_n²/4  | 11d_n²/4  | 21d_n²/4  | 43d_n²/4  | 85d_n²/4
d in this table is selected for a given desired probability of error (and the gap approximation is thus avoided, since it is an approximation).^{13}
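The table can be generated, and checked against (4.114)-(4.115), with exact rational arithmetic; this sketch works in units of d_n²:

```python
from fractions import Fraction

# Energy and incremental-energy functions for uncoded SQ QAM (beta = 1),
# in units of d_n^2, per (4.114)-(4.115).
def E(b):
    if b == 0:
        return Fraction(0)
    if b % 2 == 0:
        return Fraction(2 ** b - 1, 6)
    return Fraction(2 ** (b + 1) - 1, 12)

def e(b):
    return E(b) - E(b - 1)

for b in range(9):
    print(b, E(b), e(b) if b > 0 else 0)
```

Every entry of the printed table agrees with the closed forms above: the differences E(b) − E(b − 1) reduce to (2^b − 1)/12 for even b and (2^b + 1)/12 for odd b.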
Alternatively, an energy function for QAM with β = 1 could also be defined via the gap approximation as

E_n(b_n) = 2·(Γ/g_n)·( 2^{b_n} − 1 ) .   (4.116)
The incremental energy is then

e_n(b_n) = (Γ/g_n)·( 2·2^{b_n} − 2^{b_n} )   (4.117)
         = (Γ/g_n)·2^{b_n}·(2 − 1)   (4.118)
         = (Γ/g_n)·2^{b_n}   (4.119)
         = 2·e_n(b_n − 1) ,   (4.120)
which is again approximately 3 dB per bit. If β = .25, then the gap-based incremental QAM energy formula would be

e_n(b_n) = (Γ/g_n)·2^{b_n+.75}·( 2^{.25} − 1 ) ,   (4.121)

and in general for gap-based QAM

e_n(b_n) = (Γ/g_n)·2^{b_n+1−β}·( 2^{β} − 1 ) .   (4.122)
^{13} The SNR_{m,u} in this case of no gap is obtained by calculating the usual formula in (4.17) with Γ = 1.
Once an encoding system has been selected for the given fixed β and for each of the subchannels in a multichannel transmission system, the incremental energy for each subchannel can be tabulated for use in loading algorithms.
Clearly, there are many possible bit distributions that all sum to the same total b. A highly desirable property of a distribution is efficiency:

Definition 4.3.6 (Efficiency of a Bit Distribution) A bit distribution vector b is said to be efficient for a given granularity β if

max_n [ e_n(b_n) ] ≤ min_m [ e_m(b_m + β) ] .   (4.123)

Efficiency means that there is no movement of a bit from one subchannel to another that reduces the symbol energy. An efficient bit distribution clearly solves the MA loading problem for the given total number of bits b. An efficient bit distribution also solves the RA loading problem for the energy total that is the sum of the energies E_n(b_n).
Levin and Campello have independently formalized an iterative algorithm that will translate any bit distribution into an efficient bit distribution:

Levin-Campello (LC) Efficientizing (EF) Algorithm

1. m ← argmin_{1≤i≤N̄} [ e_i(b_i + β) ] ;
2. n ← argmax_{1≤j≤N̄} [ e_j(b_j) ] ;
3. While e_m(b_m + β) < e_n(b_n) do
   (a) b_m ← b_m + β
   (b) b_n ← b_n − β
   (c) recompute m and n as in steps 1 and 2.
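The EF loop can be sketched under the gap approximation. The gains, dimension counts, and 8.8 dB gap below are the assumed values of the running 1 + .9D^{-1} example; the 2-D subchannels use the energy function of (4.116), and the 1-D DC/Nyquist tones use the analogous E(b) = Γ(2^{2b} − 1)/g:

```python
# LC efficientizing (EF) sketch with gap-based incremental energies (beta = 1).
# Assumed: the 1 + .9D^-1 example with an 8.8 dB gap.
GAMMA = 10 ** 0.88
g = [19.945, 17.032, 10.0, 2.968, 0.05525]
w = [1, 2, 2, 2, 1]

def E(n, b):
    # 1-D: Gamma*(2^(2b)-1)/g ; 2-D: 2*Gamma*(2^b-1)/g, both via one expression
    return w[n] * GAMMA * (2 ** (2 * b / w[n]) - 1) / g[n]

def e(n, b):          # incremental energy of the b-th bit on subchannel n
    return E(n, b) - E(n, b - 1)

def efficientize(b):
    while True:
        m = min(range(5), key=lambda i: e(i, b[i] + 1))
        n = max((i for i in range(5) if b[i] > 0), key=lambda i: e(i, b[i]))
        if e(m, b[m] + 1) >= e(n, b[n]):
            return b
        b[m] += 1
        b[n] -= 1

print(efficientize([0, 8, 0, 0, 0]))
```

Starting from the deliberately inefficient distribution [0 8 0 0 0], the loop migrates one β at a time and terminates at [2 3 2 1 0], which satisfies (4.123).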
The dual constraint on energy is E-tightness: a bit distribution b is E-tight if

0 ≤ N·Ē_x − Σ_{n=1}^{N̄} E_n(b_n) < min_{1≤i≤N̄} [ e_i(b_i + β) ] .   (4.126)

E-tightness implies that no additional unit of information can be carried without violation of the total energy constraint.
To make a bit distribution E-tight, one uses the Levin-Campello E-tightening (ET) algorithm below:

1. Set S = Σ_{n=1}^{N̄} E_n(b_n);
2. WHILE ( N·Ē_x − S < 0 ) or ( N·Ē_x − S ≥ min_{1≤i≤N̄} [ e_i(b_i + β) ] ):
   IF N·Ē_x − S < 0 THEN
     (a) n ← argmax_{1≤i≤N̄} [ e_i(b_i) ] ;
     (b) S ← S − e_n(b_n)
     (c) b_n ← b_n − β
   ELSE
     (a) m ← argmin_{1≤i≤N̄} [ e_i(b_i + β) ] ;
     (b) S ← S + e_m(b_m + β)
     (c) b_m ← b_m + β
The ET algorithm reduces the number of bits when the energy exceeds the limit. The ET algorithm adds additional bits in the least energy-consumptive places when energy is sufficiently below the limit.
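A sketch of ET, using the same assumed gap-based tables as the EF sketch and an energy budget of N·Ē_x = 8:

```python
# LC E-tightening (ET) sketch (assumed values: 1 + .9D^-1 example, 8.8 dB gap).
GAMMA = 10 ** 0.88
g = [19.945, 17.032, 10.0, 2.968, 0.05525]
w = [1, 2, 2, 2, 1]
N_Ex = 8.0

def E(n, b):
    return w[n] * GAMMA * (2 ** (2 * b / w[n]) - 1) / g[n]

def e(n, b):
    return E(n, b) - E(n, b - 1)

def e_tighten(b):
    S = sum(E(n, b[n]) for n in range(5))
    while (N_Ex - S < 0) or (N_Ex - S >= min(e(i, b[i] + 1) for i in range(5))):
        if N_Ex - S < 0:                      # over budget: drop the costliest bit
            n = max((i for i in range(5) if b[i] > 0), key=lambda i: e(i, b[i]))
            S -= e(n, b[n]); b[n] -= 1
        else:                                 # under budget: add the cheapest bit
            m = min(range(5), key=lambda i: e(i, b[i] + 1))
            S += e(m, b[m] + 1); b[m] += 1
    return b, S

b, S = e_tighten([2, 3, 2, 1, 0])
print(b, round(S, 2))
```

With these tables, the removal order and intermediate energies track the example below step for step, ending at b = [1 2 1 0 0] with S ≈ 5.33.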
EXAMPLE 4.3.9 (ET algorithm on EF output in example above) If the initial bit distribution vector for the above example were the EF solution b = [2 3 2 1 0], the sequence of ET steps leads to:

1. b = [2 3 2 0 0] with E ≈ 16.49
2. b = [1 3 2 0 0] with E ≈ 11.92
3. b = [1 2 2 0 0] with E ≈ 8.36
4. b = [1 2 1 0 0] with E ≈ 5.32

Thus, the data rate is halved in this example for E-tightness, and the resultant margin is 8/5.32 = 1.8 dB.
If the ET algorithm is applied to an efficient distribution, the RA problem is solved:
The Levin-Campello RA Algorithm

1. Choose any b.
2. Make b efficient with the EF algorithm.
3. E-tighten the resultant b with the ET algorithm.
The Levin-Campello MA Solution  The dual of E-tightness for MA loading is B-tightness:

Definition 4.3.8 (B-tightness) A bit distribution b with granularity β and total bits per symbol b is said to be B-tight if

b = Σ_{n=1}^{N̄} b_n .   (4.127)
B-tightness simply states that the correct number of bits is being transmitted.

A B-tightening (BT) algorithm is

1. Set b̂ = Σ_{n=1}^{N̄} b_n
2. WHILE b̂ ≠ b:
   IF b̂ > b THEN
     (a) n ← argmax_{1≤i≤N̄} [ e_i(b_i) ] ;
     (b) b̂ ← b̂ − β
     (c) b_n ← b_n − β
   ELSE
     (a) m ← argmin_{1≤i≤N̄} [ e_i(b_i + β) ] ;
     (b) b̂ ← b̂ + β
     (c) b_m ← b_m + β
EXAMPLE 4.3.10 (Bit tightening on the 1 + .9D^{-1} channel) Starting from a distribution of b = [0 0 0 0 0], the BT algorithm steps produce:

1. b = [0 0 0 0 0]
2. b = [0 1 0 0 0]
3. b = [1 1 0 0 0]
4. b = [1 1 1 0 0]
5. b = [1 2 1 0 0]
6. b = [1 2 2 0 0]
7. b = [1 3 2 0 0]
8. b = [2 3 2 0 0]
9. b = [2 3 2 1 0]

The margin for this distribution has previously been found as −4.3 dB because it is the same distribution produced in the Chow method. The total energy is 21.6.
The BT algorithm reduces the bit rate when it is too high, removing bits where the energy reduction per unit of information is greatest, and increases the bit rate when it is too low, adding bits where the energy increase per unit of information is least. If the BT algorithm is applied to an efficient distribution, the MA problem is solved.
The Levin-Campello MA Algorithm

1. Choose any b.
2. Make b efficient with the EF algorithm.
3. B-tighten the resultant b with the BT algorithm.
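With an all-zero (trivially efficient) starting distribution, the MA solution reduces to the greedy BT fill. A sketch under the same assumed gap-based tables as the earlier EF/ET sketches:

```python
# LC MA sketch: B-tighten (greedy fill) an all-zero distribution to b = 8 bits,
# using the gap-based incremental-energy tables assumed in the earlier examples.
GAMMA = 10 ** 0.88
g = [19.945, 17.032, 10.0, 2.968, 0.05525]
w = [1, 2, 2, 2, 1]

def E(n, b):
    return w[n] * GAMMA * (2 ** (2 * b / w[n]) - 1) / g[n]

def e(n, b):
    return E(n, b) - E(n, b - 1)

def ma_load(b_target):
    b = [0] * 5                 # all-zero start is already efficient
    while sum(b) < b_target:    # BT: add each bit where it costs least
        m = min(range(5), key=lambda i: e(i, b[i] + 1))
        b[m] += 1
    return b

b = ma_load(8)
print(b, round(sum(E(n, b[n]) for n in range(5)), 1))
```

The additions occur in exactly the order of Example 4.3.10, ending at b = [2 3 2 1 0] with total energy about 21.6.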
Some transmission systems may use coding, which means that there could be redundant bits. Ideally, one could compute the incremental-energy table with the effect of redundancy included in the incremental-energy entries. This can be quite difficult to compute. Generally, more redundant bits means higher gain but also allocation to increasingly more difficult subchannel positions, which at some number of redundant bits becomes a coding loss.
4.3.4 Sorting and Run-time Issues

Campello, in pursuit of the optimized discrete loading algorithm, studied further the issue of computational complexity in loading algorithms. Mathematically, a sort of the vector b can be denoted

b_sort = J·b   (4.128)

where J is a matrix with one and only one nonzero unit value in each and every row and column. For instance, the N = 3 matrix

J = [ 0 0 1 ;
      1 0 0 ;
      0 1 0 ]   (4.129)
takes b_3 and moves it to the first position (3 → 1), moves the first position to the second position (1 → 2), and finally places the second position last (2 → 3). To unsort,

b = J^T · b_sort   (4.130)

since J^{-1} = J^T for any permutation matrix.
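The sort and unsort of (4.128)-(4.130) can be checked with the N = 3 example matrix:

```python
# Sort/unsort with a permutation matrix, per (4.128)-(4.130) (N = 3 example).
J = [[0, 0, 1],
     [1, 0, 0],
     [0, 1, 0]]

def matvec(M, v):
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]

def transpose(M):
    return [list(row) for row in zip(*M)]

b = [10, 20, 30]
b_sort = matvec(J, b)                   # [30, 10, 20]: b_3 -> first, 1 -> 2, 2 -> 3
b_back = matvec(transpose(J), b_sort)   # J^T = J^-1 undoes the sort
print(b_sort, b_back)
```

In practice an index permutation (e.g., an argsort) replaces the explicit matrix multiply; the matrix form above is only the notation of (4.128).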
Each subchannel's one-tap normalizer W_n can be adapted with the ZF update

W_{n,k+1} = W_{n,k} + μ·Û_n·X̂*_n ,   (4.133)

which produces an unbiased estimate of the signal at the normalizer output and converges when

0 < μ < 2/Ē_x .   (4.134)
Figure 4.12:  Illustration of reverse channel in bit-swapping.
Equivalently, at convergence E[Û_n·X̂*_n] contains only noise. The more common MMSE algorithm

W_{n,k+1} = W_{n,k} + μ·Û_n·Y*_n ,   (4.135)

instead produces a biased estimate and an optimistic SNR estimate (the Chapter 3 reader will recall SNR_n + 1). The ZF algorithm is preferred because there is no issue with noise enhancement here on this one-tap normalizer. The factor μ is a positive gain constant for the adaptive algorithm, selected to balance time-variation versus noise averaging.
The channel gain is estimated by

H_n = d / (d_n·W_n) .   (4.136)
The mean-square noise, assuming ergodicity at least over a reasonable time period, is estimated by time-averaging the quantity |Û_n|²,

σ̂²_{n,k+1} = (1 − μ_t)·σ̂²_{n,k} + μ_t·|Û_n|² ,   (4.137)

where μ_t is another averaging constant and

0 ≤ μ_t < 2 .   (4.138)
Since this noise was scaled by W_n also, the estimate of g_n is then

ĝ_n = |H_n|² / ( σ̂²_n / |W_n|² ) = (d²/d_n²)·(1/σ̂²_n) .   (4.139)
Since the factor d²/d_n² is constant and the loading algorithm need only know the relative change in incremental energy, the incremental energy table is scaled by

ĝ_{n,old} / ĝ_{n,new} ,   (4.140)

and the new incremental-energy entries can be estimated by simply scaling up or down the old values by the relative increase in the normalizer noise.
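A toy simulation of the one-tap normalizer update (4.133) and the noise tracker (4.137); the channel gain, step sizes, and noise level are assumed values, and the sketch is real-valued for brevity:

```python
import random

# One-tap normalizer sketch per (4.133) and (4.137) (assumed toy values):
# subchannel output Y = H*X + noise; the ZF update drives W toward 1/H.
random.seed(0)
H, mu, mu_t = 0.6, 0.05, 0.01
W, sigma2 = 1.0, 0.0

for _ in range(2000):
    X = random.choice([-1.0, 1.0])        # known/decided subsymbol
    Y = H * X + random.gauss(0, 0.01)     # small noise so decisions stay correct
    U = X - W * Y                         # error at the normalizer output
    W = W + mu * U * X                    # ZF update, real-valued sketch of (4.133)
    sigma2 = (1 - mu_t) * sigma2 + mu_t * U * U   # noise-power estimate (4.137)

print(round(W * H, 2))
```

After convergence W·H ≈ 1, so the channel-gain estimate of (4.136) and the noise estimate σ̂² together give the relative ĝ_n needed for table rescaling per (4.140).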
4.3.6 Bit-Swapping
If the loading algorithm moves or "swaps" a bit from one tone to another, then the multi-channel transmitter and receiver need to implement this bit-swap on the same symbol. A bit-swap is usually implemented by the receiver sending a pair of tone indices n (tone where a bit is deleted) and m (tone where a bit is added) through some reverse channel to the transmitter, as in Figure 4.12. There is often also an associated change in transmitted energy on each tone:

G_i = E_i(new) / E_i(old)   for i = m, n .   (4.141)
The energy for the bit-swap tones i = m or i = n can be recomputed by summing the new increments in the table,

E_i(new) = ( ĝ_{i,old}/ĝ_{i,new} ) · Σ_{j=1}^{b_i} e_{i,old}(j) .   (4.142)
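For illustration, the gain factors of (4.141) can be computed from the gap-based 2-D energy function (4.116). The tone indices and bit counts below are assumed, and the ĝ_{old}/ĝ_{new} rescaling of (4.142) is omitted, so these factors isolate the constellation-size term only:

```python
# Gain factors for a bit-swap, per (4.141) (gap-based 2-D energy function,
# assumed values: a bit moves from tone n with b=3 to tone m with b=2).
GAMMA = 10 ** 0.88
def E(gain, b):                     # 2-D QAM energy, per (4.116)
    return 2 * GAMMA * (2 ** b - 1) / gain

g_n, g_m = 17.032, 10.0
G_n = E(g_n, 2) / E(g_n, 3)         # tone losing a bit: (2^2-1)/(2^3-1) = 3/7
G_m = E(g_m, 3) / E(g_m, 2)         # tone gaining a bit: 7/3
print(round(G_n, 3), round(G_m, 3))
```

Here G_n·G_m = 1 exactly, since the two tones exchange the same constellation sizes; with the channel-SNR rescaling of (4.142) included, each factor individually moves toward unity, as described next.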
Typically this factor is near unit value, and usually less than a 3 dB change for two-dimensional subchannels like those of MT (and less than a 6 dB change for one-dimensional subchannels). The near-unity value occurs because tone n's factor of decrease in channel SNR, ĝ_{n,old}/ĝ_{n,new}, is about 2 and nearly the reciprocal of the constellation-energy decrease when halving the constellation size on that tone n. Similarly, tone m's increase in channel SNR by about 2 is nearly the reciprocal of the doubling of energy caused when tone m's constellation is doubled in size in the swap. The nearly, but not exactly, unit gain factor G_i should also be conveyed through the reverse channel to the transmitter. Some implementations thus ignore this factor if it is always close to one.
Because the two gain factors may produce a total energy at the transmitter that is not exactly equal to the allowed energy limit, the total energy may be reset also. The total energy will always decrease if the loading algorithm is functioning properly. After execution of a swap, both the transmitter and receiver would know by previous agreement whether the transmitter plans to increase total energy (which will be slightly reduced by a swap) back to the maximum allowed N·Ē_x. If so, then all the entries in the incremental-energy table could also be scaled by this same factor to be accurate (but it will not cause any future swaps to be different if the table is not scaled). Furthermore, all the multi-channel normalizer coefficients need to be reduced by the inverse square root of this factor.
When a swap is executed, the values for W_m and W_n and the corresponding noise estimates need scaling by an additional factor. The new values for W_n and σ̂²_n are

W_n(new) ← W_n(old)·(1/√G_n) ,   σ̂²_n(new) ← σ̂²_n(old)·(1/G_n) ,   (4.143)
and similarly for tone index m. Nominally, d_n(new) scales by the energy increase (decrease) multiplied by about √(1/2) (or by about √2). However, the exact value depends on the constellation details. A list of the exact factors of increase (decrease) is stored by the receiver for each possible increment or decrement to the constellation. If different subchannels use different constellation types, then a set of lists will need to be maintained. The successive entries in the list are the ratios of distances of successively larger (smaller) constellations when the energy is held constant.

Systems using the Chow-type flat-energy assumption will simply assume that energy remains the same and squared distance changes by a factor of 2, perhaps eliminating the bookkeeping on how much energy is transmitted on each tone (unless the number of tones changes).
The accuracy of channel estimation is important and discussed in Section 4.5.
4.3.7   Gain  Swapping
In addition to any energy changes associated with a bit-swap, gain may be changed in the absence of a bit-swap, which is known as gain swapping. While MA bit-swapping will continue to move bits from channels of higher energy cost to those of lower energy cost, there is still a barrier to moving a bit, which is the difference between the cost to add a bit in the lowest-cost position of the incremental-energy table relative to the savings of deleting that same bit in the highest-cost position in that same table. In the worst case, this cost can be equal to the cost of a unit of information. On a two-dimensional subchannel with $\bar{b} = 1$, this cost could be 3 dB of energy for a particular subchannel. This means that a probability of error of $10^{-6}$ on a particular subchannel could degrade by as much as roughly 3 orders of magnitude before the swap occurred. A particularly problematic situation would be a loss of, for instance, 2.9 dB that never grew larger, so that the swap never occurred. With a number of subchannels greater than 1000, the effect would be negligible, but as $N$ decreases, the performance of a single subchannel can more dramatically affect the overall probability of error.
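A rough numerical check of this degradation (a sketch assuming an uncoded subchannel with $P_e \approx Q(d/2\sigma)$, where a 3 dB energy loss scales the $Q$-function argument by $1/\sqrt{2}$; the argument value below is chosen to land near $10^{-6}$ and is not from the text):

```python
import math

def Q(x):
    # Gaussian tail function Q(x) = 0.5 * erfc(x / sqrt(2))
    return 0.5 * math.erfc(x / math.sqrt(2))

x = 4.753                    # Q-function argument giving roughly 1e-6
pe_nominal = Q(x)
pe_3dB_loss = Q(x / math.sqrt(2))   # 3 dB energy loss halves the squared argument

print(pe_nominal, pe_3dB_loss, math.log10(pe_3dB_loss / pe_nominal))
```

The ratio comes out near a few hundred, i.e. between 2 and 3 orders of magnitude, consistent with the "roughly 3 orders of magnitude" statement above.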
There are two solutions to this potential problem. The first is to reduce $\Gamma$, but this may be difficult. The second is to alter the transmit energy, removing transmit energy from one subchannel and placing it on another subchannel. Furthermore, the presence of a dramatically increased or decreased noise (for instance, a noise caused by another system turning on or off) might lead to a completely different distribution of energy when loading. Thus, the energy transmitted on a subchannel (or equivalently the relative factor applied to such a subchannel) may need to change significantly. Gain swapping (sending of the factor of increase/decrease of energy on a subchannel) is thus commonly used. Usually by convention, a value of $b_n = 0$ on a subchannel means either no energy is to be transmitted, or a very small value instead of the nominal level. If a gain-swap occurs, then the two columns of the incremental-energy table need to be scaled by the respective factors, and indeed a bit-swap may subsequently occur because of this change.
4.4   Channel Partitioning

Channel partitioning methods divide a transmission channel into a set of parallel, and ideally independent, subchannels to which the loading algorithms of Section 4.3 can be applied. Multi-tone transmission (MT) is an example of channel partitioning where each subchannel has a frequency index and all subchannels are independent, as in Section 4.1 and Figure 4.2. The MT basis functions $\varphi_n(t)$ form an orthonormal basis while also exhibiting the desirable property that the set of channel-output functions $h(t) * \varphi_n(t)$ remains orthogonal for any $h(t)$. Reference to Chapter 8 finds that the MT basis functions, when combined with continuous-time water-filling and $N \to \infty$, are optimum. For a finite $N$, the MT basis functions are not quite optimum, basically because the channel shaping induced by $h(t)$ in each MT frequency band is not quite flat in general. Furthermore, the MT basis functions exist over an infinite time interval, whereas practical implementation suggests restriction of attention to a finite observation interval. Thus, MT makes a useful example for the introduction of multi-channel modulation, but is not practical for channel partitioning.
This section introduces optimum basis functions for modulation based on a finite time interval $T$ of channel inputs and outputs. Subsections 4.4.1 and 4.4.2 show that the optimum basis functions become channel-dependent and are the eigenfunctions of a certain channel covariance function. There are two ways to accommodate intersymbol interference caused by a channel impulse response. Theoretically, Subsection 4.4.2 investigates a conceptual method for computing best possibilities. Subsection 4.4.4 instead investigates the elimination of the interference between subchannels with the often-encountered use of guard periods.
4.4.1   Eigenfunction Transmission

A one-shot multi-channel modulated signal $x(t)$ satisfies
$$ x(t) = \sum_{n=1}^{N} x_n \, \varphi_n(t) \ , \qquad (4.144) $$
while a succession of such transmissions satisfies
$$ x(t) = \sum_{k} \sum_{n=1}^{N} x_{n,k} \, \varphi_n(t - kT) \ . \qquad (4.145) $$
The convolution of a channel response $h(t)$ with $x(t)$ produces
$$ h(t) * x(t) = \sum_{k} \sum_{n=1}^{N} x_{n,k} \, \varphi_n(t - kT) * h(t) = \sum_{k} \sum_{n=1}^{N} x_{n,k} \, p_n(t - kT) \ , \qquad (4.146) $$
which is characterized by $N$ pulse responses $p_n(t)$, $n = 1, ..., N$. Following a development in Section 3.1, the set of sampled outputs of the $N$ matched filters $p_n^*(-t)$ yields the normalized cross-correlation functions
$$ q_{n,m}(t) = \frac{p_m(t) * p_n^*(-t)}{\|p_m\| \cdot \|p_n\|} \qquad (4.147) $$
that characterize the $N \times N$ square matrix $Q(t)$. The quantity $q_{n,m}(t)$ is the matched-filtered channel from subchannel $m$ input (column index) to subchannel $n$ output (row index). The system is now a multiple-input/multiple-output transmission system, sometimes called a MIMO system, with sampled-time representation
$$ Q_k = Q(kT) \ . \qquad (4.148) $$
Generalization of the Nyquist Criterion

Theorem 4.4.1 (Generalized Nyquist Criterion (GNC)) A generalization of the Nyquist Criterion of Chapter 3 for no ISI nor intra-symbol interference (now both are called ISI together) is
$$ Q_k = I \cdot \delta_k \ , \qquad (4.149) $$
that is, the sampled MIMO channel matrix is the identity matrix.

proof: Since $Q_k = I \cdot \delta_k = 0$ if $k \neq 0$, there is no interference from other symbols, and since $q_{n,m}(t) = \delta_{n,m} = 0$ if $n \neq m$, there is also no interference between subchannels. The unit value follows from the normalization of each basis function, $q_{n,n}(0) = 1$. QED

The earlier Nyquist Theorem of Chapter 3 is a special case that occurs when $Q_k = q_k = \delta_k$. Thus, to avoid ISI, the designer could choose basis functions so that (4.149) holds. In general, there are many choices for such functions. For a sub-optimum set of basis functions that do not satisfy the Generalized Nyquist Criterion, vector equalization may be used as discussed in Section 5.7. This section and chapter deal only with those that do satisfy the GNC or the GNC's discrete-time equivalents.
4.4.2   Choice of transmit basis

The channel impulse response forms a channel autocorrelation function according to
$$ r(t) = h(t) * h^*(-t) \ , \qquad (4.150) $$
which is presumed to be nonzero only over a finite interval $(-T_H/2, T_H/2)$. This channel autocorrelation function defines a possibly countably infinite-size set of orthonormal eigenfunctions, $\varphi_n(t)$, nonzero only over the time interval $[-(T - T_H)/2, (T - T_H)/2)$, that satisfy the relation^{14}
$$ \lambda_n \cdot \varphi_n(t) = \int_{-T/2}^{T/2} r(t - \tau) \, \varphi_n(\tau) \, d\tau \ , \quad n = 1, ..., \infty \ , \quad t \in [-(T + T_H)/2, (T + T_H)/2) \ , \qquad (4.151) $$
that is, a function $\varphi_n(t)$ that, when convolved with the matched-filtered channel over the interval $[-T/2, T/2)$, reproduces itself, scaled by a constant $\lambda_n$, called the eigenvalue. The set of eigenvalues is unique to the channel autocorrelation $r(t)$, while the eigenfunctions are not necessarily unique.^{15} Each eigenfunction represents a mode of the channel through which data may be transmitted independently of the other modes with gain $\lambda_n$. The restriction of the time interval to be equal to the symbol period allows intersymbol interference to be zero without further consideration. The restriction is not absolutely necessary, but then requires the additional constraint that successive translations of the basis functions at the symbol rate lead to a $Q_k$ that satisfies the generalized Nyquist criterion. This problem is very difficult to solve when $r(kT) \neq \delta_k$. A number of mathematicians have addressed the problem when $r(kT) = \delta_k$, and the theory is known as wavelets, filter banks, polyphase filters, or reconstruction filters. However, the theory is difficult to apply when there is ISI. Subsection 4.4.5 briefly discusses this concept. Section 5.7 solves the problem with inter-block-symbol overlap directly with multidimensional block equalizers.
Modal Modulation (MM) in Figure 4.13 uses the channel-dependent orthonormal set of eigenfunctions (the $N$ with largest eigenvalues) as a basis for modulation,
$$ x(t) = \sum_{n=1}^{N} x_n \, \varphi_n(t) \ . \qquad (4.152) $$
14 Most developments of eigenfunctions define the functions over the causal time interval $[0, T - T_H)$ - we instead noncausally center this interval about the origin at $[-(T - T_H)/2, (T - T_H)/2)$ in this theoretically abstract case so that taking the limit as $T \to \infty$ produces a true two-sided convolution integral, which will then have eigenvalues as the Fourier transform.
15 The eigenfunctions are generally difficult to compute (unless the symbol period is infinite) and are used here for theoretical purposes. An approximation method for their computation is listed at the end of this subsection.
Figure 4.13:  Modal Modulation:  optimum channel partitioning.
The receiver signal $y(t)$ in Figure 4.13 is then
$$ y(t) = \sum_{n=1}^{N} x_n \cdot \left[ r(t) * \varphi_n(t) \right] + \tilde{n}(t) \ , \qquad (4.153) $$
where $\tilde{n}(t)$ is the matched-filtered noise with autocorrelation function $\frac{\mathcal{N}_0}{2} \cdot r(t)$. There is no information loss in this signal because it is the output of the matched filters for each of the transmit bases. Through Equation (4.151), this signal $y(t)$ is rewritten
$$ y(t) = \sum_{n=1}^{N} (\lambda_n x_n) \cdot \varphi_n(t) \ , \qquad (4.154) $$
which is a sum of weighted channel input samples, $\lambda_n x_n$, times the orthonormal basis functions. This summation has the same form as the modulated waveforms studied in Chapter 1. Each symbol element $x_n$ is replaced by $\lambda_n x_n$.
The MAP/ML receiver is thus also the same as in Chapter 1 and recovers $\lambda_n x_n$ by processing each filter output by a matched-filter/sampler for each of the basis functions, as shown in Figure 4.14. The noise samples at the output of these matched filters are independent for different $n$ because the basis functions are orthogonal, or equivalently,
$$ E[\tilde{n}_i \tilde{n}_j] = \frac{\mathcal{N}_0}{2} \int_{-T/2}^{T/2} \int_{-T/2}^{T/2} r(t - s) \, \varphi_i(t) \, \varphi_j(s) \, dt \, ds \qquad (4.155) $$
$$ = \frac{\mathcal{N}_0}{2} \, \lambda_j \int_{-T/2}^{T/2} \varphi_i(t) \, \varphi_j(t) \, dt \qquad (4.156) $$
$$ = \lambda_i \, \frac{\mathcal{N}_0}{2} \, \delta_{ij} \ . \qquad (4.157) $$
Thus, the joint probability density function $p_{y/x}$ factors into $N$ independent component distributions, and an ML/MAP detector is equivalent to an independent detector on each sample output. Thus, the MM transmission system generates a set of parallel channels to which the results of Sections 4.2 and 4.3 can then be applied. The SNRs are $\mathrm{SNR}_n = (\lambda_n^2 \mathcal{E}_n)/(\lambda_n \frac{\mathcal{N}_0}{2}) = \lambda_n \mathcal{E}_n / \frac{\mathcal{N}_0}{2}$.

The functions $p_n(t)$ defined by modal modulation clearly satisfy the generalized Nyquist criterion because of the independence of the subchannels created and because they are zero outside the interval $[-T/2, T/2)$, thus averting intersymbol interference.
Theorem 4.4.2 (Optimality of Modal Modulation) Given a linear additive Gaussian noise channel with ISI, modal modulation produces the largest $\mathrm{SNR}_{m,u}$ of any $N$-dimensional set of orthogonal basis functions that exist only on the interval $t \in [-(T - T_H)/2, (T - T_H)/2)$.

Proof: The MM transmission system of Figure 4.14 can be optimally detected by observing each of the independent subchannel outputs individually. Its performance thus can not be improved. Any set of functions nonzero over the interval $[-(T - T_H)/2, (T - T_H)/2)$ that satisfies the generalized Nyquist criterion are necessarily the eigenfunctions. The set of eigenvalues is unique, and MM has selected eigenfunctions corresponding to the $N$ largest. A well-known property of the eigenvalues is that no other set of $N$ has a larger product. Then,
$$ \prod_{n=1}^{N} g_n \qquad (4.158) $$
is a maximum. For any given set of energies $\mathcal{E}_n$, the product
$$ \prod_{n=1}^{N} \left( 1 + \frac{\mathcal{E}_n \cdot g_n}{\Gamma} \right) \qquad (4.159) $$
Figure 4.14:  Parallel Subchannels of Modal Modulation.
is also a maximum because the function $1 + \frac{\mathcal{E}_n g_n}{\Gamma}$ is linear (convex) in $g_n$. Then since this function is maximized for any set of energies, it is certainly the maximum for the water-filling set of energies that are known to maximize for any set of $g_n$. Thus, the equivalent-channel SNR is the highest for any gap, and modal modulation achieves the highest possible SNR of any partitioning scheme on the interval $[-(T - T_H)/2, (T - T_H)/2)$. As $T \to \infty$ for fixed $T_H$, the length of the autocorrelation function for any practical and thus finite-length channel becomes negligible, and then modal modulation is also optimum over the entire time axis without restriction on the time interval to be shorter than the symbol period. QED.
The above theorem essentially states that MM is optimum given observation of a finite time interval $[-(T - T_H)/2, (T - T_H)/2)$. It is then natural to expect the following lemma, which states that MM converges to MT modulation in performance as both allow the number of dimensions $N$ to go to infinity while also allowing the time interval to span $(-\infty, \infty)$:

Theorem 4.4.3 (Convergence of MT to Optimum) Multitone Modulation converges to Modal Modulation as both $N \to \infty$ and $[-T/2, T/2) \to (-\infty, \infty)$.
Proof: The set of eigenvalues of any autocorrelation function is unique. The set of eigenvalues essentially determines the performance of MM through the $\mathrm{SNR}_{m,u}$ for that channel. Simple substitution of $\varphi_n(t) = e^{\jmath \frac{2\pi n}{T} t}$ into the defining equation of eigenvalues shows that these exponential functions are a valid choice of eigenfunctions as the symbol period becomes infinite, $[-T/2, T/2) \to (-\infty, \infty)$, and the corresponding eigenvalues are the values of $R(\omega)$ at the frequencies of the exponentials. Thus, the eigenvalues are the values of $R(\omega)$ at each frequency. As $N \to \infty$, these frequencies become infinitely dense, and equal to those of the MT system, which then has no ISI on any tone. The corresponding MT receiver is optimum (ML/MAP). Thus, both systems have the same $\mathrm{SNR}_{m,u}$ and both have optimum receivers, and so by Theorem 4.4.2 both are optimum. The multitone system is thus optimum for infinite-length symbol period and an infinite number of tones. QED.
The implication of the above lemma is that for sufficiently large $N$ and $T$, the designer can use either the MM method or the MT method and rest assured the design is as good as can be achieved on the linear ISI channel with Gaussian noise.
4.4.3   Limiting Results

The spectrum used by the basis functions of modal modulation thus converges also to the set of used tones in the infinite-dimensional MT system. The interval may be a set of continuous frequencies or several such sets and is denoted by $\Omega^*$. The measure of $\Omega^*$ is the total bandwidth used
$$ |\Omega^*| = \int_{\Omega^*} d\omega \ . \qquad (4.160) $$
An optimum bandwidth is then approached in the limit of large $N$, with
$$ 2W = |\Omega^*| \ . \qquad (4.161) $$
The data rate is
$$ \bar{b} = \lim_{T \to \infty} \frac{b}{T} \ . \qquad (4.162) $$
Continuous frequency can then replace the frequency index $n$ according to
$$ \omega = 2\pi \cdot \lim_{N \to \infty} \frac{n}{NT} \ , \qquad n = -N/2, ..., N/2 \ , \qquad (4.163) $$
and the width of a tone becomes
$$ d\omega = \lim_{N \to \infty} \frac{2\pi}{NT} \ . \qquad (4.164) $$
Figure 4.15: Capacity frequency interval for $1 + .9D^{-1}$ channel.
If $1/T_t$ is sufficiently large to be at least twice the highest frequency that could be conceived of for use on any given band-limited channel, then
$$ \mathcal{E}_n \to S_x(\omega_n) \ . \qquad (4.165) $$
Note that for infinite time and frequency as used here, there is no need for complex baseband equivalents and thus all dimensions are considered real. The power then becomes
$$ P_x = \lim_{N \to \infty} \frac{1}{N \cdot T} \sum_{n=-N/2}^{N/2} \mathcal{E}_n = \frac{1}{2\pi} \int_{\Omega^*} S_x(\omega) \, d\omega \ . \qquad (4.166) $$
The water-filling equation then has the continuous equivalent
$$ \mathcal{E}_n + \frac{\Gamma}{g_n} \to S_x(\omega) + \frac{\Gamma}{g(\omega)} = \text{a constant} \ , \qquad (4.167) $$
subject to the total power constraint in (4.166), which has the corresponding solution $S_x(\omega) > 0$ for all $\omega \in \Omega^*$ and $S_x(\omega) = 0$ at all other frequencies. The data rate then becomes
$$ \bar{R} = \frac{1}{2\pi} \int_{\Omega^*} \frac{1}{2} \log_2 \left( 1 + \frac{S_x(\omega) \cdot g(\omega)}{\Gamma} \right) d\omega \qquad (4.168) $$
$$ = \frac{1}{2\pi} \int_{\Omega^*} \frac{1}{2} \log_2 \left( \frac{\lambda \cdot g(\omega)}{\Gamma} \right) d\omega \ , \qquad (4.169) $$
where $\lambda$ denotes the constant water level of (4.167).
EXAMPLE 4.4.1 ($1 + .9D$ channel) The channel with impulse response $h(t) = \mathrm{sinc}(t) + .9\,\mathrm{sinc}(t - 1)$ has the same performance as the $1 + .9D^{-1}$ channel studied throughout this book, if the transmit filter is $\frac{1}{\sqrt{T}}\,\mathrm{sinc}(t/T)$. A system with optimized MT basis functions of infinite length would have an optimum bandwidth $W$ determined by
$$ P_x = \frac{1}{2\pi} \int_{\Omega^*} \left( \lambda - \frac{.181}{1.81 + 1.8\cos(\omega)} \right) d\omega \ , \qquad (4.170) $$
where $W$ is implicitly in radians/second for this example. If $P_x = 1$ with $\frac{\mathcal{N}_0}{2} = .181$, the integral in (4.170) simplifies to
$$ \pi = \int_0^W \left( \lambda - \frac{.181}{1.81 + 1.8\cos(\omega)} \right) d\omega \qquad (4.171) $$
$$ = \lambda_t \cdot W - .181 \left[ \frac{2}{\sqrt{1.81^2 - 1.8^2}} \arctan \left( \frac{\sqrt{1.81^2 - 1.8^2}}{1.81 + 1.8} \tan \left( \frac{W}{2} \right) \right) \right] \ . \qquad (4.172) $$
At the bandedge $W$,
$$ \lambda_t = \frac{.181}{1.81 + 1.8\cos(W)} \ , \qquad (4.173) $$
leaving the following transcendental equation to solve by trial and error:
$$ \pi = \frac{.181\,W}{1.81 + 1.8\cos(W)} - 1.9053 \arctan\left( .0526 \tan(W/2) \right) \ . \qquad (4.174) $$
$W = .88\pi$ approximately solves (4.174), and the corresponding value of $\lambda$ is $\lambda = 1.33$.
The highest data rate with $1/T = 1$ is then
$$ \bar{C} = \frac{2}{2\pi} \int_0^{.88\pi} \frac{1}{2} \log_2 \left( 1.33\,\frac{1.81 + 1.8\cos\omega}{.181} \right) d\omega \qquad (4.175) $$
$$ = \frac{1}{2\pi} \int_0^{.88\pi} \log_2 7.35 \, d\omega + \frac{1}{2\pi} \int_0^{.88\pi} \log_2 \left( 1.81 + 1.8\cos\omega \right) d\omega \qquad (4.176) $$
$$ = 1.266 + .284 \qquad (4.177) $$
$$ \approx 1.55 \ \text{bits/second} \ . \qquad (4.178) $$
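The numbers in this example can be reproduced numerically; the sketch below (assuming $\Gamma = 1$, i.e. 0 dB, and working on the positive-frequency band $[0, W]$ with everything in radians) solves (4.174) by bisection and then integrates the data-rate expression (4.169):

```python
import numpy as np

# Channel SNR function g(w) = |H(w)|^2 / (N0/2), with |H(w)|^2 = 1.81 + 1.8 cos(w).
N0_half = 0.181
def g(w):
    return (1.81 + 1.8 * np.cos(w)) / N0_half

# Bandedge equation (4.174): pi = .181 W/(1.81+1.8 cos W) - 1.9053 arctan(.0526 tan(W/2))
def f(W):
    return (0.181 * W / (1.81 + 1.8 * np.cos(W))
            - 1.9053 * np.arctan(0.0526 * np.tan(W / 2)) - np.pi)

lo, hi = 2.0, 3.0                      # f changes sign exactly once in this interval
for _ in range(60):                    # bisection
    mid = 0.5 * (lo + hi)
    if f(lo) * f(mid) <= 0:
        hi = mid
    else:
        lo = mid
W = 0.5 * (lo + hi)
lam = 1.0 / g(W)                       # water level, from (4.173)

# Data rate: R = (1/2pi) * integral_0^W log2(lam * g(w)) dw, by the trapezoidal rule.
w = np.linspace(0.0, W, 200001)
v = np.log2(lam * g(w))
R = (v.sum() - 0.5 * (v[0] + v[-1])) * (W / 200000) / (2 * np.pi)

print(W / np.pi, lam, R)
```

The printed values land near $.88$, $1.33$, and $1.55$, matching (4.174)-(4.178).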
This exceeds the 1 bit/$T$ transmitted on this channel in the examples of Chapter 3. Later sections of this chapter show that the $\mathrm{SNR}_{m,u}$ for this channel becomes 8.8 dB for large $N$, .4 dB better than the best MMSE-DFE, and actually 1.7 dB better than the precoded MMSE-DFE. The MT system has no error propagation.

In computing data rates as in water-filling with $\Gamma > 0$ dB, the designer needs to remember that the gap concept is only accurate when $\bar{b} \geq 1$. Often water-filling may include regions of bandwidth for which $\bar{b}$ in at least those bands is less than 1. The gap approximation becomes increasingly accurate as the gap gets smaller, and indeed is accurate at all $\bar{b}$ when the gap is 0 dB, meaning capacity results are all exact.
Periodic Channels

A channel with periodic $r(t) = r(t + T)$ will always have eigenfunctions $\varphi_n(t) = e^{\jmath \frac{2\pi}{T} n t}$ for the interval $[-T/2, T/2)$. Periodic channels do not exist in practice, but designers often use extra bandwidth in the design of a transmission problem to make the finite-duration channel appear as if it were periodic. In this case, MT and MM would again be the same.
Computing the Eigenfunctions

The eigenfunctions can be difficult to compute in closed form for most channels. At a sampling rate $1/T_t$ sufficiently high to capture any significant channel frequencies, the eigenfunction equation can be sampled over the time interval $-(T - T_H)/2 = -LT_t$ to $(T - T_H)/2 = LT_t$ to yield
$$ \lambda_n \cdot \begin{bmatrix} \varphi_n(-LT_t) \\ \varphi_n(-LT_t + T_t) \\ \vdots \\ \varphi_n(LT_t) \end{bmatrix} = T_t \begin{bmatrix} r(-2LT_t) & \dots & r(0) \\ r(-2LT_t + T_t) & \dots & r(T_t) \\ \vdots & \ddots & \vdots \\ r(0) & \dots & r(2LT_t) \end{bmatrix} \begin{bmatrix} \varphi_n(-LT_t) \\ \varphi_n(-LT_t + T_t) \\ \vdots \\ \varphi_n(LT_t) \end{bmatrix} \qquad (4.179) $$
$$ \lambda_n \cdot \varphi_n = R \, \varphi_n \ . \qquad (4.180) $$
This equation is equivalent to finding the eigenvectors and eigenvalues of the matrix $R$. The largest $N$ eigenvalue magnitudes $|\lambda_n|$ select the corresponding eigenvectors as the sampled basis functions. These basis functions may be interpolated to create eigenfunction approximations. Clearly, the smaller the $T_t$, the more accurate the approximation.
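A minimal sketch of this approximation (using an illustrative sampled autocorrelation for a two-tap channel $h = [1, .9]$, not the text's continuous-time development):

```python
import numpy as np

# Sampled channel and its autocorrelation r(k) = sum_m h(m) h(m+k)
h = np.array([1.0, 0.9])
r_full = np.convolve(h, h[::-1])        # [0.9, 1.81, 0.9] for lags -1, 0, +1

def r(k):
    return r_full[k + len(h) - 1] if abs(k) < len(h) else 0.0

# Build the (2L+1) x (2L+1) matrix of (4.179): R[j, k] = Tt * r(j - k)
L, Tt = 8, 1.0
idx = np.arange(-L, L + 1)
R = Tt * np.array([[r(j - k) for k in idx] for j in idx])

# Eigen-decomposition; the eigenvectors of largest |eigenvalue| approximate
# the sampled eigenfunctions phi_n.
lam, V = np.linalg.eigh(R)              # R is real symmetric
order = np.argsort(-np.abs(lam))
lam, V = lam[order], V[:, order]
print(lam[:4].round(3))                 # top approximate eigenvalues
```

For this tridiagonal Toeplitz $R$, the eigenvalues are known in closed form as $1.81 + 1.8\cos(k\pi/(2L+2))$, which provides a convenient cross-check on the numerics.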
In implementation, Section 4.5 illustrates a yet better way to approximate eigenfunctions in discrete
time, which is known as vector coding.
Figure 4.16:  Illustration of guard period.
Figure 4.17:  Filters with multitone.
4.4.4   Time-Domain Packets and Partitioning

Overlap, Excess Bandwidth, and the Guard Period

A realistic transmission channel has most or all of its nonzero impulse response confined to a finite time interval $T_H$. Thus, convolution of the (necessarily causal) channel impulse response with nonzero input basis functions existing over $[0, T)$ produces an output with duration longer than the symbol period $T$,^{16} as in Figure 4.16. The first $T_H$ seconds of any symbol are thus possibly corrupted by intersymbol interference from the last symbol if that previous symbol had nonzero transmit energy over $(T - T_H, T)$. Thus, the receiver then usually ignores the first $T_H$ seconds of any symbol. If $T_H << T$, then this overhead penalty, or excess bandwidth penalty, is small and no equalization need be used. For a fixed channel and thus fixed $T_H$, as $T \to \infty$, this excess bandwidth becomes negligible. This time period where the receiver ignores channel outputs is often called a guard period in multichannel modulation. For finite symbol period, the multichannel system then has an excess bandwidth given by the factor $\alpha = T_H/(T - T_H)$.
4.4.5   Filters with Multitone

Multitone partitioning almost satisfies the Generalized Nyquist criterion with any impulse response $h(t)$, but requires an infinite-length basis function $\varphi(t) = \frac{1}{\sqrt{T}}\,\mathrm{sinc}\left(\frac{t}{T}\right)$ and is not physically realizable even when $N$ is finite. The orthogonality of subchannels is ensured because the Fourier transforms of the basis-function translates by $f = \frac{n}{T}$ do not overlap and thus are trivially orthogonal at the linear time-invariant channel output. ISI is nearly averted by the narrow bands that are nearly flat even at the channel output. Thus, many designers have investigated the possibility of approximating the multitone functions with
16 This section will now begin to pursue realizable transmission systems and not theoretical abstractions, thus the basis functions will necessarily have to be causal in the remainder of this section.
physically realizable basis functions, as generally depicted in Figure 4.17. There is no guard band in the time domain, but instead an excess bandwidth in the frequency domain.

The basic idea is to avoid the use of a $T_H$-second guard period in the time domain by instead having very sharp basis functions that approximate $\varphi(t) = \frac{1}{\sqrt{T}}\,\mathrm{sinc}\left(\frac{t}{T}\right)$. One method might be to simply truncate this basis function to, say, $\pm 100$ symbol periods, thus introducing a delay of approximately $101T$ in realization (and introducing a high complexity), and of course necessarily another $101T$ in the receiver. Contrast this with a delay of just $T$ with the use of the guard period, and one sees the method is not attractive. However, shorter-length approximations may suffice. One method is to introduce again the raised cosine functions of Chapter 3 with excess bandwidth of 10 to 20%. Sometimes this level of excess bandwidth can be less for certain types of channels (or noise-equivalent channels) that have sharp band-edges, notches, or narrow-band noises, all of which result in long impulse responses and thus large $T_H$ in the guard band of Section 4.4.4. Other functions may also achieve the same type of low side bands for the Fourier transform of the basis function. Of course, any such realization requires in reality some type of over-sampled digital signal processing. Subsection 4.6.8 studies discrete realizations of a few methods that have been proposed to date.
Figure 4.18:  Digital realization of channel partitioning.
4.5   Discrete-time channel partitioning

While MM restricts attention to a finite time interval for construction of the transmit basis functions, these functions remain continuous time and a function of the channel. Such functions would be difficult to implement exactly in practice. Discrete-time channel partitioning partitions a discrete-time description of the channel. This description is a linear matrix relationship between a finite number of output samples (presumably sampled at a rate $1/T_t$ exceeding twice the highest frequency to be transmitted) and a corresponding finite set of channel input samples at the same rate that constitute the transmitted symbol. Such a description can never be exact in practice, but with a sufficient number of input and output samples included, this description can be very close to the exact action of the ISI channel. Figure 4.18 illustrates general $N$-dimensional discrete-time channel partitioning. Discrete-time basis vectors, $m_n$, replace the continuous basis functions of MM or MT. These basis vectors have a finite length of $N + \nu$ samples over a duration of time $T$. The sampled pulse response, $p(kT_t)$, of the channel is assumed to span a time interval of less than $\nu + 1$ sample periods, where $(\nu + 1)T_t = T_H$. Each basis vector, $m_n$, multiplies a sub-symbol element $X_n$ before being summed to form the transmit symbol vector $x$.^{17} Then, $x$ is output through a digital-to-analog converter. Some transmit filtering of the output DAC occurs (typically analog lowpass anti-aliasing) before the signals pass through the ISI channel. The combined post-DAC filtering, channel impulse response, and pre-ADC filtering in the receiver form a pulse response $p(t)$. The sampling rate $1/T_t = (N + \nu)/T$ is higher than twice the highest frequency that the designer hopes to use for the transmitted signals. The modulator attempts to use basis vectors, $m_n$, that will remain orthogonal after undergoing the dispersive effect of the ISI channel, just as orthogonality was maintained with MM.

The receiver low-pass filters with gain $1/\sqrt{T_t}$ (that is also included in the pulse response, to maintain the noise per dimension at $\frac{\mathcal{N}_0}{2}$), and then samples at rate $1/T_t$ the received modulated signal. The matched filters are discrete-time and also finite length. The matched filters are only complex if the entire set of modulated signals had been passband modulated by an external passband modulator that is not shown in Figure 4.18, and would in this case correspond to a complex baseband-equivalent channel. There are $N + \nu$ input samples that lead to $N$ matched-filter output samples, with $\nu$ samples lost to the guard period to avoid intersymbol interference. The samples can be real or complex, which in the latter case tacitly will imply a doubling of the number of real dimensions.
17 $X_n$ (upper case) is used to avoid confusion with $x_k$, a transmitted sample that is the direct input to the discrete-time channel.
4.5.1   Optimal Partitioning Vectors - Vector Coding

The last $N$ ADC channel-output symbol samples in Figure 4.18 have vector representation, with time indexed within a symbol from $k = -\nu$ to $N - 1$,
$$ \begin{bmatrix} y_{N-1} \\ y_{N-2} \\ \vdots \\ y_0 \end{bmatrix} = \begin{bmatrix} p_0 & p_1 & \cdots & p_\nu & 0 & \cdots & 0 \\ 0 & p_0 & p_1 & \cdots & p_\nu & \ddots & 0 \\ \vdots & \ddots & \ddots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & p_0 & p_1 & \cdots & p_\nu \end{bmatrix} \begin{bmatrix} x_{N-1} \\ \vdots \\ x_0 \\ x_{-1} \\ \vdots \\ x_{-\nu} \end{bmatrix} + \begin{bmatrix} n_{N-1} \\ n_{N-2} \\ \vdots \\ n_0 \end{bmatrix} \ , \qquad (4.181) $$
or more compactly,
$$ y = Px + n \ . \qquad (4.182) $$
The $N \times (N + \nu)$ matrix $P$ has a singular value decomposition
$$ P = F \left[ \Lambda \ \vdots \ 0_{N,\nu} \right] M^* \ , \qquad (4.183) $$
where $F$ is an $N \times N$ unitary matrix ($FF^* = F^*F = I$), $M$ is an $(N+\nu) \times (N+\nu)$ unitary matrix ($MM^* = M^*M = I$), and $0_{N,\nu}$ is an $N \times \nu$ matrix of zeros. $\Lambda$ is an $N \times N$ diagonal matrix with singular values $\lambda_n$, $n = 1, ..., N$, along the diagonal. The vector $x$ is a data symbol vector. The notational use of $P$ for the channel matrix suggests that any anti-alias analog filters at transmitter and receiver have been convolved with the channel impulse response $h(t)$ and included in the discrete-time response of the matrix channel.
Vector Coding creates a set of $N$ parallel independent channels by using the transmit basis vectors, $m_n$, that are the first $N$ columns of $M$ - in other words, the transmit vector $x$ is obtained from the $N$ vector components $X_n$, $n = 1, ..., N$, according to
$$ x = M \begin{bmatrix} X_{N-1} \\ \vdots \\ X_0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \left[ m_{N-1} \ \ m_{N-2} \ \cdots \ m_1 \ \ m_0 \ \cdots \ m_{-\nu} \right] \begin{bmatrix} X_{N-1} \\ X_{N-2} \\ \vdots \\ X_0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \sum_{n=0}^{N-1} X_n \, m_n \ . \qquad (4.184) $$
4.5.2   Creating the set of Parallel Channels for Vector Coding

The last $\nu$ columns of $M$ do not affect the channel output because they are multiplied by zero in (4.184); these $\nu$ columns therefore cannot contribute to the channel output and can be ignored in implementation. The corresponding vector-coding receiver uses discrete matched filters as the rows of $F^*$, forming
$$ Y = F^* y = \begin{bmatrix} f_{N-1}^* \, y \\ \vdots \\ f_0^* \, y \end{bmatrix} \ . \qquad (4.185) $$
The mathematical description of the resultant $N$ parallel channels is
$$ Y = \Lambda X + N \ , \qquad (4.186) $$
or, for each independent channel or entry in $Y$,
$$ Y_n = \lambda_n \cdot X_n + N_n \ . \qquad (4.187) $$

Since $F$ is unitary, the noise vector $N$ is also additive white Gaussian with variance per real dimension identical to that of $n$, or $\frac{\mathcal{N}_0}{2}$.
Theorem 4.5.1 (Optimality of Vector Coding) Vector Coding has the maximum $\mathrm{SNR}_{m,u}$ for any discrete channel partitioning.

Proof: The proof of this theorem is deferred to Chapter 5 (Lemma 5.4.1), where some not-yet-introduced concepts will make it trivial.

The optimality of VC is only strictly true when capacity-achieving codes with Gaussian distributions are used on each of the subchannels. The restriction to independent symbols, requiring the receiver to ignore the first $\nu$ samples of any symbol, also restricts the optimality. If this restriction were relaxed or omitted, then it is possible that a better design exists. For $N >> \nu$, such a potentially better method would offer only small improvement, so only the case of $\nu > .1N$ would be of interest in further investigation of this restriction (see Section 5.7).
Comment on correlated noise:

In the discrete-time transmission system of Figure 4.18, the noise is unlikely to be white. Section 1.7 previously indicated how to convert any channel into an equivalent white-noise channel, which involved receiver preprocessing by a canonical noise-whitening filter, with the equivalent channel frequency response becoming the ratio of the channel transfer function to the square root of the power spectral density of the noise. Since this filter may be difficult to realize in practice, the discrete-time processing also described in Section 1.7, which factors the noise covariance as
$$ E[nn^*] = R_{nn} \cdot \sigma^2 = R_{nn}^{1/2} \cdot R_{nn}^{*/2} \cdot \sigma^2 \ , \qquad (4.188) $$
can be used instead. Then a discrete white-noise equivalent channel becomes
$$ y \to R_{nn}^{-1/2} \, y = \left( R_{nn}^{-1/2} P \right) x + \tilde{n} \ . \qquad (4.189) $$
The extra matrix multiply is absorbed into the receiver processing as $F^* y \to F^* R_{nn}^{-1/2} y$. The preceding development then continues with $P$ replaced by the matrix $R_{nn}^{-1/2} P$, the latter of which then undergoes SVD. The matrix $P$ will no longer be Toeplitz, but will be very close to Toeplitz if the duration of the noise-whitening filter is much less than the symbol period. In any case, vector coding does not require $P$ to be Toeplitz.
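A sketch of this factorization and whitening step (the noise covariance `Rnn` below is an assumed AR(1)-style example, not from the text; the Cholesky factor plays the role of $R_{nn}^{1/2}$):

```python
import numpy as np

N, nu = 8, 1
# Banded Toeplitz channel matrix P for p = [.9, 1] (as in Example 4.5.1):
P = np.zeros((N, N + nu))
for i in range(N):
    P[i, i], P[i, i + 1] = 0.9, 1.0

# Illustrative correlated-noise covariance (AR(1)-type correlation 0.5^|i-j|):
Rnn = 0.5 ** np.abs(np.subtract.outer(np.arange(N), np.arange(N)))

# Factor Rnn = Lc Lc^T; Lc plays the role of Rnn^{1/2}.
Lc = np.linalg.cholesky(Rnn)
W = np.linalg.inv(Lc)                   # whitening matrix Rnn^{-1/2}

# Whitened noise covariance is the identity:
print(np.allclose(W @ Rnn @ W.T, np.eye(N)))

# The development then proceeds with P replaced by W @ P before the SVD:
Pw = W @ P
F, s, Mh = np.linalg.svd(Pw)
print(s.round(2))                       # singular values of the whitened channel
```

As the text notes, `Pw` is no longer Toeplitz, but the vector-coding machinery applies to it unchanged.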
4.5.3   An SNR for vector coding

The number of bits/symbol for vector coding is achieved on the channel using a coding method with constant gap $\Gamma$ (regardless of $b_n$) on all subchannels. This $b$ is
$$ b = (1/k) \sum_{n=1}^{N} \log_2 \left( 1 + \frac{\mathrm{SNR}_n}{\Gamma} \right) \ , \qquad (4.190) $$
where $k = 2$ for real subchannels and $k = 1$ for complex subchannels. When $\Gamma = 1$, the data rate on each subchannel is as high as possible with the given energy $\mathcal{E}_n$ and is sometimes called the mutual information. (See Chapter 5 for more on mutual information.) If the water-fill energy distribution is also used, then $b$ is the capacity $b = c$, or the highest possible number of bits per channel that can be reliably transmitted on the discrete-time channel. The best energy distribution for vector coding and $\Gamma > 1$ is a water-fill with $\mathrm{SNR}_n \to \mathrm{SNR}_n/\Gamma$, as in Section 4.3:
$$ \mathcal{E}_n + \Gamma \cdot \frac{\mathcal{N}_0/2}{\lambda_n^2} = K \qquad (4.191) $$
$$ \sum_{n=1}^{N} \mathcal{E}_n = (N + \nu) \, \bar{\mathcal{E}}_x \ . \qquad (4.192) $$
340
The SNR for vector coding is then

SNR_vc = Γ · [ Π_{n=1}^{N} ( 1 + SNR_n/Γ ) ]^{1/(N+ν)} − Γ .   (4.193)

Note that (N+ν)·Ē_x is now the total energy. The exponent depends on N+ν because vector coding uses N+ν channel input samples¹⁸. When Γ = 1, for complex passband systems,

b̄ = b / (2(N+ν)) = .5 · log₂( 1 + SNR_vc ) ,   (4.194)

and

b̄ = b / (N+ν) = .5 · log₂( 1 + SNR_vc )   (4.195)

for real (PAM) baseband systems. Thus, the geometric SNR of vector coding can be compared against the detection SNR of an AWGN, a MMSE-LE, or a MMSE-DFE, because SNR_vc has the same relation to achievable data rate for the same class of modulation codes of constant gap Γ. Further, we have the interesting results

SNR_vc = 2^{2·Ī} − 1 ,   (4.196)

and

SNR_vc = 2^{2·c̄} − 1   (4.197)

when water-filling is used with Γ = 1. Thus the SNR is maximized when the data rate is maximized at capacity using water-filling.
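The identities (4.190)-(4.195) can be checked numerically. The sketch below uses illustrative subchannel SNRs (not values from the text) on all-real subchannels and verifies that b̄ = .5 log₂(1 + SNR_vc/Γ) for an arbitrary gap:

```python
import math

def snr_vc(snrs, N_plus_nu, gap=1.0):
    """Geometric SNR of vector coding, eq. (4.193):
    SNR_vc = gap * [prod(1 + SNR_n/gap)]^(1/(N+nu)) - gap."""
    prod = 1.0
    for s in snrs:
        prod *= 1 + s / gap
    return gap * prod ** (1 / N_plus_nu) - gap

# Illustrative (made-up) SNRs on N = 4 real subchannels, nu = 1.
snrs = [20.0, 10.0, 5.0, 1.0]
N_plus_nu = 5
gap = 10 ** (3.0 / 10)            # a 3 dB gap, for example

# bits/symbol with 1-D (real) subchannels: eq. (4.190) with k = 2
b = sum(0.5 * math.log2(1 + s / gap) for s in snrs)
b_bar = b / N_plus_nu

# eq. (4.195) generalized to gap > 1: b_bar = .5 log2(1 + SNR_vc/gap)
check = 0.5 * math.log2(1 + snr_vc(snrs, N_plus_nu, gap) / gap)
print(abs(b_bar - check) < 1e-9)  # True
```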
EXAMPLE 4.5.1 (Vector Coding for the 1 + .9D⁻¹ Channel) For this channel, a guard period of one sample (ν = 1) is sufficient to prevent overlap between symbol transmissions at the channel output. We choose N = 8 to be consistent with previous invocations of this channel. The total energy is then E = (8+1) · 1 = 9. This energy may be distributed over a maximum of 8 subchannels. The channel-description matrix is

P = [ .9   1   0   0   0   0   0   0   0
       0  .9   1   0   0   0   0   0   0
       0   0  .9   1   0   0   0   0   0
       ⋮               ⋱   ⋱           ⋮
       0   0   0   0   0   0   0  .9   1 ] .   (4.198)

Singular value decomposition (implemented by the svd command in Matlab) produces the following 8 nonzero singular values

[1.87 1.78 1.64 1.45 1.22 .95 .66 .34] ,   (4.199)

and corresponding subchannel gain-to-noise ratios

g_n = λ_n² / σ² = [19.3 17.6 15.0 11.7 8.3 5.0 2.4 .66] .   (4.200)

Execution of water-filling with Γ = 0 dB on the subchannels finds that only 7 can be used and that

K = (1/7) · ( 9 + Σ_{n=0}^{6} 1/g_n ) = 1.43 .   (4.201)

¹⁸The number of samples is N+ν dimensions for real channels, but 2N+2ν dimensions for complex channels. In the complex case, each SNR factor would be squared, and then (4.193) holds in either the complex or real case without further modification.
The corresponding subchannel energies are

[1.38 1.37 1.36 1.34 1.30 1.23 1.01 0] ,   (4.202)

and the SNRs are

[26.6 24.2 20.4 15.8 10.8 6.2 2.4 0] ,   (4.203)

resulting in an SNR of

SNR_vc = [ Π_{n=0}^{6} ( SNR_n + 1 ) ]^{1/9} − 1 = 6.46 = 8.1 dB .   (4.204)

The capacity for N = 8 and ν = 1 is then

c̄ = (1/9) · Σ_{n=0}^{6} .5 · log₂( 1 + SNR_n ) = 1.45 bits/dim .   (4.205)

This is close to the true capacity of 1.55 bits/dimension, which would be determined by executing this example as N → ∞. Note that this set of subchannels is exactly independent, and the SNR and capacity are exact - no approximations of no-ISI on a subchannel were made. The SNR corresponding to the true capacity is then 8.8 dB and exceeds the previous best equalized SNR on this channel, achieved by the MMSE-DFE. And, there is no error propagation. The overhead of ν = 1 becomes zero as N increases.
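Example 4.5.1 can be reproduced numerically. The sketch below (numpy; σ² = .181, consistent with the g_n values quoted in the example) builds the 8×9 channel matrix, water-fills 9 units of energy over the 7 surviving subchannels, and recovers SNR_vc ≈ 8.1 dB:

```python
import numpy as np

N, nu = 8, 1
sigma2 = 0.181                      # noise variance consistent with g_n above

# 8 x 9 channel matrix for 1 + .9 D^-1 (rows [.9 1] shifting right), eq. (4.198)
P = np.zeros((N, N + nu))
for i in range(N):
    P[i, i] = 0.9
    P[i, i + 1] = 1.0

s = np.linalg.svd(P, compute_uv=False)   # [1.87 1.78 ... .34], eq. (4.199)
g = s**2 / sigma2                        # gain-to-noise ratios, eq. (4.200)

# Water-fill 9 units of total energy; the text finds 7 usable subchannels.
used = 7
K = (9 + np.sum(1 / g[:used])) / used    # eq. (4.201): K = 1.43
E = K - 1 / g[:used]                     # eq. (4.202)
snr = E * g[:used]                       # eq. (4.203)

SNR_vc = np.prod(1 + snr) ** (1 / (N + nu)) - 1   # eq. (4.204)
print(round(10 * np.log10(SNR_vc), 1))            # 8.1 (dB)
```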
4.6   Discrete  Multitone  (DMT)  and  OFDM  Partitioning
Discrete MultiTone (DMT) and Orthogonal Frequency Division Multiplexing (OFDM) are two very common forms of vector coding that add a restriction to reduce the complexity of implementation. Both DMT and OFDM have the same channel partitioning - they differ in the loading algorithm: OFDM puts equal bits on all subchannels, rather than optimizing b_n and E_n, as in DMT. OFDM is used on broadcast (i.e., one-way) channels for which the receiver cannot tell the transmitter the bits and energies that are best. DMT is used on slowly time-varying two-way channels, like telephone lines. OFDM is used in wireless time-varying channels with codes that allow recovery of lost subchannels caused by time-varying notches in multipath intersymbol interference.

The DMT/OFDM channel partitioning forces the transmit symbol to have x_{-k} = x_{N-k} for k = 1, ..., ν. Such repeating of the last ν samples at the beginning of the symbol is called a cyclic prefix. Tremendous processing simplification occurs with the cyclic prefix.

With the cyclic prefix, the matrix description of the channel has a square N×N circulant matrix P̃:
⎡ y_{N-1} ⎤   ⎡ p_0  p_1  ⋯  p_ν   0   ⋯   0    ⎤ ⎡ x_{N-1} ⎤   ⎡ n_{N-1} ⎤
⎢ y_{N-2} ⎥   ⎢  0   p_0  p_1  ⋯  p_ν  ⋱   0    ⎥ ⎢ x_{N-2} ⎥   ⎢ n_{N-2} ⎥
⎢    ⋮    ⎥ = ⎢  ⋮    ⋱   ⋱   ⋱   ⋱   ⋱   ⋮    ⎥ ⎢    ⋮    ⎥ + ⎢    ⋮    ⎥   (4.206)
⎢         ⎥   ⎢  0   ⋯   0   p_0  p_1  ⋯  p_ν   ⎥ ⎢         ⎥   ⎢         ⎥
⎢         ⎥   ⎢ p_ν   0   ⋯   0   p_0  ⋯  p_{ν-1}⎥ ⎢         ⎥   ⎢         ⎥
⎣   y_0   ⎦   ⎣ p_1   ⋯  p_ν   0   ⋯   0   p_0  ⎦ ⎣   x_0   ⎦   ⎣   n_0   ⎦

or, in compact form,

y = P̃ x + n .   (4.207)
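The cyclic-prefix construction behind (4.206) can be verified directly: prepending the last ν samples of x and passing the result through the FIR channel makes the N retained outputs equal P̃x. A sketch (numpy, with the 1 + .9D⁻¹ taps; noise omitted, and samples ordered x_0, ..., x_{N-1} rather than the reversed ordering of (4.206), which does not affect the check):

```python
import numpy as np

N, nu = 8, 1
p = np.array([0.9, 1.0])            # channel taps p_0 ... p_nu

rng = np.random.default_rng(0)
x = rng.standard_normal(N)          # one symbol, ordered x_0 ... x_{N-1}

# Transmit with cyclic prefix: repeat the last nu samples at the front.
x_cp = np.concatenate([x[-nu:], x])
y_full = np.convolve(x_cp, p)       # linear convolution with the channel
y = y_full[nu : nu + N]             # the N samples the receiver keeps

# Circulant matrix whose action is circular convolution by p, as in (4.207)
c = np.zeros(N)
c[: nu + 1] = p
P_tilde = np.array([[c[(i - j) % N] for j in range(N)] for i in range(N)])

print(np.allclose(y, P_tilde @ x))  # True
```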
Singular value decomposition produces unique singular values if those values are restricted to be nonnegative real values (even when P̃ is complex). For transmission, this restriction is superfluous, and a variant of SVD is simpler to compute. For the case of the cyclic prefix, the SVD may be replaced by the eigendecomposition or spectral factorization of the circulant matrix P̃,¹⁹

P̃ = M · Λ · M*   (4.208)

where M·M* = M*·M = I and Λ is the diagonal matrix of eigenvalues. For DMT, M = Q*, the IDFT matrix defined below, and the vector of DFT values of the transmit symbol is

X = [ X_{N-1}  X_{N-2}  ⋯  X_0 ]^T ,   (4.209)

¹⁹A caution to the reader using Matlab for eigenvalue/vector computation - the eigenvalues are not usually ordered as they are in singular-value decomposition.
where

X_n = (1/√N) · Σ_{k=0}^{N-1} x_k · e^{-j(2π/N)kn}   ∀ n ∈ [0, N-1] ,   (4.210)

and

x = [ x_{N-1}  x_{N-2}  ⋯  x_0 ]^T .   (4.211)

The corresponding Inverse Discrete Fourier Transform (IDFT) is computed by

x_k = (1/√N) · Σ_{n=0}^{N-1} X_n · e^{+j(2π/N)kn}   ∀ k ∈ [0, N-1] .   (4.212)
In matrix form, the DFT and IDFT are

X  =  Q · x   (4.213)
x  =  Q* · X ,   (4.214)

where Q is the orthonormal matrix given by

Q = (1/√N) ·
    ⎡ e^{-j(2π/N)(N-1)(N-1)}  ⋯  e^{-j(2π/N)2(N-1)}  e^{-j(2π/N)(N-1)}  1 ⎤
    ⎢ e^{-j(2π/N)(N-1)(N-2)}  ⋯  e^{-j(2π/N)2(N-2)}  e^{-j(2π/N)(N-2)}  1 ⎥
    ⎢          ⋮                       ⋮                     ⋮          ⋮ ⎥   (4.215)
    ⎢ e^{-j(2π/N)(N-1)}       ⋯  e^{-j(2π/N)2}       e^{-j(2π/N)}       1 ⎥
    ⎣          1              ⋯         1                    1          1 ⎦ .
Both x and X can be presumed to be one period of a periodic sequence to generate DFT/IDFT values for indices k and/or n outside the interval [0, N-1]. The matrix Q* has columns

Q* = [ q_{N-1} , ..., q_0 ] ,   (4.216)

which are useful in the following theorem's proof.
The IDFT input values X_n and the DFT input values x_k can be complex. For real time-domain signals, the restriction X_{N-n} = X_n* for n = 1, ..., N-1 holds. This restriction implies that there are N̄ = N/2 complex dimensions when N is even²⁰. The IDFT generally corresponds to N input samples, whether complex or real, while this textbook has previously used N as the number of real dimensions and N̄ as the number of tones. Thus, there is a slight inconsistency in notation. This inconsistency is resolved simply by considering N in the IDFT and DFT definitions to be the number of tones used if complex, and twice the number of tones with the conjugate symmetry on the IDFT input enforced (for real time-domain signals). The FFT size should then be equal to the number of real dimensions only in the latter case of real baseband signals. When complex baseband signals are used, the number of real dimensions is twice the FFT size. With this understanding, Matlab can be used to easily generate the Q matrix here exactly according to the commands

J=hankel([  zeros(1,N-1)  1])
Q=(1/sqrt(N))*J*fft(J)
²⁰Actually, there are N̄-1 complex dimensions and two real dimensions corresponding to n = 0 and n = N̄. The extra two real dimensions are often grouped together and considered to be one complex dimension, with possibly unequal variances in the 2 dimensions. Usually, practical systems use neither of the real dimensions, leaving N̄-1 subchannels.
Lemma 4.6.1 (DFT is M for cyclically prefixed VC) The circulant matrix P̃ = MΛM* has eigenvector matrix M = Q* (the IDFT matrix) and eigenvalues equal to the DFT values P_n = Σ_{k=0}^{N-1} p_k e^{-j(2π/N)kn} of the channel pulse-response samples.

Proof: Each eigenvector q_n of P̃ satisfies

λ_n · q_n = P̃ · q_n ,   (4.217)

which has matrix equivalent Q*·Λ = P̃·Q*. Take q_n as the nth column of Q*. The bottom entry of P̃ q_n is

(1/√N) · ( p_0 + p_1 e^{-j(2π/N)n} + ⋯ + p_{N-1} e^{-j(2π/N)(N-1)n} ) = P_n · (1/√N) .   (4.218)

The next row up is then

(1/√N) · ( p_0 e^{+j(2π/N)n} + p_1 + ⋯ + p_{N-1} e^{-j(2π/N)(N-2)n} ) = P_n · (1/√N) · e^{+j(2π/N)n} .   (4.219)

Clearly, repeating this procedure, one finds

P̃ q_n = P_n · q_n ,   (4.220)

making q_n an eigenvector of P̃. Thus, a choice of the eigenvectors is simply the columns of the IDFT matrix Q*. For this choice, the eigenvalues must exactly be the DFT values of the rows of P̃. QED.
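Lemma 4.6.1 is easy to check numerically: the eigenvalues of a circulant matrix are the (unnormalized) DFT of its defining taps, with IDFT columns as eigenvectors. A sketch in numpy (random taps, N = 8):

```python
import numpy as np

N = 8
rng = np.random.default_rng(1)
p = rng.standard_normal(N)          # taps p_0 ... p_{N-1} (a general circulant)

# Circulant P_tilde acting as circular convolution by p
P_tilde = np.array([[p[(i - j) % N] for j in range(N)] for i in range(N)])

# Unnormalized DFT of the taps: P_n = sum_k p_k e^{-j 2 pi k n / N}
P_n = np.fft.fft(p)

for n in range(N):
    # q_n: n-th IDFT vector, entries (1/sqrt(N)) e^{+j 2 pi k n / N}
    q_n = np.exp(2j * np.pi * np.arange(N) * n / N) / np.sqrt(N)
    # Eigen-relation (4.220): P_tilde q_n = P_n q_n
    assert np.allclose(P_tilde @ q_n, P_n[n] * q_n)

print("all eigen-relations hold")
```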
4.6.2   DMT
Discrete Multi-tone (DMT) transmission uses the cyclic prefix with M = Q*, so that the transmitter IDFT and the receiver DFT replace the general matrix multiplies of vector coding. For the 1 + .9D⁻¹ example channel with N = 8, the circulant channel matrix is

P̃ = [ .9   1   0   0   0   0   0   0
       0  .9   1   0   0   0   0   0
       0   0  .9   1   0   0   0   0
       ⋮           ⋱   ⋱          ⋮
       1   0   0   0   0   0   0  .9 ] .   (4.221)
Table 4.6.1 summarizes the results of waterfilling with 8 units of energy (not 9, because 1 unit is wasted in the cyclic prefix on average). Only 7 of the 8 subchannels were used - note that 6 of the subchannels are effectively two-dimensional QAM subchannels.

The SNR is

SNR_DMT = [ Π_{n=0}^{6} ( 1 + SNR_n ) ]^{1/9} − 1 = 7.6 dB < 9.2 dB from Example 4.1.1.   (4.222)
The DMT SNR is less than that found earlier for MT, but is exact and not approximated. It is also slightly worse than the SNR of vector coding for the same number of dimensions. However, DMT can increase the number of dimensions significantly before the complexity of FFTs exceeds that of the orthogonal matrix multiplies. One finds that for N = 16, this SNR increases to its maximum value of 8.8 dB.

Let us now compare the complexity of this DMT system, with its exact SNR correctly computed, against that of the DFE studied earlier and also against the VC system. To get 7.6 dB, the MMSE-DFE required 3 feedforward taps and 1 feedback tap for a complexity of 4 multiply-accumulates per sample. VC requires 7·8/9 multiply-accumulates in the receiver to get the slightly higher SNR of 8.1 dB, or a complexity of 6.2/sample. DMT requires a size-8 FFT, which requires nearly exactly 8·log₂(8) multiply-accumulates when the real-output constraint is exploited. DMT then gets 7.6 dB also, but with 2.7 multiplies/sample. A factor of 2-3 improvement in complexity for a given level of performance with respect to the DFE and VC is common. For the size-16 DMT with SNR = 8.8 dB, the complexity is still only 3.8/sample, while for instance a DFE requires 10 feedforward taps and 1 feedback tap to get only 8.4 dB, thus requiring 11/sample.
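The per-sample multiply counts quoted above follow from simple arithmetic (a sketch using the text's operation-count conventions, not a detailed implementation):

```python
import math

def dmt_mults_per_sample(N, nu):
    # one size-N FFT of N*log2(N) multiply-accumulates, spread over N+nu samples
    return N * math.log2(N) / (N + nu)

# N = 8, nu = 1: 24 multiply-accumulates over 9 samples
print(round(dmt_mults_per_sample(8, 1), 1))    # 2.7

# N = 16, nu = 1: 64 multiply-accumulates over 17 samples
print(round(dmt_mults_per_sample(16, 1), 1))   # 3.8

# Vector coding: 7 used subchannels x 8-point inner products over 9 samples
print(round(7 * 8 / 9, 1))                     # 6.2
```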
Matlab  programs
The following matlab functions have been written by teaching assistants Atul Salvekar, Wonjong Rhee,
and  Seong-Taek Chung, and  are  useful  for computing DMT  performance  and  parameters  quickly.   See
class web site for complete programs.
---------  rate-adaptive  water-fill  DMT  ------------------------------------
function  [gn,en_bar,bn_bar,Nstar,b_bar]=waterfill(P,SNRmfb,Ex_bar,Ntot,gap)
%   Written  by  Atul  Salvekar,  Edited  by  Wonjong  Rhee
%
%   function  [gn,en_bar,bn_bar,Nstar,b_bar]=waterfill(P,SNRmfb,Ex_bar,N,gap)
%
%   P  is   the  pulse  response
%   SNRmfb  is   the  SNRmfb  in   dB  [Ebar  ||p||^2  /   (N0/2)]
%   Ex_bar  is   the  normalized  energy
%   Ntot  is  the  total  number  of   real/complex  subchannels,  Ntot>2
%   gap  is  the  gap  in   dB
%
%   gn  is  channel  gain
%   en_bar  is   the  energy/dim  in   the  nth  subchannel
%   bn_bar  is   the  bit/dim  in  the   nth  subchannel
%   Nstar  is   the  number  of   subchannel  used
%   b_bar  is   the  bit  rate
-------------------------------------------------------------------------------
---------  margin-adaptive  water-fill  DMT   ------------------------------------
function  [gn,en_bar,bn_bar,Nstar,b_bar_check,margin]=MAwaterfill(P,SNRmfb,Ex_bar,b_bar,Ntot,gap)
%   Written  by  Seong  Taek  Chung
%
%   P  is   the  pulse  response
%   SNRmfb  is   the  SNRmfb  in   dB
%   b_bar  is   the  normalized  bit  rate
%   Ex_bar  is   the  normalized  energy
%   Ntot  is  the  total  number  of   real/complex  subchannels,  Ntot>2
%   gap  is  the  gap  in   dB
%
%   gn  is  channel  gain
%   en_bar  is   the  energy/dim  in   the  nth  subchannel
%   bn_bar  is   the  bit/dim  in  the   nth  subchannel
%   Nstar  is   the  number  of   subchannel  used
-------------------------------------------------------------------------------------------------
---------  rate-adaptive  Levin-Campello  DMT  ------------------------------------
function  [gn,En,bn,b_bar]=LC(P,SNRmfb,Ex_bar,Ntot,gap)
%
%   Taek  Chung
%
%   Levin  Campellos  Method
%
%   P  is   the  pulse  response
%   SNRmfb  is   the  SNRmfb  in   dB
%   Ex_bar  is   the  normalized  energy
%   Ntot  is  the  total  number  of   real/complex  subchannels,  Ntot>2
%   gap  is  the  gap  in   dB
%
%   gn  is  channel  gain
%   En  is  the   energy  in  the   nth  subchannel  (PAM  or   QAM)
%   bn  is  the   bit  in  the  nth  subchannel  (PAM  or  QAM)
%   Nstar  is   the  number  of   subchannel  used
%   b_bar  is   the  bit  rate
%
%   The  first  bin  and  the  last  bin  is   PAM,  the  rest  of  them  are  QAM.
---------------------------------------------------------------------------------
---------  margin-adaptive  Levin-Campello  DMT  ------------------------------------
function  [gn,En,bn,b_bar_check,margin]=MALC(P,SNRmfb,Ex_bar,b_bar,Ntot,gap)
%
%   Taek  Chung
%
%   Levin  Campellos  Method
%
%   P  is   the  pulse  response
%   SNRmfb  is   the  SNRmfb  in   dB
%   Ex_bar  is   the  normalized  energy
%   Ntot  is  the  total  number  of   real/complex  subchannels,  Ntot>2
%   gap  is  the  gap  in   dB
%   b_bar  is   the  bit  rate
%   gn  is  channel  gain
%   En  is  the   energy  in  the   nth  subchannel  (PAM  or   QAM)
%   bn  is  the   bit  in  the  nth  subchannel  (PAM  or  QAM)
%   b_bar_check  is  the   bit  rate  for  checking  -   this  should  be   equal  to   b_bar
%   margin  is   the  margin
%   The  first  bin  and  the  last  bin  is   PAM,  the  rest  of  them  are  QAM.
---------------------------------------------------------------------------------
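A minimal Python analogue of the rate-adaptive water-fill routine above (a sketch in the spirit of the Matlab signature; linear units rather than dB, and only the core loop):

```python
import math

def waterfill(gn, Ex_total, gap=1.0):
    """Rate-adaptive water-filling over subchannel gain-to-noise ratios gn.

    Returns (energies, bits, K): per-subchannel energies summing to Ex_total,
    per-subchannel bits .5*log2(1 + E*g/gap), and the water level K.
    A sketch in linear units (the course routines take dB arguments).
    """
    order = sorted(range(len(gn)), key=lambda i: gn[i], reverse=True)
    used = len(order)
    while used > 0:
        idx = order[:used]
        K = (Ex_total + sum(gap / gn[i] for i in idx)) / used
        if K - gap / gn[idx[-1]] > 0:   # weakest used subchannel still positive
            break
        used -= 1                       # discard it and re-solve
    E = [0.0] * len(gn)
    for i in order[:used]:
        E[i] = K - gap / gn[i]
    b = [0.5 * math.log2(1 + E[i] * gn[i] / gap) for i in range(len(gn))]
    return E, b, K

# Example 4.5.1's gains: 7 of 8 subchannels survive, K = 1.43
gn = [19.3, 17.6, 15.0, 11.7, 8.3, 5.0, 2.4, 0.66]
E, b, K = waterfill(gn, 9.0)
print(round(K, 2), sum(e > 0 for e in E))   # 1.43 7
```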
Figure 4.20:  ADSL use example.
EXAMPLE 4.6.2 (ADSL) By far the most heavily used broadband internet delivery method in the world is asymmetric digital subscriber lines (ADSL) in Figure 4.20, allowing nominally 1.5 Mbps (range is 500 kbps to 12 Mbps) to customers on telephone lines downstream, and roughly 1/3 to 1/10 that rate from the customer to the telephone company upstream (where it is sent to the internet service provider).

ADSL is standardized world-wide in ITU Standard G.992.1 and uses DMT. The symbol period is T = 250 µs. The downstream modulation uses 256 subchannels of width 4.3125 kHz and a sampling rate of 1/T_t = 2.208 MHz. The cyclic prefix is then necessarily 40 samples. Each time-domain symbol then contains 512+40 = 552 samples. Hermitian symmetry is used to create a real signal for transmission over the band from 0 to 1.104 MHz. Typically, the first 2-3 tones near DC, and DC itself, are not used, to prevent interference into voiceband telephony (POTS = plain old telephone service), which shares the same telephone line as ADSL. Upstream transmission instead uses 32 tones up to frequency 138 kHz. Tone 256 is also not used. Tone 64 (276 kHz) is reserved for a pilot signal (a known point in 4 QAM sent continuously) that is used to recover the symbol and sampling rates. The sampling rate upstream is 276 kHz and the cyclic prefix is 5 samples for a block size of 69 samples. Upstream is then exactly 1/8 of downstream. Downstream tones may or may not share the upstream band (when they do share, a device called an echo canceller is used to separate the two directions of transmission).
Usually, the upstream and downstream bands are placed for further adaptive refinement by loading, as shown in Figure 4.21. Some typical ADSL channels (taken from the American National Standards Institute) appear in Figure 4.22. The lines have significant variation with length, and also with impairments like bridged taps (open-circuited extension phones) that cause the ripple in the characteristics in the second graph in Figure 4.22. Also, noises are highly frequency-selective and could be nearby radio stations captured by the phone line, or other DSLs with varying bandwidths that are sufficiently close to crosstalk into the ADSL line. All this channel variation among possible lines for deployment of ADSL creates a huge
Figure 4.21:  ADSL spectrum use - further adaption by loading within each band.
Figure 4.22:  ADSL example lines.
need for adaptive loading, which will optimize the DMT performance for each situation.

A maximum power of 20.5 dBm is permitted in ADSL downstream and 14.5 dBm upstream. The maximum number of bits permitted to be loaded onto any single tone is b_n ≤ 15. While antagonists once said ADSL would fail as being too complex to implement, components that implement a complete modem (analog and digital) now sell for less than $10, and over 30 million customers will be broadband on-line with ADSL by the end of 2002. That number will grow to 100s of millions in the coming years. While the design may be conceptually complex, most customers simply call the phone company, procure a modem, and plug and play in less than 15 minutes - a testimonial to the engineer: a conceptually complex design is acceptable if it keeps customer use very simple (the opposite is almost never true) and the cost down.
Wireless transmission often uses OFDM because no optimization of b_n at the transmitter is possible if the channel varies too rapidly with time. Instead, over a wide bandwidth in wireless transmission, multipath ISI typically results in notches in the transmission band. Part of the signal (some tones) is lost. Thus, with coding methods not discussed until later chapters of this book, it is possible to recover the information of several lost tones as long as the code is sufficiently powerful, no matter where those lost tones are located (that is, no matter which tone indices n are affected). Also, in broadcast wireless transmission, there may be several different channels to each customer of a service, and no feedback channel, so the b_n cannot be optimized as a function of the identified channel (identification necessarily occurs at the receiver, which is the only place the channel can practically be known and estimated). The use of OFDM in wireless transmission with codes is often called COFDM. COFDM is the most powerful method for handling the time-varying channel, even though OFDM might have been greatly inferior to DMT if the channel could have been known at the transmitter. COFDM's greatest asset is simply the practical advantage that the transmitter is basically fixed (an IFFT operation with fixed bit loading and energy on all tones) - actually the gain and phase of each subchannel is estimated and used, but only at the receiver. Rapid determination of a full DFE-like equalizer is difficult and loses performance greatly in wireless transmission, leaving multicarrier a superior method coincidentally for wireless transmission as well as wireline. However, the key is the additional code, as nominally an un-loaded or uncoded OFDM signal will not be very effective. This heuristic explanation then leads to the next couple of examples of OFDM partitioning use.
EXAMPLE 4.6.3 (802.11a Wireless LAN) The IEEE has standardized a wireless local-area network known as IEEE 802.11(a). This method transmits up to 54 Mbps symmetrically between wireless users in relatively close proximity (nominally within a building or complex, within 100 meters or less), with one of 3 transmit power levels (assuming 6 dB antenna gain) of 16, 23, or 29 dBm. The system is complex baseband (unlike ADSL, which is real baseband) with N̄ = 64 (so up to 128 real dimensions). The carrier frequencies are in the 5 GHz range and are exactly (in MHz) 5180 + i·20 (16 dBm), 5260 + i·20 (23 dBm), or 5745 + i·20 (29 dBm), where i = 0, 1, 2, 3.

At baseband, the IEEE 802.11(a) system numbers the tones from -31 to +31 (the 64th carrier at the band edge is not used). The sampling frequency is exactly 20 MHz (complex, so two 20 MHz real converters conceptually) and the carrier spacing is then 20 MHz/64 = 312.5 kHz. The cyclic prefix has 16 complex samples, and thus the symbol consists of 64+16 = 80 samples. The symbol rate is then 250 kHz (or T = 4 µs, while T_t = 50 ns). The guard period (or maximum length of ISI for zero inter-tone distortion) is 0.8 µs.

Of the 63 possible carriers cited, the carrier in the middle of the band carries a fixed phase (not data) for carrier recovery. Also, tones -31 to -27 and 27 to 31 at the band edges are not used, to allow analog-filter separation of adjacent carrier bands of different 802.11(a) signals. Furthermore, carriers -21, -7, 7, and 21 are pilots for symbol-timing recovery. This leaves 48 tones for carrying data. The total bandwidth used is 15.5625 MHz. Data rates with integer numbers of bits per tone will thus be multiples of

R = k · (1 bit / 2 dimensions) · (48 tones) · 250 kHz, or 6k Mbps.   (4.223)
R (Mbps)   constellation   code rate   b_n    b̄_n    b
   6       BPSK            1/2         1/2    1/4     24
   9       BPSK            3/4         3/4    3/8     36
  12       4QAM            1/2         1      1/2     48
  18       4QAM            3/4         3/2    3/4     72
  24       16QAM           1/2         2      1       96
  36       16QAM           3/4         3      3/2    144
  48       64QAM           2/3         4      2      192
  54       64QAM           3/4         9/2    9/4    216

Table 4.4: Table of IEEE 802.11(a) data rates and associated parameters.
Table 4.4 summarizes the options for transmission. The code rate is the fraction of the transmitted bits that actually carry information; the remaining bits are redundant and are used for overhead as described in Chapters 10 and 11 (the extra bits help reduce the gap).

Table 4.4 also shows that while all (used) tones carry the same b_n = b/48, no tone carries the full number of messages possible in the QAM constellation, because of the coding used - without this coding, OFDM would not be a wise transmission method for wireless. The spectrum mask of the composite analog signal uses flat power spectral density (power divided by 16 MHz) over the used band, and must be 20 dB attenuated at frequencies further than ±11 MHz from the carrier, 28 dB attenuated at ±20 MHz from the carrier, and 40 dB attenuated at ±30 MHz from the carrier.
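The data rates in Table 4.4 follow from (coded bits per tone) × (code rate) × (48 tones) × (250 kHz symbol rate); the 48 Mbps mode uses the rate-2/3 code of the 802.11(a) standard. A quick check:

```python
# (coded bits per tone, code rate) for each IEEE 802.11(a) mode
modes = [(1, 1/2), (1, 3/4),    # BPSK
         (2, 1/2), (2, 3/4),    # 4QAM
         (4, 1/2), (4, 3/4),    # 16QAM
         (6, 2/3), (6, 3/4)]    # 64QAM

tones, symbol_rate = 48, 250e3
rates_mbps = [round(bits * r * tones * symbol_rate / 1e6) for bits, r in modes]
print(rates_mbps)   # [6, 9, 12, 18, 24, 36, 48, 54]
```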
Two new IEEE standards, IEEE 802.11(g) and IEEE 802.11(aw), are appearing shortly, where (g) allows carrier frequencies near 2.4 GHz but otherwise looks like (a), and (aw) allows use of multiple input and output antennas with (a) to create several spatial paths between users and allow between 108 Mbps and 600 Mbps (perhaps 1 Gbps), and possibly simultaneous use of multiple 20 MHz carrier-frequency slots by a single user/network. The potential multiple-antenna use was suggested by this instructor/author in a 1994 version of this class (then called EE479) and was then successfully pursued by several former students. Chapter 5 illustrates a general theory of how to generate the spatial or extra channels in various multiple-input/multiple-output channels.

There is very little adaptive spectrum use in any of the IEEE 802.11 standards, other than a selection of data rate. The future may see uplink power control used to at least set the power levels of signals to the base station, so that sidebands of a nearby signal do not unduly disrupt the main spectrum of a distant signal.
EXAMPLE 4.6.4 (Digital Video Broadcast) Digital terrestrial television broadcast has been very successful outside the USA (where analog still prevails outside of cable systems and satellite broadcast). This successful system is known generally as Digital Video Broadcast (DVB). A reason for the success elsewhere is the use of a COFDM system with something called SFN (single-frequency network), as illustrated in Figure 4.23. SFN basically uses multiple broadcast sites to deliver exactly the same signal to customers, so that buildings, structures, and hills blocking one path to a customer likely do not block another path. However, when more than one signal reaches the customer, likely with different delays, the effect is equivalent to intersymbol interference or multipath distortion. This delay can be quite large (10s of µs) if transmitters are, say, 10s of km apart. Over the 6-7.6 MHz wide bands of television, such delay would overwhelm even the best of decision-feedback equalizers, as there could be 100s of deep notches in the transmission band, leading to huge-length filters and numerical/training problems. However, for COFDM, codes are used to mitigate any loss of tones at frequencies disabled by such notches. The SFN basically solves one problem (blocking of signal), but forces the use of multicarrier or OFDM.
Figure 4.23:  Single Frequency Network example for television-broadcast channel partitioning.
The DVB system is also complex-baseband and uses conventional TV carrier frequencies (basically above 50 MHz, with 7.61 MHz of used bandwidth outside the US). The DVB system uses either N = 2048 or 8192 tones, of which either 1705 or 6817 are used. The number of pilots used for recovery of symbol rate is 45 or 177, leaving 1650 or 6650 tones available for carrying information. The exact parameters depend on the cyclic prefix length, which itself is programmed by the broadcaster in DVB to be one of 1/4, 1/8, 1/16, or 1/32 of the symbol length. The tone width is fixed at approximately 1.116 kHz (896 µs) for 8192 tones and 4.464 kHz (224 µs) for 2048 tones. The symbol period T thus varies with cyclic prefix and can be any of 1120 µs, 1008 µs, 952 µs, or 924 µs for 8192 tones, and any of 280 µs, 252 µs, 238 µs, or 231 µs for 2048 tones. The sampling rate is fixed at 1/T_t ≈ 9.142 MHz (T_t = 109.375 ns) for each of two real sampling devices.

Code rates for DTV appear in the DTV standard document from ETSI (see the course web site for all the details), and there is a concatenated coding system (see Chapter 11). Basically, used tones carry 4QAM, 16QAM, or 64QAM transmission, with a first code of rate 188/204 and a second code of rate 1/2, 2/3, 3/4, 5/6, or 7/8, producing data rates between approximately 4.98 Mbps and 31.67 Mbps (selected by the broadcaster, along with the number of TV channels carried, from 2 to 8). Note this system at least doubles the number of channels of analog TV, and often increases it by a factor of 8, enabling greater choice for the consumer and higher revenue for the broadcaster.
4.6.3   An  alternative  view  of  DMT
If the input sequence is periodic with period N samples, then the corresponding output sequence from an FIR channel is also periodic. The output of the channel may be computed by taking the IDFT of the sequence Y_n = P_n · X_n (where the P_n are the samples of the DFT of the channel pulse-response samples). When the cyclic prefix is used, however, the input appears periodic over the last N samples of any symbol, which are all that the receiver processes. DMT thus can also be viewed as a circular-convolution trick, although this view obscures the optimality and the connections to VC.
4.6.4   Noise-Equivalent  Channel  Interpretation  for  DMT
The equivalence of an additive (colored) Gaussian noise channel to an AWGN was established in Section 1.7 and used in Chapters 3 and 4. The colored Gaussian noise generally has power spectral density (N₀/2)·R_n(f). The channel pulse response of the equivalent AWGN is the pulse response of the original channel adjusted by the inverse square-root autocorrelation of the Gaussian noise:

P(f) ← P(f) / √(R_n(f)) .   (4.224)

The receiver uses a noise-whitening filter 1/√(R_n(f)) as a preprocessor to the matched-filter and sampling device. The noise-equivalent viewpoint construes this whitening filter as part of the channel, thus altering the channel pulse response according to (4.224).
For DMT/OFDM, the noise-equivalent view discussed earlier leads to a set of subchannels with signal gains Ē_x·|P(n/T)|²/R_n(n/T) and AWGN variance N₀/2, as long as R_nn is also circulant. In practice, R_n(n/T) is not circulant unless N → ∞; however, for reasonable noises, the autocorrelation matrix can be close to circulant for moderate values of N. A more common view, which avoids the use of the whitening filter, is that each of the subchannels has gain Ē_x·|P(n/T)|² and noise (N₀/2)·R_n(n/T). The set of SNRs remains the same in either case, but the need for the whitening filter is eliminated in the second viewpoint. The second viewpoint has the minor flaw that the noise variances per subchannel are not completely correct unless the noise n(t) is cyclostationary with period T, or N·T_t is large with respect to the duration of the noise autocorrelation function. In practice, the values used for N are almost always more than sufficiently large, so implementations corresponding to the second viewpoint are ubiquitously found. Furthermore, noise power is measured as in Section 4.6.
4.6.5   Toeplitz  Distribution
The rigorous mathematical justification for the convergence of vector coding to DMT derives from Toeplitz distribution arguments:

Theorem 4.6.1 (Toeplitz Distribution) Given a square Toeplitz matrix R with first row [r_0 r_1 ... r_{N-1}], which extends to N = ∞ by allowing both ends of the first row to grow as [... r_{-2} r_{-1} r_0 r_1 r_2 ...] while maintaining Toeplitz structure, the Fourier transform is defined by

R(f) = Σ_{k=-∞}^{∞} r_k · e^{-j2πfk} .   (4.225)

Any N-dimensional Toeplitz submatrix of R along its diagonal has eigenvalues λ_k^{(N)}. Then, for any continuous function g[·],

lim_{N→∞} (1/N) · Σ_{k=1}^{N} g[ λ_k^{(N)} ] = ∫_{-.5}^{.5} g[ R(f) ] df .   (4.226)

No proof is given.
Thus, for the square Toeplitz matrix P·R_xx·P*,

b̄ = (1/N) · Σ_{n=1}^{N} .5 · log₂( 1 + E_n·|λ_n|² / (N₀/2) ) → ∫_{-.5}^{.5} .5 · log₂( 1 + R_x(f)·|P(f)|² / (N₀/2) ) df .   (4.227)

This expression is the mutual information for the discrete-time channel with sampling rate 1/T_t, as derived in Chapter 8. This integral for data rate in bits per dimension maximizes to capacity, b̄ = c̄, when a water-fill distribution is used for R_x(f):

R_x(f) + (N₀/2) / |P(f)|² = constant .   (4.228)
Figure 4.24:  The FEQ - frequency-domain equalizer.
So, Toeplitz distribution also establishes that vector coding converges (in performance) to MT and to the optimum transmission system when water-filling is used. A caution to note is that the sampling rate 1/T_t needs to be selected sufficiently high for this discrete-time system to converge to the true optimum of capacity over all sampling rates. When equivalent noise channels are used,

|P(f)|² / (N₀/2) ← |P(f)|² / ( (N₀/2) · R_n(f) )   (4.229)

in the above integral.
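Theorem 4.6.1 can be checked numerically for a simple autocorrelation r_k = a^{|k|}: the eigenvalue average of a large Toeplitz section approaches the frequency-domain average of g[R(f)]. A sketch (numpy; N = 256 and a dense frequency grid stand in for the limits):

```python
import numpy as np

a, N = 0.5, 256
r = a ** np.abs(np.arange(N))                  # first row r_0 ... r_{N-1}

# Build the symmetric Toeplitz matrix and its eigenvalues
T = np.array([[r[abs(i - j)] for j in range(N)] for i in range(N)])
lam = np.linalg.eigvalsh(T)

# R(f) = sum_k a^{|k|} e^{-j 2 pi f k} = (1 - a^2) / (1 - 2 a cos(2 pi f) + a^2)
f = (np.arange(4096) + 0.5) / 4096 - 0.5
Rf = (1 - a**2) / (1 - 2 * a * np.cos(2 * np.pi * f) + a**2)

g = np.log                            # any continuous function g[.]
lhs = np.mean(g(lam))                 # (1/N) sum over eigenvalues, as in (4.226)
rhs = np.mean(g(Rf))                  # ~ integral of g[R(f)] over (-1/2, 1/2)
print(abs(lhs - rhs) < 0.05)          # True for this N
```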
4.6.6   Gain  Normalizer  Implementation  -  The  FEQ
For DMT and/or VC, the multichannel normalizer is often called an FEQ (Frequency Domain Equalizer),
which is shown in Figure 4.24.
A ZFEE  is applied to each complex DMT/OFDM subchannel  with target  X
n
  for each  subchannel.
From Section  3.7, with  the  matched  lter  absorbed  into the  equalizer  setting  W
n
, the  lter  is  a single
complex  multiply that  adjusts  the  gain  and  phase  of  each  subchannel   independently  to  minimize  the
mean-square error between  W
n
 Y
n
  and  X
n
.   The FEQ setting for the  n
th
subchannel of DMT is
W
n
 =
  1
P
n
.   (4.230)
The algorithm that converges to this setting as a function of time was provided earlier in Section 4.3 in
Equation (4.133).
Because it is a single tap and cannot change (unbiased) SNR, the FEQ simply scales the DFT outputs so that the DMT/OFDM receiver outputs estimate $X_n$, and on each subchannel
$$\mathrm{SNR}_{ZFE,n} = \mathrm{SNR}_n \;. \qquad (4.231)$$
With a common code class $\mathcal{C}$ and constant gap $\Gamma$, a common decoder can be reused to decode all subchannels.  The FEQ can be implemented adaptively, as described in Section 3.7, just as more complicated multiple-tap equalizers are implemented.
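The single-tap FEQ of (4.230) can be sketched in a few lines; the channel DFT values and QAM subsymbols below are hypothetical placeholders, and noise is omitted so that the scaling behavior is exact:

```python
import numpy as np

rng = np.random.default_rng(0)
Nbar = 8                                                  # number of complex subchannels (illustrative)
P = rng.normal(size=Nbar) + 1j * rng.normal(size=Nbar)    # assumed channel DFT values P_n
X = rng.choice([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j], Nbar)  # 4-QAM subsymbols X_n

Y = P * X          # noiseless subchannel outputs Y_n = P_n * X_n
W = 1.0 / P        # FEQ: one complex tap per tone, Eq. (4.230)
Xhat = W * Y       # gain- and phase-corrected outputs estimate X_n
```

With noise present, the same single tap scales signal and noise identically, which is why (4.231) holds.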
4.6.7   Isaksson's Zipper
Often DMT partitioning is used in a bi-directional system, meaning that both directions of transmission share a common channel.  The signals may be separated by echo cancellation (a method of synthesizing,
Figure 4.25:  Illustration of alignment problem in DMT with only cyclic prex.
Figure 4.26:  Use of cyclic sux to align DMT symbols at both ends of transmission.
from the known transmitted signal, a replica of any reflected transmit energy at the opposite-direction signal's co-located receiver and subtracting it).  The signals may also be separated by assignment of independent tone sets for each direction of transmission, known as frequency-division multiplexing.  Both echo cancellation and frequency-division multiplexing have enormous simplification of their signal processing if the DMT signals in the two directions have common sampling rates and symbol rates.  Alignment of the symbol boundary at one end is simple because the choice of the transmitted symbol boundary can be made to align exactly with the received DMT signal's symbol boundary.  However, it would appear that alignment at the other end is then impossible with any realistic channel having delay.
This basic problem is illustrated in Figure 4.25 for DMT systems with $N = 10$, $\nu = 2$, and $\Delta = 3$ (the delay of the channel).  LT and NT represent different ends of a transmission line, at each of which there is both a transmitter and a receiver.  The d's and u's appearing in Figure 4.25 refer to downstream (LT to NT) and upstream (NT to LT) transmission.  Alignment at the LT leads to misalignment at the NT.  Figure 4.26 shows a solution that adds a 6-sample cyclic suffix after the DMT signal, repeating the beginning of the symbol (that is, the beginning after the cyclic prefix).  There are now 6 valid positions at which a receiver FFT can process the received signal without interference among tones.  One of those 6 positions at the NT downstream is the same position as one of the 6 positions upstream, allowing alignment at the NT and LT simultaneously for the price of an additional 6 wasted samples.  The value 6 is in general $2\Delta$.  As $N \to \infty$, the cyclic-suffix overhead penalty can be made negligible.  Figure 4.27 shows a method to reduce the smallest cyclic-suffix length to $\Delta$ by advancing the LT downstream signal by $\Delta$ samples, allowing one of 3 valid downstream positions at the LT to correspond to one of
Figure 4.27:   Use  of both cyclic  sux and timing advance  to align DMT signals at both ends  of trans-
mission.
3 valid upstream positions, and correspondingly one overlap also at the NT.  This is the minimum cyclic-suffix length necessary for alignment at both ends simultaneously.  This method was introduced in 1997 by Michael Isaksson, then of Telia Research in Sweden, and is standardized for use in what is known as VDSL.
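The key property behind Zipper can be checked numerically: with a cyclic prefix of $\nu$ samples and a cyclic suffix of $L_{cs}$ samples, any FFT window starting inside the cyclic extension demodulates the symbol without interference among tones, differing only by a known per-tone phase ramp.  The sketch below uses small illustrative parameters (not the $N = 10$, $\nu = 2$, $\Delta = 3$ of the figure) and a random channel whose memory equals the prefix length:

```python
import numpy as np

rng = np.random.default_rng(1)
N, nu, Lcs = 16, 4, 6                       # FFT size, cyclic prefix, cyclic suffix (illustrative)
h = rng.normal(size=nu + 1)                 # channel with memory <= nu
X = rng.choice([-1.0, 1.0], N) + 0j         # tone values for one DMT symbol
x = np.fft.ifft(X)                          # time-domain symbol
t = np.concatenate([x[-nu:], x, x[:Lcs]])   # prefix + symbol + suffix
y = np.convolve(t, h)                       # channel output
H = np.fft.fft(h, N)

ok = []
for s in range(nu, nu + Lcs + 1):           # every FFT start inside the cyclic extension
    Yw = np.fft.fft(y[s:s + N])
    ramp = np.exp(2j * np.pi * np.arange(N) * (s - nu) / N)  # known per-tone phase shift
    ok.append(np.allclose(Yw, H * X * ramp))
```

All $L_{cs}+1$ candidate window positions recover $H_n X_n$ exactly (up to the phase ramp), which is what allows both ends of the line to pick mutually consistent positions.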
EXAMPLE 4.6.5 (VDSL)  VDSL is a recent offspring of ADSL that uses the cyclic suffix and timing advance of Zipper.  VDSL, like ADSL, is baseband real, except with up to 16 times wider bandwidth than ADSL, and it is intended for use on shorter phone lines, where speeds as high as 100 Mbps (more typically 10 Mbps) symmetric can be achieved with VDSL.  In VDSL, both upstream and downstream transmission may use up to 4092 tones, so $N = 8192$ with conjugate symmetry.  The symbol rate remains 4000 Hz and the tone spacing is 4.3125 kHz.  The cyclic prefix is now $16 \times 40 = 640$ samples long.  The sampling rate is 35.328 MHz.  The cyclic-prefix period is actually shared with a cyclic suffix of minimum length, determined on a per-line basis adaptively, and DMT symbols are aligned at both ends.  The upstream and downstream tones are currently allocated via frequency-division multiplexing (above 138 kHz; both directions can be used below that) with frequencies determined as best needed as a function of line, channel, and service-provider desires.  The Zipper duplexing leaves no need for concern with complicated analog filters to separate up and down transmissions; this is achieved entirely digitally and is sometimes also known as "digital duplexing."  Somewhat surprisingly in retrospect, the digital duplexing allows the simple power-minimizing-for-given-rate/margin LC loading of this chapter to be essentially optimum if run simultaneously on a set of crosstalking telephone lines to find a best trade in terms of spectrum among all the lines.  This is sometimes called "iterative water-filling" (see Chapter 12) and can lead to enormous
data rates on telephone lines.
4.6.8   Filtered  Multitone  (FMT)
Filtered Multi-Tone (FMT) approaches use excess bandwidth in the frequency domain (rather than or in
addition to excess  dimensions in a time-domain guard period) to reduce or eliminate intersymbol inter-
ference.   There have been many approaches to this FMT area under the names of wavelets (DWMT
,  W=wavelet), lter  banks  ,  orthogonal lters,  polyphase  by  various  authors.   This  subsection
pursues  a recent  approach, specically given the  name FMT, pioneered by Cherubini, Eleftheriou, and
Olcer of IBM. This latter FMT approach squarely addresses the issue of the channel impulse responses
(h(t)s)  usual destruction  of orthogonality that  was basically ignored in all the  other  approaches.   The
basic  idea in all approaches  though is  to nd  a discrete-time  realizable  implementation of a lter  that
basically approximates the GNC-satisfying brick-wall frequency-tone characteristic of the sinc(t) func-
tion of ideal multi-tone.   Essentially, realizable basis functions  with very  low frequency  content  outside
a  main  lobe  of  the  fourier  transform  (i.e.,  the  main  lobe  is  centered  at  the  carrier  frequency  of  the
tone  and  there  are  second,  third,  fourth,  etc.   lobes  that  decay  in  any realizable  approximation to the
ideal  band-pass  lter  characteristic  of  a  modualted  sinc  function).   Very  low  frequency  content  means
the  functions  after  passing through  any channel should  be  almost orthogonal at the  output.   This al-
most orthogonal unfortunately is the catch in most approaches, which ultimately fail on one channel or
another.   The FMT approach essentially provides a way of quantifying the amount of miss and working
to keep  it small on  any given channel.   There  will be  a  trade-o  similar to the  trade-o  of    versus  N
in terms of dimensionality lost in exchange for ensuring realizable and orthogonal basis functions, which
will be characterized  by essentially the excess bandwidth need in FMT.
The original modulated continuous-time waveform is again
$$x(t) = \sum_{j=-\infty}^{\infty} \sum_{n=0}^{N-1} X_{n,j}\, \varphi_n(t - jT) \;, \qquad (4.232)$$
where $j$ is a symbol index.  This waveform in discrete construction is sampled at some sampling period $T_t$, where $T = (N+\nu)T_t$, but there is no cyclic prefix or time-domain guard period.  In the FMT approach, $\nu$ is sometimes set to zero (unlike DMT).  The discrete-time FMT modulator implementation is illustrated in Figure 4.28.  The up-arrow device in Figure 4.28 is an interpolator that simply inserts $(N+\nu-1)$ zeros between successive possibly nonzero values.  The down-arrow device selects every $(N+\nu)^{th}$ sample from its input and ignores all others.  The down-arrow devices are presumably each offset successively by $T_t$ sample periods from one another.  In FMT approaches, each of the discrete-time basis functions is a modulated version of a single low-pass function
$$\varphi_n(t) = \varphi(t) \cdot e^{j\frac{2\pi}{T} \cdot \frac{N+\nu}{N}\, n t} \;. \qquad (4.233)$$
Each FMT tone, when $\nu > 0$, has excess bandwidth in that the tone spacing is $(1 + \frac{\nu}{N}) \cdot \frac{1}{T}$, like DMT, but without a cyclic prefix.  Said excess bandwidth can be used to obtain a realizable approximation to an ideal bandpass filter for all tones (subchannels).
A sampling time index is defined according to
$$m = k \cdot (N+\nu) + i \;, \qquad i = 0, ..., N+\nu-1 \;, \qquad (4.234)$$
where $i$ is an index of the sample offset within a symbol period and $k$ measures, as usual, symbol instants.  The samples $x_m = x(mT_t)$ of the continuous-time waveform are then (with $\varphi_n(mT_t) = \varphi_{n,m}$)
$$x_m = \sum_{j=-\infty}^{\infty} \sum_{n=0}^{N-1} X_{n,j}\, \varphi_{n,\,m-j(N+\nu)} \qquad (4.235)$$
$$= \sum_{j=-\infty}^{\infty} \sum_{n=0}^{N-1} X_{n,j}\, \varphi_{n,\,k(N+\nu)+i-j(N+\nu)} \;, \qquad i = 0, ..., N+\nu-1 \;. \qquad (4.236)$$
Figure 4.28:  FMT partitioning block diagram.
Each of the $N+\nu$ basis filters is also indexed by the sample offset $i$ and essentially decomposes into $\varphi_{n,\,k(N+\nu)+i-j(N+\nu)} = \varphi^{(i)}_{n,\,k-j}$, each of which can then be implemented at the symbol rate (rather than the original single filter at the sampling rate).  Further substitution of the expression in (4.233) finds
$$\varphi^{(i)}_{n,\,k-j} = \varphi^{(i)}_{k-j}\; e^{j\frac{2\pi}{N}\, n\left[(k-j)(N+\nu)+i\right]} \qquad (4.237)$$
$$= \varphi^{(i)}_{k-j}\; e^{j\frac{2\pi}{N}\, n\nu(k-j)} \cdot e^{j\frac{2\pi}{N}\, ni} \;, \qquad (4.238)$$
since $e^{j2\pi n(k-j)} = 1$.
Substitution of (4.238) into (4.236) provides, for all $i = 0, ..., N+\nu-1$,
$$x_{k(N+\nu)+i} = \sum_{j=-\infty}^{\infty} \sum_{n=0}^{N-1} X_{n,j}\, \varphi^{(i)}_{k-j}\; e^{j\frac{2\pi}{N}\, n\nu(k-j)} \cdot e^{j\frac{2\pi}{N}\, ni} \qquad (4.239)$$
$$= \sum_{j=-\infty}^{\infty} \varphi^{(i)}_{k-j} \sum_{n=0}^{N-1} X_{n,j}\; e^{j\frac{2\pi}{N}\, n\nu(k-j)} \cdot e^{j\frac{2\pi}{N}\, ni} \qquad (4.240)$$
$$= \sum_{j=-\infty}^{\infty} \varphi^{(i)}_{k-j}\; \tilde x_{i,j}(\nu[k-j]) \;, \qquad (4.241)$$
where $\tilde x_{i,j}(\nu[k-j])$ is the $N$-point inverse discrete Fourier transform of $X_{n,j}\, e^{j\frac{2\pi}{N}\, n\nu(k-j)}$ at the $i^{th}$ sampling offset.$^{21}$
Nominally, this IDFT would need to be recomputed for every $(k-j)$; however, one notes that the term $e^{j\frac{2\pi}{N}\, n\nu(k-j)}$, which depends on $(k-j)$, simply produces a circular shift of the time-domain output by $(k-j)\nu$ samples.  Figure 4.29 illustrates the use of the IDFT (or IFFT for fast computation) and notes that the filter inputs may be circularly shifted versions of previous symbols' IDFT outputs when $\nu \neq 0$.  If the FIR filter $\varphi^{(i)}_k$ has length $2L+1$, say indexed from time 0 in the middle out to $\pm L$, then previous (and future, implemented with delay) symbols would use IDFT output time-domain samples that correspond to (circular) shifts of increasing multiples of $\nu$ samples as they deviate from center.  When $\nu = 0$, no such shifting of the inputs from preceding and succeeding symbols is necessary.  This shifting is simply a mechanization of interpolation to the sampling rate associated with the excess bandwidth, which is $(1+\nu/N)$ times faster than with no excess bandwidth.
$^{21}$ Since $i = 0, ..., N+\nu-1$, the samples of $\tilde x_{i,j}(\nu[k-j])$ will be interpreted as periodic in $i$ with period $N$.
Figure 4.29:  FMT partitioning using DFTs and symbol-rate lters.
parameter               DMT                       FMT
$\Delta f$              $(1+\frac{\nu}{N})/T$     $(1+\frac{\nu}{N})/T$
symbol length           $N+\nu$ samples           $N+\nu$ samples
cyclic prefix length    $\nu$                     0
lost freq dimensions    0                         $\nu$
excess bandwidth        $\nu/N$                   $\nu/N$
$\varphi_k$             $\delta_k$                lowpass filter
delay                   $N+\nu$                   integer $\cdot\,(N+\nu)$
implementation          $N$-point FFT             $N$-point FFT plus symbol-rate filter
performance             likes small-$\nu$ channels   likes sharp bandpass channels

Table 4.5:  Comparison of DMT and FMT
Clearly, if $\nu = 0$ and $\varphi_k = \delta_k$, then FMT and DMT are the same (assuming the same $1/T = 1/(NT_t)$).  However, the two systems differ in how excess bandwidth is realized (DMT in a time-domain cyclic prefix, FMT in essentially wider tones than would otherwise be necessary without cyclic prefixes, by the same ratio $\nu/N$).  FMT also allows a shaping filter to be used to try to approximately synthesize ideal-multitone bandpass subchannels.  DMT and FMT are compared in Table 4.5.
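The degenerate equivalence claimed above ($\nu = 0$ and $\varphi_k = \delta_k$, i.e., a full-rate rectangular prototype of length $N$) can be verified by running the direct synthesis sum (4.235) against concatenated IDFT blocks; sizes below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
N, J = 8, 3                                  # tones and symbols (illustrative sizes)
X = rng.normal(size=(N, J)) + 1j * rng.normal(size=(N, J))

# Full-rate prototype = rectangular window of length N, which corresponds to
# symbol-rate polyphase filters phi^{(i)}_k = delta_k for every offset i.
x_direct = np.zeros(N * J, complex)
for m in range(N * J):                       # direct synthesis, Eq. (4.235) with nu = 0
    for j in range(J):
        k = m - j * N
        if 0 <= k < N:                       # inside the rectangular prototype
            x_direct[m] += np.sum(X[:, j] * np.exp(2j * np.pi * np.arange(N) * k / N))

x_dmt = (N * np.fft.ifft(X, axis=0)).T.reshape(-1)   # concatenated (scaled) IDFT blocks
```

The two waveforms agree sample for sample, confirming that the FMT structure contains DMT (without prefix) as a special case.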
Any FMT system will have at least some residual intersymbol interference and interchannel interference, hopefully minimized by a good choice of $\varphi_k$.  However, equalizers may be necessary on each subchannel, and possibly between a few adjacent subchannels, as shown in Figure 4.30.  Usually, the functions have such low out-of-band energy that only intersymbol interference on each subchannel is of concern, so the receiver reduces to the FFT followed by a bank of symbol-rate DFEs, as shown in Figure 4.30.
Filter  Selection for FMT
There are an enormous number of possible choices for the function $\varphi(t)$ in FMT.  Some authors like to choose wavelet functions that have very low sidelobes, but these often introduce intersymbol interference while eliminating crosstalk.  While wavelets may apply nicely in other areas of signal processing, they are largely ineffective in data transmission because they ignore the channel.  Thus, pursuit of a large
Figure 4.30:  FMT receiver block diagram assuming negligible or no inter-channel interference.
number of wavelet functions is probably futile and a digression from this text.
Instead, IBM has suggested a couple of functions for DSL applications that appear promising and were designed with cognizance of the class of channels seen in those applications:

EXAMPLE 4.6.6 (Cherubini's Coaster)  Cherubini's "coaster" functions use no excess bandwidth and have $\nu = 0$.  The coaster is actually a set of functions parameterized by $0 < \alpha < 1$ and the first-order IIR lowpass filter
$$\varphi(D) = \sqrt{\frac{1+\alpha}{2}} \cdot \frac{1+D}{1+\alpha D} \;. \qquad (4.242)$$
As $\alpha \to 1$, the functions become very sharp and the impulse response $\varphi_k$ is very long.  Figure 4.31 illustrates these functions for $\alpha = .1$.  Note the rapid decay with frequency, basically ensuring in practice that the basis functions remain orthogonal at the channel output.  Note also that there will be intersymbol interference, but as $\alpha \to 1$, this ISI becomes smaller and smaller and the functions approach the ideal multitone functions (except that this IIR filter is stable and realizable).  A DFE is typically used in the receiver with Cherubini's coaster.  The hope is that there is very little intersymbol or inter-channel interference, and $N$ can be reduced (because the issue of keeping the excess bandwidth $\nu/N$ small is moot when $\nu = 0$).  The discrete-time response is given by
$$\varphi_k = \sqrt{\frac{1+\alpha}{2}} \left[\delta_k + (1-\alpha)(-\alpha)^{k-1}\, u_{k-1}\right] \;. \qquad (4.243)$$
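The closed form (4.243) can be checked against the IIR recursion implied by (4.242), and the prototype's unit energy (needed for an orthonormal basis) verified; the truncation length below is illustrative:

```python
import numpy as np

alpha, K = 0.1, 200                          # alpha as in Figure 4.31; K taps evaluated
k = np.arange(K)
# Closed-form impulse response, Eq. (4.243)
phi = np.sqrt((1 + alpha) / 2) * np.where(
    k == 0, 1.0, (1 - alpha) * (-alpha) ** np.maximum(k - 1, 0))

# Same response from the recursion of sqrt((1+alpha)/2) * (1+D)/(1+alpha*D)
c = np.sqrt((1 + alpha) / 2)
iir = np.zeros(K)
for n in range(K):
    xn = 1.0 if n == 0 else 0.0              # impulse input
    xn1 = 1.0 if n == 1 else 0.0             # delayed impulse input
    iir[n] = c * (xn + xn1) - (alpha * iir[n - 1] if n > 0 else 0.0)

energy = np.sum(phi ** 2)                    # should be ~1: unit-energy prototype
```

The scale factor $\sqrt{(1+\alpha)/2}$ is exactly what normalizes the filter to unit energy, since $1 + (1-\alpha)^2/(1-\alpha^2) = 2/(1+\alpha)$.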
In Figure 4.31, a zero-order hold (flat interpolation) is used between $T_t$-spaced samples to provide a full Fourier transform of a continuous signal (such a zero-order hold being characteristic of the behavior of a DAC).  This effect causes the frequency rippling in the adjacent bands
Figure 4.31:  Cherubinis Coaster basis functions in frequency domain with   = .1.
that is shown, but it is quite low with respect to what would occur if $\varphi_k = \delta_k$, where this rippling would follow a $|T\,\mathrm{sinc}(fT)|$ shaping with first side lobe down only 13.5 dB from the peak, as in Figure 4.38 of Section 4.8, instead of the 55 dB shown with Cherubini's Coaster.
The second example uses excess bandwidth and raised-cosine shaping.

EXAMPLE 4.6.7 (Raised Cosine with $\nu > 0$)  Raised-cosine functions with excess bandwidth $\alpha = \nu/N$ can also be used.  In this case, the function $\varphi^{(i)}_k$ can be simplified from the square-root raised cosine in Section 3.2 to
$$\varphi^{(i)}_k = \frac{4\alpha}{\sqrt{\left(1+\frac{\nu}{N}\right) T_t}} \cdot \frac{\cos\left[\pi(1+\alpha)\left(k+\frac{i}{N}\right)\right] + \frac{1+\nu/N}{4\pi\left[k+\frac{i}{N}\right]} \sin\left[\pi\left(k+\frac{i}{N}\right)\frac{N}{N+\nu}\right]}{1 - 16\alpha^2\left(k+\frac{i}{N}\right)^2} \;. \qquad (4.244)$$
(Warning: the expression above was first evaluated by the professor and has not been checked by plotting or otherwise.)

This function, with a sufficiently large number of taps in an FIR approximation (for given excess bandwidth, smaller $\nu/N$ means a longer FIR filter in the approximation), can be made to have sufficiently low out-of-band energy, and it clearly avoids intersymbol interference if the tones are sufficiently narrow.
4.7   Multichannel  Channel  Identication
Generally, channel identication methods measure the channel pulse response  and noise power spectral
density, or their equivalents (that is direct measurement of SNRs without measuring pulse response and
noise separately).   Multi-channel channel identication directly estimates signal and noise parameters for
each of the subchannels.   Many channel identication methods exist, but only a DMT-specic method for
channel identication appears in this Chapter.   Other methods are discussed in Chapter 7.  The method of
this section separately identies rst the pulse-response DFT values at the subchannel center frequencies
used by a DMT system and then second the noise-sample variances at these same frequencies.   Channel
identication then nishes by computing the SNRs from the results of gain and noise estimation.   The
measurement  of  the  pulse-response  DFT  values  is  called  gain  estimation
22
,   while  the  measurement
of the  noise  variances is  called noise  estimation.   This sections  particular method  is in common use
because  it  reuses  the  same  hardware/software  as  steady-state  transmission  with  DMT;  however,   it  is
neither optimum nor necessarily  a good choice in terms of performance for limited data observation.   It
is easy to implement.   Fortunately, a lengthy training interval is often permitted for DMT-transmission
applications,   and  this  method  will  become  very  accurate  with  sucient   observation.   The  presented
method is usually not appropriate for wireless OFDM systems.
The channel noise is an undesired disturbance in gain estimation, and this noise needs to be averaged
in determining subchannel gains as in Subsection 4.7.1.  When gain estimation is complete, the estimated
(noiseless) channel output can be subtracted from the actual channel output to form an estimate of the
channel  noise.   The  spectrum  of  this  channel  noise  can  then  be  estimated  as  in  Section  4.7.2.   In  gain
estimation,  care  is  taken  to  ensure  that  any  residual  gain-estimation error  is  suciently  small  that  it
does  not signicantly corrupt  the noise-spectral-estimation  process.   Such  small error  can require  large
dynamic range in estimating the subchannel gains.
Equation (4.139) shows that the channel SNR $g_n$ can be computed from the known subchannel input distance, $d_n$, the known multichannel normalized output distance $d$, and the noise estimate $\hat\sigma^2_n$.  During training, it is possible to send the same constellation on all subchannels, thus essentially allowing $d_n = d$, and thus making
$$g_n = \frac{1}{\hat\sigma^2_n} \;. \qquad (4.245)$$
Two questions were not addressed, however, by Subsection ??: individually, the gain constants $\mu$ and $\mu_t$ for the updating loops for the multichannel normalizer $W_n$ in (4.133) and for the noise-variance estimation in (4.137), respectively.  These values will be provided shortly, but a more direct consideration of the computation of $g_n$ will be addressed for the special case of training first.  The choice of $\mu$ is an example of the general problem of gain estimation in Subsection 4.7.1, which is the estimation of the subchannel signal gain.  The choice of the second gain constant $\mu_t$ is an example of the general problem of noise estimation in Subsection 4.7.2, which is the estimation of the unscaled subchannel noise.
4.7.1   Gain  Estimation
Figure 4.32 illustrates gain estimation.  The known training sequence, $x_k$, is periodic with period $N$ equal to or greater than the number of coefficients in the unknown channel pulse response $p_k$ ("periodic" can mean use of the cyclic prefix of Section 4.4).  The channel output sequence is
$$y_k = x_k * p_k + u_k \;, \qquad (4.246)$$
where $u_k$ is an additive noise signal assumed to be uncorrelated with $x_k$.  The channel distortion sequence is denoted by $u_k$, instead of $n_k$, to avoid confusion of the frequency index $n$ with the noise sequence, and because practical multichannel implementations may have some residual signal elements in the distortion sequence $u_k$ (even though independence from $x_k$ is assumed for simplification of the mathematics).

Gain estimation constructs an estimate of the channel pulse response, $\hat p_k$, by minimizing the mean square of the error
$$e_k = y_k - \hat p_k * x_k \;. \qquad (4.247)$$
22
This  gain  is  usually a  complex gain  meaning that both  subchannel gain  and phase information are estimated.
Figure 4.32:  Channel-identication basics and terminology.
Ideally, $\hat p_k = p_k$.

The Discrete Fourier Transform (DFT) of $x_k$ is
$$X_n = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} x_k\, e^{-j\frac{2\pi}{N}\, kn} \;. \qquad (4.248)$$
When $x_k$ is periodic$^{23}$ with period $N$ samples, the DFT samples $X_n$ are also periodic with period $N$ and constitute a complete representation for the periodic time sequence $x_k$ over all time.  The $N$-point DFTs of $y_k$, $u_k$, $\hat p_k$, $p_k$, and $e_k$ are $Y_n$, $U_n$, $\hat P_n$, $P_n$, and $E_n$, respectively.  Since $x_k$ is periodic, then
$$Y_n = P_n \cdot X_n + U_n \;. \qquad (4.249)$$
The corresponding frequency-domain error is then
$$E_n = Y_n - \hat P_n \cdot X_n \;, \qquad n = 0, ..., \bar N \;. \qquad (4.250)$$
(Recall that subchannels 0 and $\bar N$ are one-dimensional in this development.)  A vector of samples will contain one period of samples of any of the periodic sequences in question, for instance
$$x = \begin{bmatrix} x_{N-1} \\ \vdots \\ x_1 \\ x_0 \end{bmatrix} \;. \qquad (4.251)$$
Then the relations
$$X = Q^*\, x \;, \qquad x = Q\, X \qquad (4.252)$$
$^{23}$ The periodicity can also be emulated by a cyclic prefix of length $\nu$ samples at the beginning of each transmitted block.
follow easily, where $Q$ is the $N \times N$ matrix with entries $Q_{k,i} = \frac{1}{\sqrt{N}}\, e^{j\frac{2\pi}{N}\, ki}$ and $QQ^* = Q^*Q = I$.  Similarly, $y$, $Y$, $e$, $E$, $p$, $P$, $\hat p$, $\hat P$, $u$, and $U$ are corresponding time- and frequency-domain vectors.  For the case of the non-periodic sequences $u_k$, and thus $y_k$ and $e_k$, a block time index $l$ differentiates between different channel-output blocks (symbols) of $N$ samples, i.e., $y_l$.
The MSE for any estimate $\hat p$ is then
$$\mathrm{MSE} = E\left[|e_k|^2\right] \qquad (4.253)$$
$$= \frac{1}{N}\, E\left[\|e\|^2\right] = \frac{1}{N} \sum_{k=0}^{N-1} E\left[|e_k|^2\right] \qquad (4.254)$$
$$= \frac{1}{N}\, E\left[\|E\|^2\right] = \frac{1}{N} \sum_{n=0}^{\bar N} E\left[|E_n|^2\right] \;. \qquad (4.255)$$
The tap error is $\epsilon_k = p_k - \hat p_k$ in the time domain and $\Psi_n = P_n - \hat P_n$ in the frequency domain.  The autocorrelation sequence for any discrete-time sequence $x_k$ is defined by
$$r_{xx,k} = E\left[x_m\, x^*_{m-k}\right] \;, \qquad (4.256)$$
with corresponding discrete power spectrum
$$R_{xx,n} \;, \text{ the DFT of } r_{xx,k} \;. \qquad (4.257)$$
The norm tap error (NTE) is defined as
$$\mathrm{NTE} = E\left[\|p - \hat p\|^2\right] = E\left[\|\epsilon\|^2\right] \qquad (4.258)$$
$$= E\left[\|P - \hat P\|^2\right] = E\left[\|\Psi\|^2\right] \qquad (4.259)$$
$$= \sum_{n=0}^{\bar N} E\left[|P_n - \hat P_n|^2\right] \;. \qquad (4.260)$$
When the estimate $\hat p$ equals the channel pulse response $p$, then
$$e_k = u_k \qquad (4.261)$$
and
$$r_{ee,k} = E\left[e_m\, e^*_{m-k}\right] = r_{uu,k} \;, \qquad k = 0, ..., N-1 \;, \qquad (4.262)$$
where $r_{ee,k}$ is the autocorrelation function of $e_k$, and $r_{uu,k}$ is the autocorrelation function of $u_k$.  The MSE is then, ideally,
$$\mathrm{MSE} = r_{ee,0} = \sigma^2_u \;, \qquad (4.263)$$
the mean-square value of the channel noise samples.  $\sigma^2_u = \frac{\mathcal{N}_0}{2}$ for the AWGN channel.  In practice, $\hat p_k \neq p_k$, and the excess MSE is
$$\mathrm{EMSE} = \mathrm{MSE} - \sigma^2_u \;. \qquad (4.264)$$
The overall channel SNR is
$$\mathrm{SNR} = \frac{r_{xx,0}\, \|p\|^2}{r_{uu,0}} \qquad (4.265)$$
and the SNR of the $n^{th}$ subchannel is
$$\mathrm{SNR}_n = \frac{R_{xx,n}\, |P_n|^2}{R_{uu,n}} \;. \qquad (4.266)$$
The gain-estimation SNR, $\gamma_n$, is a measure of the true channel-output signal power on any subchannel relative to the excess MSE in that same subchannel (which is caused by $\hat p \neq p$), and is
$$\gamma_n = \frac{R_{xx,n}\, |P_n|^2}{R_{ee,n} - R_{uu,n}} \;. \qquad (4.267)$$
EXAMPLE 4.7.1 (Accuracy)  $\gamma_n$ should be 30-60 dB for good gain estimation.  To understand this requirement on $\gamma_n$, consider the case of a subchannel in DMT with $\mathrm{SNR}_n = \frac{R_{xx,n}|P_n|^2}{R_{uu,n}} = 44$ dB, permitting 2048 QAM to be transmitted on this subchannel.  The noise is then 44 dB below the signal on this subchannel.

To estimate this noise power to within .1 dB of its true value, any residual gain-estimation error on this subchannel, $R_{ee,n} - R_{uu,n}$, should be well below this mean-square noise level.  That is,
$$R_{ee,n} < 10^{.1/10}\, R_{uu,n} \approx \left(1 + \frac{1}{40}\right) R_{uu,n} \;. \qquad (4.268)$$
Then,
$$\gamma_n = \frac{R_{xx,n}\, |P_n|^2}{R_{ee,n} - R_{uu,n}} = \frac{\mathrm{SNR}_n}{(1/40)} = 40 \cdot \mathrm{SNR}_n \qquad (4.269)$$
is then 60 dB ($= 44 + 16$ dB).  Similarly, on a comparatively weak subchannel for which $\mathrm{SNR}_n = 14$ dB for 4-point QAM, one would need $\gamma_n$ of at least 30 dB ($= 14 + 16$).
In general, for $x$ dB accuracy,
$$\gamma_n \geq \left(10^{x/10} - 1\right)^{-1} \cdot \mathrm{SNR}_n \;. \qquad (4.270)$$
An Accurate Gain Estimation Method

A particularly accurate method for gain estimation uses a cyclically prefixed training sequence $x_k$ with period, $N$, equal to or slightly longer than $\nu$, the length of the equalized channel pulse response.  Typically, $N$ is the same as in the DMT transmission system.  The receiver measures (and possibly averages over the last $L$ of $L+1$ cycles) the corresponding channel output, and then divides the DFT of the channel output by the DFT of the known training sequence.

The channel estimate in the frequency domain is
$$\hat P_n = \frac{1}{L} \sum_{l=1}^{L} \frac{Y_{l,n}}{X_{l,n}} \;. \qquad (4.271)$$
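A Monte Carlo sketch of (4.271), with an assumed short channel and assumed noise level, shows the averaged-division estimator and an excess MSE close to the predicted $R_{uu,n}/L$:

```python
import numpy as np

rng = np.random.default_rng(3)
N, L = 32, 40                                  # FFT size; L = 40 training symbols as in the text
P = np.fft.fft(rng.normal(size=5), N)          # true channel DFT (short pulse response, assumed)
Xtr = np.exp(2j * np.pi * rng.random((L, N)))  # constant-magnitude training symbols, |X| = 1
sigma2 = 0.005                                 # per-bin noise variance R_uu (assumed)
U = np.sqrt(sigma2 / 2) * (rng.normal(size=(L, N)) + 1j * rng.normal(size=(L, N)))
Y = P * Xtr + U                                # receiver DFT outputs, Eq. (4.249)

Phat = np.mean(Y / Xtr, axis=0)                # averaged division, Eq. (4.271)
emse = np.mean(np.abs(Phat - P) ** 2)          # ~ sigma2 / L, per Eq. (4.272)
```

The measured excess MSE sits near `sigma2 / L`, the linear-in-$L$ improvement derived below in (4.277).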
Performance Analysis   The output of the $n^{th}$ subchannel on the $l^{th}$ symbol produced by the receiver DFT is $Y_{l,n} = P_n X_{l,n} + U_{l,n}$.  The estimated value $\hat P_n$ is
$$\hat P_n = P_n + \frac{1}{L} \sum_{l=1}^{L} \frac{U_{l,n}}{X_{l,n}} \;, \qquad (4.272)$$
so
$$\Psi_n = -\frac{1}{L} \sum_{l=1}^{L} \frac{U_{l,n}}{X_{l,n}} \;. \qquad (4.273)$$
By selecting the training sequence, $X_{l,n}$, with constant magnitude, the training sequence becomes $X_{l,n} = |X|\, e^{j\theta_{l,n}}$.  The error signal on this subchannel after $\hat P_n$ has been computed is
$$E_n = Y_n - \hat P_n X_n = \Psi_n X_n + U_n \qquad (4.274)$$
$$= U_n - \frac{1}{L} \sum_{l=1}^{L} U_{l,n}\, e^{j(\theta_n - \theta_{l,n})} \;. \qquad (4.275)$$
The phase term on the noise will not contribute to its mean-square value.  $U_n$ corresponds to a block that is not used in training, so that $U_n$ and each of the $U_{l,n}$ are independent, and
$$E\left[|E_n|^2\right] = R_{ee,n} = R_{uu,n} + \frac{1}{L}\, R_{uu,n} = \left(1 + \frac{1}{L}\right) R_{uu,n} \;. \qquad (4.276)$$
The excess MSE on the $n^{th}$ subchannel then has variance $(1/L)\, R_{uu,n}$, reduced with respect to the mean-square noise in that subchannel by a factor equal to the number of symbol lengths that fit into the entire training-interval length.  Thus the gain-estimation SNR is
$$\gamma_n = L \cdot \frac{R_{xx,n}\, |P_n|^2}{R_{uu,n}} = L \cdot \mathrm{SNR}_n \;, \qquad (4.277)$$
which means that the performance of the recommended gain-estimation method improves linearly with $L$ on all subchannels.  Thus, while noise may be relatively higher (low $\mathrm{SNR}_n$) on some subchannels than on others, the extra noise caused by misestimation is a constant factor below the relative signal power on that same subchannel.  In general, then, for $x$ dB of accuracy,
$$L \geq \left(10^{x/10} - 1\right)^{-1} \;. \qquad (4.278)$$
A 16 dB improvement in $\gamma_n$, which as determined in the earlier example provides .1 dB error or less no matter what the subchannel $\mathrm{SNR}_n$, requires $L = 40$ symbols of training for gain estimation.  For instance, the previous ADSL and VDSL examples of Section 4.3 used symbol periods of 250 $\mu$s.  Thus, gain estimation to within .1 dB accuracy is possible (on average) within 10 ms using this simple method.
Windowing   When the FFT size is much larger than the number of significant channel taps, as is usually the case, it is possible to significantly shorten the training period.  This is accomplished by transforming (IDFT) the final channel estimate to the time domain, windowing to $\nu+1$ samples, and then returning to the frequency domain.  Essentially, all the noise on sample times $\nu+2, ..., N$ is thus removed without disturbing the channel estimate.  The noise on all estimates, or equivalently the residual error in the SNR, improves by the factor $N/\nu$ with windowing.  For the ADSL/VDSL example again, if the channel were limited to 32 nonzero successive samples in duration, then windowing would provide a factor of $(32/512)$ reduction, or 12 dB of improvement, possibly allowing just a few DMT symbols to achieve an (on average) accuracy of .1 dB.
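The windowing gain can be checked directly: with the channel confined to the first $\nu$ samples (assumed values below), zeroing the time-domain estimate beyond $\nu+1$ samples removes most of the estimation-error energy, roughly $N/\nu \approx 12$ dB for the 32/512 case:

```python
import numpy as np

rng = np.random.default_rng(4)
N, nu = 512, 32                                   # FFT size and channel-length bound (as in the example)
p = np.zeros(N); p[:nu] = rng.normal(size=nu)     # channel confined to nu samples
P = np.fft.fft(p)                                 # true frequency response
Psi = 0.01 * (rng.normal(size=N) + 1j * rng.normal(size=N))  # residual estimation error (assumed white)
Phat = P + Psi                                    # noisy frequency-domain estimate

phat = np.fft.ifft(Phat)                          # to the time domain
phat[nu + 1:] = 0                                 # window to nu+1 samples
Pwin = np.fft.fft(phat)                           # back to the frequency domain

gain = np.mean(np.abs(Phat - P) ** 2) / np.mean(np.abs(Pwin - P) ** 2)
```

The measured `gain` fluctuates around $N/(\nu+1) \approx 15.5$ (about 12 dB), because only the fraction of error energy landing in the kept samples survives.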
Suggested Training Sequence

One training sequence for the gain-estimation phase is a random, or "white," sequence.$^{24}$  $L$ blocks are used to create $\hat P_n$, which is computed recursively for each of the subchannels, one symbol at a time, by accumulation, and then divided by $L$.  The $L$ blocks can be periodic repeats, or a cyclic prefix can be used to make the training sequence appear periodic.  Because of small unidentified channel aberrations, like small nonlinearities caused by drivers, converters, and amplifiers, it is better to use a cyclic prefix, because any such aberrations will then appear in the noise estimation to be subsequently described.
On complex baseband channels, a "chirp" training sequence sometimes appears convenient:
$$x_k = e^{j\frac{\pi}{N}\, k^2} \;. \qquad (4.279)$$
This chirp has the theoretical minimum peak-to-average (one-dimensional) power ratio of 2 and is a white sequence, with $r_{xx,k} = \delta_k$, $\delta_k$ here being the Kronecker delta (not the norm tap error).  While white is good for identifying an unknown channel, a chirp's low PAR can leave certain small nonlinear noise distortion unidentified, which would ultimately cause the identified $\mathrm{SNR}_n$'s to be too optimistic.
The value for $\mu$ in the FEQ   The EMSE for the first-order zero-forcing update algorithm in Equation (4.133) can be found, with some algebra, to be
$$\mathrm{EMSE} = \frac{\mu\, c_n}{2 - \mu\, c_n}\; R_{uu,n} \;. \qquad (4.280)$$
$^{24}$ Say, the PRBS of Chapter 10 with polynomial $1 + D + D^{16}$, with $b_n = 2$ on channels $n = 1, ..., \bar N - 1$ and $b_n = 1$ on the DC and Nyquist subchannels.  One needs to ensure that the seed for the PRBS is such that no peaks occur that would exceed the dynamic range of channel elements during gain estimation.  Such peaks are usually compensated by a variety of other mechanisms when they occur in normal transmission (rarely) and would cause unnecessarily conservative loading if their effects were allowed to dominate measured SNRs.
Thus, for the multichannel normalizer to maintain the same accuracy as the initial gain estimation in Equation (4.276),
$$\frac{1}{L} = \frac{\mu\, c_n}{2 - \mu\, c_n} \qquad (4.281)$$
or
$$\mu = \frac{2}{(L+1)\, c_n} \;, \qquad (4.282)$$
so for $L = 40$ and $c_n = 1$, the value would be $\mu = 2/41 \approx .05$.
4.7.2   Noise Spectral Estimation

It is also necessary to estimate the discrete noise power spectrum$^{25}$, or equivalently the mean-square noise at each of the demodulator DFT outputs, for computation of the SNRs necessary for the bit distribution.  Noise estimation computes a spectral estimate of the error sequence, $E_n$, that remains after the channel response has been removed.  The training sequence used for gain estimation is usually continued for noise estimation.
Analysis of Variance

The variance of $L$ samples of noise on the $n^{th}$ subchannel is
$$\hat\sigma^2_n = \frac{1}{L} \sum_{l=1}^{L} |E_{l,n}|^2 \;. \qquad (4.283)$$
For stationary noise on any tone, the mean of this expression is the variance of the (zero-mean) noise.  The variance of this variance estimate is computed as
$$\mathrm{var}\left[\hat\sigma^2_n\right] = \frac{1}{L^2}\left[3L\,\sigma^4_n - L\,(\sigma^2_n)^2\right] = \frac{2}{L}\,\sigma^4_n \qquad (4.284)$$
for Gaussian noise.  The excess noise in the estimate of the noise is therefore $\sqrt{2/L}\; \sigma^2_n$, and the estimate's noise decreases with the square root of the number of averaged noise terms.  To make this extra noise .1 dB or less, we need $L = 3200$.  Recalling that the actual computed value for the noise estimate will have a standard deviation that on average leads to less than .1 dB error, there is still a 10% chance that the statistics of the noise will lead to deviation greater than 1 standard deviation (assuming a Gaussian distribution, or approximating the more exact Chi-square with a Gaussian).  Thus, sometimes a yet larger value of $L$ is used.  98% confidence occurs at $L = 12{,}800$, and 99.9% confidence at $L = 28{,}800$.
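The $\sqrt{2/L}$ spread of (4.284) is easy to confirm by simulation; with $L = 3200$, the relative standard deviation of the variance estimate is $1/40$, i.e., about 0.1 dB:

```python
import numpy as np

rng = np.random.default_rng(5)
sigma2, L, trials = 1.0, 3200, 400
E = rng.normal(size=(trials, L))              # real Gaussian noise, true variance sigma2 = 1
est = np.mean(E ** 2, axis=1)                 # variance estimate of Eq. (4.283), one per trial
rel_std = np.std(est) / sigma2                # should be ~ sqrt(2/L) = 1/40
db_spread = 10 * np.log10(1 + rel_std)        # ~0.1 dB for L = 3200
```

Raising $L$ beyond 3200 shrinks this spread only as $1/\sqrt{L}$, which is why the confidence-driven values 12,800 and 28,800 are so much larger.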
EXAMPLE 4.7.2 (ADSL)  The ADSL example with symbol rate of 4000 Hz has been discussed previously.  The training segment for SNR estimation in G.992.1, initially called "Medley," consists of 16000 symbols (with QPSK on all subchannels) for 4 seconds of training.  This exceeds the amount given as a guideline above; this extra time is provided with the cognizance that while the standard deviation of the measured noise and channel should lead to no more than .1 dB error with a smaller training interval, there are still statistical deviations with real channels.  Essentially, with this increased value, about 99% of the time the computed value will deviate by this amount of .1 dB or less.  The gain-estimation error clearly can be driven to a negligible value by picking $L$ of a few hundred without significantly affecting training time.  (The long training interval also provides time for SNR computation and loading algorithms after gains and noises are estimated.)
25 Clearly the noise cannot have been whitened if it is unknown, so this step precedes any noise whitening that might occur with later data transmission.
The value for \mu_t in the FEQ   The standard deviation for the first-order noise update in the FEQ can be found to be

\text{std}\left(\hat{\sigma}_n^2\right) = \sqrt{\frac{\mu_t}{2 - \mu_t}}\;\sigma_n^2 .   (4.285)

Thus, to maintain a constant noise accuracy, the step size should satisfy

\frac{2}{L} = \frac{\mu_t}{2 - \mu_t}   (4.286)

or

\mu_t = \frac{4}{L + 2} ,   (4.287)
so when L = 6400, then \mu_t \approx 6.25 \times 10^{-4}. In reality, the designer might want high confidence (not just an average confidence) that the accuracy is .1 dB, in which case a \mu_t of perhaps 10^{-4} would make the chance of missing the same as that of noise occurring at 4 times the standard deviation (pretty small in Gaussian and Chi-square distributions).
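The step-size arithmetic of (4.286)-(4.287) can be checked directly; a minimal sketch:

```python
def feq_step_size(L):
    # mu_t/(2 - mu_t) = 2/L  (4.286)  rearranges to  mu_t = 4/(L + 2)  (4.287)
    return 4.0 / (L + 2)

mu_t = feq_step_size(6400)   # approximately 6.25e-4, as quoted in the text
```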
4.7.3   SNR  Computation
The signal-to-noise ratio is then estimated for each subchannel independently as

SNR_n = \frac{\mathcal{E}_n\,|P_n|^2}{\sigma_n^2} .   (4.288)
Equivalently, using the FEQ, the SNR is given by (4.245). Loading can be executed in the receiver (or, after passing SNR values, in the transmitter), and then the resulting b_n and \mathcal{E}_n (or the relative increase/decrease in \mathcal{E}_n with respect to what was used last) are sent to the transmitter via the reverse link that is always necessary in DMT. This step is often called "the exchange" in DMT modems and occurs as a last step prior to live transmission of data, often called "showtime!"
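A minimal sketch of the per-subchannel SNR computation of (4.288); the example gains and noise variances below are hypothetical:

```python
def subchannel_snrs(energies, gains, noise_vars):
    # SNR_n = E_n * |P_n|^2 / sigma_n^2 for each tone independently (eq. 4.288)
    return [En * abs(Pn) ** 2 / s2
            for En, Pn, s2 in zip(energies, gains, noise_vars)]

# three hypothetical tones with unit energy and equal noise variance
snrs = subchannel_snrs([1.0, 1.0, 1.0],
                       [1.0, 0.5 + 0.5j, 0.1],
                       [0.01, 0.01, 0.01])
```

These SNR values are what a loading algorithm (Section 4.3) would consume to produce the b_n and \mathcal{E}_n exchanged with the transmitter.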
4.8   Stationary  Equalization  for  Finite-length  Partitioning
A potential problem in the implementation of DMT/OFDM and VC channel-partitioning methods is the use of the cyclic prefix or guard period of an extra \nu samples, where \nu + 1 is the length in sampling periods of the channel pulse response. The required excess bandwidth is then \nu/N. On many practical channels, \nu can be large: values of several hundred can occur in both wireless and wired transmission problems, especially when channels with narrowband noise are considered. To minimize the excess bandwidth, N needs to be very large, potentially 10,000 samples or more. Complexity is still minimal with DMT/OFDM methods with large N, when measured in terms of operations per unit time interval. However, large N implies a large memory requirement (to store the bit tables, energy tables, FEQ coefficients, and intermediate FFT/IFFT results), often dominating the implementation. Further, large N implies longer latency (delay from transmitter input to receiver output) in processing. Long latency complicates synchronization and can disrupt certain higher-level data protocols and/or certain types of carried application signals (like voice signals).
One solution to the latency problem, invented and first studied by J. Chow (not to be confused with the P. Chow of loading algorithms), is to use a combination of short-length fixed (i.e., stationary) equalization and DMT/OFDM (or VC) to reduce \nu, and thereby the required N, to reach the highest performance levels with less memory and less latency. The equalizer used is known as a time equalizer (TEQ), and is studied in this section. Chow's basic MMSE approach is shown here to result in an eigenvector-solution problem that can also be cast in many other generalized-eigenvector forms. Various authors have studied the TEQ since J. Chow and have claimed improvements; actually, the basic theory developed by Chow is better than any of the subsequent generalized-eigenvector work.26 Al-Dhahir and Evans have developed yet better theories for the TEQ, but the solution is not easily obtained in closed form and requires advanced numerical methods for non-convex optimization; Chow's TEQ usually performs very close to those methods. Essentially these methods attempt to maximize the overall SNR_{dmt} by optimizing equalizer settings and energy/bit distributions jointly. A simpler approach using cyclic reconstruction appears later in this chapter.
Subsection 4.8.1 studies the general data-aided equalization problem with infinite-length equalizers. The TEQ is there seen to be a generalization of the feedforward section of a DFE. Subsection 4.8.2 then reviews performance calculation of the finite-length TEQ for a target \nu that is potentially less than the channel pulse-response length. Subsection 4.8.3 illustrates the zero-forcing white-input specialization of Chow's TEQ, which in that special lower-performance situation is found to be equivalent to many designs developed by Melsa and Rohrs, by Gatherer, by Bingham, by Pal, and by Ilani using what are called "windows" and "walls" (or alternatively a series of projections). An example introduced in Subsection 4.8.2 and revisited in Subsection 4.8.3 illustrates why the latter approaches might be avoided. Subsection 4.8.5 investigates the joint loading/equalizer design problem and then suggests a solution that could be implemented.
The very stationarity of the TEQ renders it suboptimum because the DMT/OFDM or VC system is inherently cyclostationary with period N + \nu samples. There are 3 simplifications in the development of the TEQ that are disturbing:

1. The equalizer is stationary and should instead be periodic with period N + \nu.

2. The input sequence x_k will be assumed to be white or i.i.d. in the analysis, ignoring any loading that might have changed its autocorrelation.

3. A scalar mean-square error is minimized rather than the multichannel SNR, SNR_{m,u}.
Nonetheless, TEQs are heavily used and often very effective in bringing performance very close to the infinite-length multitone/modal-modulation optimum. An optimum equalizer is cyclostationary or periodic and found in Chapter 5, but necessarily more complex. When such a cyclic equalizer is used, problems 2 and 3 are eliminated. If the stationary fixed equalizer is used, Al-Dhahir and Evans have considered solutions to problems 2 and 3, and a newer solution is also posed in Subsection 4.8.5.
26 Most of the latter generalized-eigenvector work focuses its criticism on an algorithm given by Chow to be implemented in the frequency domain, and thus fails to observe the early theory that the algorithm tried to approximate in actual implementation. His implementation algorithm to approximate the theory is not the best to use today, but the basic theory still provides a better equalizer.
Figure 4.33:  The Maximum Likelihood Innite-Length Equalization problem.
4.8.1   The infinite-length TEQ
The general baseband complex decision-aided, or in our study input-aided, equalization problem appeared in Figure 4.33. The channel output y_p(t) is

y_p(t) = \sum_k x_k\, p(t - kT) + n_p(t) .   (4.289)
The channel output signal y_p(t) is convolved with a matched (or anti-alias) filter p^*(-t) (where a superscript of * denotes conjugate), sampled at rate 1/T, and then filtered by a linear equalizer. The matched-filter output is information lossless. When the equalizer is invertible, its output is also information lossless with respect to the channel input. Thus, a maximum-likelihood detector applied to the signal at the output of the equalizer w_k in Figure 4.33 is equivalent in performance to the overall maximum-likelihood detector for the channel.
Heuristically, the function of the linear equalizer is to make the equalizer output appear like the output of the second filter, which is the convolution of the desired channel shaping function b_k and the known channel input x_k. The target b_k has a length of \nu + 1 samples that is usually much less than the length of the channel pulse response. The minimum mean-square error between the equalizer output and the desired channel shaping is a good measure of equalizer performance. However, forcing the channel's response to fall mainly within \nu + 1 sample periods only loosely suggests that the geometric SNR for the set of parallel subchannels will be maximized. Also, the imposition of a fixed intersymbol-interference pattern, rather than one that varies throughout a symbol period, is also questionable. Nonetheless, this method is effective in a large number of practical situations. Further development requires some review and definition of mathematical concepts related to the channel.
Minimum Mean-Square Error Equalization
The error sequence in Figure 4.33 is

e_k = b_k * x_k - w_k * y_k ,   (4.290)

where * denotes convolution. The signal is defined as the sequence b_k * x_k, which has signal energy \mathcal{E}_s = \|b\|^2\,\mathcal{E}_x. The D-transform of a sequence x_k remains X(D) \triangleq \sum_k x_k D^k. The notation X^*(D^{-*}) = \sum_k x_k^* D^{-k} represents the time-reversed conjugate of the sequence. Thus,

E(D) = B(D)\,X(D) - W(D)\,Y(D) .   (4.291)
The minimum mean-square error is

\sigma^2_{MMSE} = \min_{w_k,\,b_k} E\left[|e_k|^2\right] .   (4.292)
A trivial solution is W(D) = B(D) = 0 with \sigma^2_{MMSE} = 0. Aversion of the trivial solution requires positive signal power so that \mathcal{E}_s > 0 or, equivalently, \|b\|^2 = \text{constant} = \|p\|^2 > 0. For flat white excitation (i.e., S_x(D) = \mathcal{E}_x) of the channel defined by B(D),

SNR_{MFB} = \frac{\mathcal{E}_s}{\sigma^2_{MMSE}} .   (4.293)
When \|b\|^2 = \|p\|^2 > 0 is constrained constant, SNR_{MFB} is independent of \|b\|^2. Maximizing SNR_{MFB} is then equivalent to minimizing \sigma^2_{MMSE}. While this problem formulation is the same as the formulation of decision-feedback equalization, the TEQ does not restrict B(D) to be causal nor monic.
The orthogonality principle of MMSE estimation states that the error sequence values e_k should be orthogonal to the inputs of estimation, some of which are the samples y_k. Thus, E[e_k\,y_l^*] = 0 \;\forall\; k, l, which is compactly written

R_{ey}(D) = 0   (4.294)
          = E\left[E(D)\,Y^*(D^{-*})\right]   (4.295)
          = B(D)\,\bar{R}_{xy}(D) - W(D)\,\bar{R}_{yy}(D) .   (4.296)

Then,

W(D) = B(D)\,\frac{\bar{R}_{xy}(D)}{\bar{R}_{yy}(D)} .   (4.297)
Then,

Y(D) = \|p\|\,Q(D)\,X(D) + n(D) ,   (4.298)

where Q(D) is the D-transform of the normalized pulse autocorrelation of Chapter 3, q_k = \frac{1}{\|p\|^2}\,p(t)*p^*(-t)\big|_{t=kT} (the folded/aliased spectrum of \left|P\right|^2/\|p\|^2). Then,

\bar{R}_{xy}(D) = \mathcal{E}_x\,\|p\|\,Q(D)   (4.299)
\bar{R}_{yy}(D) = \mathcal{E}_x\,\|p\|^2\,Q(D)\left[Q(D) + \frac{1}{SNR_{MFB}}\right] .   (4.300)
Recalling the canonical factorization from Chapter 3 DFE design,

Q(D) + \frac{1}{SNR_{MFB}} = \gamma_0\,G(D)\,G^*(D^{-*}) .   (4.301)

This factorization is sometimes called the "key equation."
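For a real channel whose normalized autocorrelation has a single off-diagonal term, the key equation reduces to a quadratic in the single root g of a monic G(D) = 1 + gD. The sketch below (our own illustrative numbers, not from the text) performs this factorization and can be verified term by term against (4.301):

```python
import math

def factor_key_equation(q0, q1, snr_mfb):
    # factor R(D) = q1*D^{-1} + (q0 + 1/SNR_MFB) + q1*D  as
    # gamma0 * (1 + g*D) * (1 + g*D^{-1}) with |g| < 1 (minimum phase);
    # matching terms gives gamma0*(1 + g^2) = q0 + 1/SNR and gamma0*g = q1
    r0 = q0 + 1.0 / snr_mfb
    g = (r0 - math.sqrt(r0 ** 2 - 4 * q1 ** 2)) / (2 * q1)   # root with |g| < 1
    gamma0 = q1 / g
    return gamma0, g

gamma0, g = factor_key_equation(q0=1.0, q1=0.45, snr_mfb=50.0)
```

With these numbers the factorization gives gamma0 = 0.75 and g = 0.6, and re-expanding gamma0*(1 + g*D)*(1 + g*D^{-1}) recovers the original three terms.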
The error autocorrelation sequence is

R_{ee}(D) = B(D)\left[\bar{R}_{xx}(D) - \frac{\bar{R}_{xy}(D)}{\bar{R}_{yy}(D)}\,\bar{R}_{yx}(D)\right] B^*(D^{-*}) ,   (4.302)

where the reader may recognize the autocorrelation function of the MMSE linear-equalizer error in the center of the above expression. This expression simplifies further using the key equation to

R_{ee}(D) = \left[\frac{\mathcal{E}_x}{\gamma_0\,SNR_{MFB}}\right] \frac{B(D)\,B^*(D^{-*})}{G(D)\,G^*(D^{-*})} .   (4.303)
Defining the (squared) norm of a sequence as

\|x\|^2 \triangleq \sum_{k=-\infty}^{\infty} |x_k|^2 ,   (4.304)

the mean-square error from (4.303) is

MSE = \left[\frac{\mathcal{E}_x}{\gamma_0\,SNR_{MFB}}\right] \left\|\frac{b}{g}\right\|^2 .   (4.305)
From the Cauchy-Schwarz lemma, we know

\|b\|^2 = \left\|\frac{b}{g} * g\right\|^2 \leq \left\|\frac{b}{g}\right\|^2 \|g\|^2   (4.306)

or

\left\|\frac{b}{g}\right\|^2 \geq \frac{\|b\|^2}{\|g\|^2} .   (4.307)

Thus, the MMSE is

\sigma^2_{MMSE} = \frac{\mathcal{E}_x}{\gamma_0\,SNR_{MFB}} ,   (4.308)

which can be achieved by a multitude of B(D) if the degree of G(D) is less than or equal to \nu,27 only one of which is causal and monic.
The MMSE solution has any B(D) that satisfies

B(D)\,B^*(D^{-*}) = G(D)\,G^*(D^{-*}) .   (4.309)

From (4.297) to (4.301), one finds W(D) from B(D) as

W(D) = \frac{1}{\gamma_0\,\|p\|}\,\frac{B(D)}{G(D)\,G^*(D^{-*})} ,   (4.310)
which is always invertible when \frac{N_0}{2} > 0. The various solutions differ in phase only. The alert reader may note that such a solution is biased: the equalized channel is not exactly B(D) = b_i D^i + \ldots + b_{i+\nu} D^{i+\nu} and in fact will be reduced by the scale factor w_0/b_i \leq 1. Such a bias can be removed by the inverse of this factor without changing performance, and the channel will then be exactly B(D) with an exactly compensating increase in error variance.
Solutions
As mentioned earlier, there are many choices for B(D), each then providing a corresponding TEQ W(D) given by Equation (4.310).
The MMSE-DFE   The Minimum-Mean-Square-Error Decision Feedback Equalizer (MMSE-DFE) chooses causal monic B(D) as

B(D) = G(D) \quad\text{so}\quad W(D) = \frac{1}{\gamma_0\,\|p\|\,G^*(D^{-*})} ,   (4.311)

which is anti-causal.
In ML detection (that is, DMT or VC detection) with the MMSE-DFE choice of TEQ filter W(D), the feedback filter B(D) is never implemented, avoiding any issue of error propagation or need for precoders and their corresponding loss. That ML detector is easily implemented in the case of DMT or VC (but requires exponentially growing complexity, see the Viterbi MLSD method of Chapter 9, if PAM/QAM is used).
The MMSE-AR Equalizer   The MMSE Auto-Regressive Equalizer (MMSE-ARE) corresponds to the anti-causal monic B(D) choice

B(D) = G^*(D^{-*}) \quad\text{so}\quad W(D) = \frac{1}{\gamma_0\,\|p\|\,G(D)} ,   (4.312)

which is causal. This is another possible solution, and the MMSE level and the SNR_{MFB} are the same as for all the
other optimum solutions, including the MMSE-DFE. One would not make the choice in (4.312) if sample-
by-sample decisions are of interest because  neither future decisions are available in the receiver  nor can
a  Tomlinson precoder  be  implemented  (properly)  with  a  noncausal   B(D).   However,   such  causality  is
not an issue with the TEQ and ML detection via either DMT or VC.
27 If the degree is greater than \nu, then the methods of Section 4.8.2 must instead be used directly.
Comparison of MMSE-DFE  and MMSE-ARE
Both the MMSE-DFE and the MMSE-ARE have the same performance (when ML detection is used). Essentially the causality/noncausality conditions of the feedforward and feedback filters have been interchanged. Because B(D) is never used in the receiver (only for our analysis), either structure could be used in a practical situation. A multicarrier modulation method used with W(D) equalization would also perform the same for either choice of W(D) (or any other optimum choice).

Thus, it would appear that since all structures are equivalent, we might as well continue with the use of the MMSE-DFE, since it is a well-known receiver structure. However, when finite-length filters are used, this can be a very poor choice in attempting to maximize performance for a given complexity of implementation. As a trivial example, consider a channel that is maximum phase: the feedforward filter of an MMSE-DFE, in combination with the matched filter, will try to convert the channel to minimum phase at the output of W(D), leading to a long feedforward section when (as in practice) the matched filter and feedforward filter are implemented in combination as a fractionally spaced equalizer. On the other hand, the MMSE-ARE feedforward section (combined with matched filter) is essentially (modulo an anti-alias filter before sampling) one nonzero tap that adjusts gain. The opposite would be true for a minimum-phase channel.
In data transmission, especially on cables or over wireless transmission paths, the channel characteristic is almost guaranteed to be mixed-phase if any reflections (multipath, bridge taps, gauge changes, slight imbalance of terminating impedances) exist. It is then natural (and correct) to infer that the best input-aided equalization problem is one that chooses B(D) and W(D) to both be mixed-phase also. The problem then becomes, for a finite-length feedforward section (W(D)) of M + 1 taps and a finite-length channel model of \nu + 1 taps (B(D)), to find the best B(D) and W(D) for an ML detector designed for B(D), the signal constellation, and additive white Gaussian noise (even if the noise is not quite Gaussian or white). This is the problem addressed in Section 4.8.2, and the solution is often neither MMSE-DFE nor MMSE-ARE filtering.
4.8.2   The finite-length TEQ
Like the DFE of Section 3.6, the finite-length MMSE-TEQ problem is characterized by the error

e_k(\Delta) = b^*\,x_{k-\Delta} - w^*\,y_k ,   (4.313)

where

x_k = \left[x_k\;\; x_{k-1}\;\; \ldots\;\; x_{k-\nu}\right]^T ;   (4.314)

b^* \triangleq [b_0\;\; b_1\;\; \ldots\;\; b_\nu]; w^* \triangleq [w_0\;\; w_1\;\; \ldots\;\; w_{L-1}]; L is the number of equalizer coefficients and may be at another sampling rate as in Section 3.6 for the fractionally spaced case;28 and

y_k = \left[y_k\;\; y_{k-1}\;\; \ldots\;\; y_{k-L+1}\right]^T .   (4.315)
The length \nu is typically less than the length of the channel pulse response and equal to the value anticipated for use in a multichannel modulation system like DMT/OFDM or VC.

The input/output relationship of the channel is

y_k = P\,x_k + n_k .   (4.316)
28 The equalizer output should be decimated to some integer multiple of the channel-input sampling rate. Usually that integer multiple is just unity in DMT and vector-coding designs because the sampling rate is already greater than twice the highest frequency likely to be used. However, as Pal has noted, with very short equalizers an oversampling factor greater than 1 can sometimes improve performance for fixed complexity, depending on channel details.
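The model y_k = P x_k + n_k uses an L x (L + length(p) - 1) convolution matrix. A minimal pure-Python sketch of its construction (mirroring the Matlab toeplitz call in Example 4.8.2 later in this section):

```python
def conv_matrix(p, L):
    # row i carries the channel taps shifted right by i, so the i-th output
    # y_{k-i} = sum_j p_j * x_{k-i-j} is the inner product of row i with the
    # input vector [x_k, x_{k-1}, ..., x_{k-L-len(p)+2}]
    cols = L + len(p) - 1
    P = [[0.0] * cols for _ in range(L)]
    for i in range(L):
        for j, pj in enumerate(p):
            P[i][i + j] = pj
    return P

P = conv_matrix([1.0, 0.9, 0.81], L=3)   # a 3 x 5 matrix
```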
P need not be a convolution matrix and can reflect whatever guard bands or other constraints have been placed upon the transmit signal. However, in most developments this would render the channel output cyclostationary, so we assume P is a convolution matrix of the exact form discussed for DFEs in Chapter 3.

The MSE is minimized with a constraint \|b\|^2 = c, a constant, with c usually chosen as c = \|p\|^2 to force the flat-input SNR_{MFB} of the channel to be the flat-input SNR_{MFB} of the TEQ problem. However, this non-zero constant produces the same TEQ shape and does not change performance; it simply avoids the trivial all-zeros solution for w. Minimization first over w forces

E\left[e(\Delta)\,y_k^*\right] = 0 .   (4.317)
The solution forces

w^* = b^*\,R_{xy}(\Delta)\,R_{yy}^{-1}   (4.318)

for any b. Then, e(\Delta) becomes

e(\Delta) = b^*\left(x_{k-\Delta} - R_{xy}(\Delta)\,R_{yy}^{-1}\,y_k\right) .   (4.319)

(4.319) has a vector of the minimized MSE finite-length LE errors within the parentheses. This MMSE-LE error vector has an autocorrelation matrix that can be computed as

R_{LE}(\Delta) = \mathcal{E}_x\,I - R_{xy}(\Delta)\,R_{yy}^{-1}\,R_{yx}(\Delta) .   (4.320)
Tacit in (4.320) is the assumption that R_{yy} is a fixed L x L matrix, which is equivalent to assuming that the input to the channel is stationary, as is the noise. In the case of transmission methods like DMT and VC, this stationarity does not hold and the input is instead cyclostationary over a symbol period. Furthermore, the initial solution of the TEQ is often computed before transmit optimization has occurred, so R_{xx} is often assumed to be white or a diagonal matrix, as will also be the case in this development. With these restrictions/assumptions, the overall MMSE is then the minimum of

MMSE = \min_{b,\,\Delta}\; b^*\,R_{LE}(\Delta)\,b   (4.321)
with the constraint that \|b\|^2 = c. A biased matched-filter-bound SNR for the TEQ-equalized channel is then

SNR_{MFB}(\text{bias}) = \frac{\mathcal{E}_x\,\|b\|^2}{\lambda_{min}\,\|b\|^2} = 1 + SNR_{MFB}(\text{unbias}) ;   (4.322)

as with all MMSE equalizers, there is a bias that is removed by scaling the equalizer output by

\frac{SNR_{MFB}(\text{bias})}{SNR_{MFB}(\text{unbias})} .   (4.323)
However, such bias removal is not necessary in DMT systems because the ensuing FEQ will remove it automatically in any case. The solution of this problem is easily shown to be the eigenvector of R_{LE}(\Delta) that corresponds to the minimum eigenvalue. Thus

b = \sqrt{c}\;q_{min}(\Delta) .   (4.324)

Thus, the TEQ setting is determined by computing R_{LE}(\Delta) for all reasonable values of \Delta and finding the MMSE(\Delta) with the smallest value. The eigenvector corresponding to this \Delta is the correct setting for b (with scaling by \sqrt{c}), and w, the TEQ, is determined through Equation (4.318).
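The whole procedure of (4.318), (4.320), (4.324), and the white-input (4.327) can be sketched compactly with numpy; the function name is our own, and the channel and sizes below match the setting of Example 4.8.2 later in this section, so the result can be compared against that example:

```python
import numpy as np

def mmse_teq(p, L, nu, delta, sigma2, Ex=1.0):
    """Finite-length MMSE-TEQ: returns (b, w, MMSE) per (4.318)-(4.324)."""
    n = len(p)
    # L x (L+n-1) convolution matrix so that the y_k-vector = P @ x-vector + noise
    P = np.array([[p[j - i] if 0 <= j - i < n else 0.0
                   for j in range(L + n - 1)] for i in range(L)])
    Ryy = Ex * P @ P.T + sigma2 * np.eye(L)                 # (4.327)
    S = np.zeros((nu + 1, L + n - 1))                       # selects x_{k-delta}..x_{k-delta-nu}
    S[np.arange(nu + 1), delta + np.arange(nu + 1)] = 1.0
    Rxy = Ex * S @ P.T                                      # (4.328)
    Rle = Ex * np.eye(nu + 1) - Rxy @ np.linalg.solve(Ryy, Rxy.T)   # (4.320)
    lam, V = np.linalg.eigh(Rle)                            # ascending eigenvalues
    c = float(np.dot(p, p))                                 # constraint |b|^2 = |p|^2
    b = np.sqrt(c) * V[:, 0]                                # (4.324), min eigenvalue
    w = b @ Rxy @ np.linalg.inv(Ryy)                        # (4.318)
    return b, w, c * lam[0]

b, w, mmse = mmse_teq([-.729, .81, -.9, 2, .9, .81, .729],
                      L=11, nu=3, delta=10, sigma2=.1)
```

In a full design this function would be called for all reasonable \Delta and the smallest returned MMSE kept.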
An alternative notes that Equation (4.320) can also be written

R_{LE}(\Delta) = \mathcal{E}_x\,I - \mathcal{E}_x^2\,[0 \cdots 0\;1\;0 \cdots 0]\;P^*\,R_{yy}^{-1}\,P\;[0 \cdots 0\;1\;0 \cdots 0]^* ,   (4.325)

where the position of the 1 determines \Delta. Then, the best \Delta corresponds to the maximum eigenvalue of the matrix P^*\,R_{yy}^{-1}\,P.
\nu                                              0      1      2      3      4     ...    \infty
\mathcal{E}_{lost}                             4.26   3.45   2.80   2.26   1.83    ...      0
dB loss = 10 log_{10}((5.26 - \mathcal{E}_{lost})/5.26)   -7.2   -4.6   -3.3   -2.4   -1.9    ...      0

Table 4.6: Table of energy loss by truncation (rectangular windowing) of p_k = .9^k u_k.
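The table entries follow from the truncation formula for p_k = .9^k u_k (Equation (4.331) of the example below); a short sketch reproducing them:

```python
import math

NORM_P2 = 1 / (1 - 0.81)          # |p|^2 = 5.2632 for p_k = .9^k u_k

def lost_energy(nu):
    # energy outside a guard period of nu samples: |p|^2 - sum_{k=0}^{nu} .81^k
    return NORM_P2 - (1 - 0.81 ** (nu + 1)) / (1 - 0.81)

def db_loss(nu):
    return 10 * math.log10((NORM_P2 - lost_energy(nu)) / NORM_P2)

row = [lost_energy(nu) for nu in range(5)]
```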
The L x L matrix R_{yy} need only be inverted once, but the eigenvectors must be computed several times, corresponding to all reasonable values of \Delta. This computation can be intensive. For FIR channels, the matrix R_{yy} is given by

R_{yy} = P\,R_{xx}\,P^* + R_{nn}   (general non-white input case)   (4.326)
       = \mathcal{E}_x\,P\,P^* + R_{nn}   (usual white-input case).   (4.327)

Any stationary R_{xx} can be assumed. If a nonstationary R_{xx} (perhaps an R_{xx} representing some known input optimization and cyclostationarity) is used, then R_{yy} becomes cyclostationary also and a function of the position within a symbol. The cross-correlation matrix is

R_{xy}(\Delta) = E\left[x_{k-\Delta}\,y_k^*\right] = E\left[x_{k-\Delta}\,x_k^*\right]\,P^* .   (4.328)
R
xy
(D)() = E[x
k
y
k
] = E[x
k
x
k
] P
  .   (4.328)
The cross-correlation matrix will have zeros in many positions determined by  unless again an optimized
input is being used and then again this matrix would also become a function of both  and the position
within a symbol.   Because  it is hard to know which position within a symbol to optimize or to know in
advance the optimized spectrum  used on the channel (which usually is aected by the TEQ), the TEQ
is set rst prior to channel identication, partitioning, and loading.  It could be periodically updated.
EXAMPLE 4.8.1 (IIR Channel 1/(1 - .9D)) A discrete-time channel might have an IIR response so that no finite \nu is ever completely sufficient. An example is the single-pole IIR digital channel

P(D) = \frac{1}{1 - .9D} ,   (4.329)

with time-domain response easily found to be

p_k = .9^k \quad \forall\; k \geq 0 .   (4.330)

The gain of the channel is \|p\|^2 = \frac{1}{1 - .9^2} = 5.2632. Let the AWGN variance per real dimension be \sigma^2 = .1 and the input energy be \mathcal{E}_x = 1. If PAM were used on this channel, then SNR_{MFB} = 17.2 dB. It is interesting to investigate the amount of energy outside a guard period of \nu samples as \nu varies. The total energy outside of a guard period of length \nu is

\|p\|^2 - \frac{1 - (.81)^{\nu+1}}{1 - .81} .   (4.331)

This lost energy, which becomes distortion for the DMT or VC system, is in Table 4.6.

To have only .1 dB loss, then \nu > 16, meaning N > 500 for small loss in energy and bandwidth. Alternatively, one can use a TEQ, which will reshape the channel so that most of the energy is in the first \nu coefficients. Because of the special nature of a single-pole IIR channel, it is pretty clear that a 3-tap equalizer with characteristic close to 1 - .9D will reduce \nu substantially while altering the noise. Thus, it is fairly clear that an equalizer with L = 3 taps and delay \Delta = 0 should work fairly well. We select \nu = 1 also for this example.
For these values, some algebra leads to

R_{xy}(0) = \begin{bmatrix} 1 & 0 & 0 \\ .9 & 1 & 0 \end{bmatrix}   (4.332)
and

R_{yy} = \begin{bmatrix} 5.3632 & .9(5.2632) & .81(5.2632) \\ .9(5.2632) & 5.3632 & .9(5.2632) \\ .81(5.2632) & .9(5.2632) & 5.3632 \end{bmatrix} .   (4.333)
The error autocorrelation matrix is

R_{ee} = \begin{bmatrix} .1483 & -.0650 \\ -.0650 & .1472 \end{bmatrix} ,   (4.334)

with eigenvalues .2128 and .0828 and corresponding eigenvectors [.7101\;\; -.7041] and [.7041\;\; .7101], respectively. Then R_{ee} has smallest eigenvalue .0828 with corresponding MMSE = .4358, much less than the \sigma^2 + 3.45 = .1 + 3.45 = 3.55 (3.45 is the entry for energy loss in Table 4.6 above when \nu = 1) for no equalization. Obviously the equalizer reduces distortion significantly with respect to no use of an equalizer. The corresponding setting for b from Equation (4.324) is

B(D) = \sqrt{5.2632}\,(.7041 + .7101D) = 1.6151\,(1 + 1.0084D) .   (4.335)
B(D) is not minimum phase in this example, illustrating that the TEQ is not simply a DFE solution. The corresponding value for w from Equation (4.318) leads to W(D) = 1.4803\,(1 + .1084D - .8907D^2) and a new equalized channel

z(D) = 1.4803\,\left(1 + .1084D - .8907D^2\right)\,\frac{X(D)}{1 - .9D} + W(D)\,N(D)   (4.336)
     = 1.4803\,[1 + (.1084 + .9)D]\,X(D) + 1.4803\,(.81 + .9 \times .1084 - .8907)\,\frac{D^2\,X(D)}{1 - .9D} + W(D)\,N(D)
     = \underbrace{1.4803\,[1 + 1.0084D]}_{\frac{w_0}{b_0}\,B(D)}\,X(D) + \underbrace{1.4803\,(.01686)\,\frac{D^2\,X(D)}{1 - .9D} + W(D)\,N(D)}_{\frac{w_0}{b_0}\,E(D)} .
The bias in estimating B(D) is w_0/b_0 and can be removed by multiplying the entire received signal by the constant b_0/w_0 without altering performance; this step is usually not necessary in DMT systems because the multichannel normalizer automatically includes any such adjustment. The PSD of the distortion can be determined as
R_{EE}(D)\cdot\left(\frac{w_0}{b_0}\right)^2 = 1.4803^2\,(.01686)^2\,\frac{1}{1 - .9^2} + .1\,W(D)\,W^*(D^{-*})   (4.337)
= 1.4803^2\left[(.001496) + .1\left(1 + .1084^2 + .8907^2\right)\right]   (4.338)

r_{EE}(0)\cdot\left(\frac{w_0}{b_0}\right)^2 = .3988 = .4538\cdot\left(\frac{w_0}{b_0}\right)^2 .   (4.339)
The ratio of signal energy in the first two terms to distortion is then 1.4803^2\,(1 + 1.0084^2)/.3988, or 10.4 dB. This signal-to-noise ratio cannot be significantly improved with longer equalizers, which can be verified by trial of larger lengths, nor is \nu > 1 necessary, as one can determine by comparing with an infinite-length MT result for this channel.

This new channel plus distortion can be compared against the old for any value of N in VC or DMT, with an obvious clear improvement, as in the homework. The equalized channel is shown in Figure 4.34, where it is clear the first two positions contain virtually all the energy. Figure 4.35 shows the power spectrum of the error, which in this case is almost flat (see the vertical scale), so one could apply loading directly to the subchannels as if the noise were white and of variance .3988 without significant error. Without this TEQ, the loss in performance would be nearly 10 times higher.
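The numbers of this example can be checked numerically; the sketch below rebuilds R_{yy} and R_{xy}(0) from the analytic correlations of the 1/(1 - .9D) channel and recovers the eigenvalue, MMSE, and equalizer reported above:

```python
import numpy as np

norm_p2 = 1 / (1 - 0.81)                      # |p|^2 = 5.2632
# R_yy: E[y_{k-i} y_{k-j}] = .9^{|i-j|} |p|^2, plus the noise on the diagonal
Ryy = np.array([[0.9 ** abs(i - j) * norm_p2 for j in range(3)]
                for i in range(3)]) + 0.1 * np.eye(3)
Rxy = np.array([[1.0, 0.0, 0.0],              # (4.332), delta = 0, nu = 1
                [0.9, 1.0, 0.0]])
Rle = np.eye(2) - Rxy @ np.linalg.solve(Ryy, Rxy.T)       # (4.320)
lam, V = np.linalg.eigh(Rle)                  # smallest eigenvalue first
mmse = norm_p2 * lam[0]                       # about .4358
b = np.sqrt(norm_p2) * V[:, 0]
w = b @ Rxy @ np.linalg.inv(Ryy)              # about 1.4803*(1, .1084, -.8907)
```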
Figure 4.34: Equalized response for 3-tap TEQ with 1/(1 - .9D) channel.
Figure 4.35: Distortion spectrum for 3-tap TEQ on 1/(1 - .9D) channel with white noise at \sigma^2 = .1.
The TEQ performance increase over no equalizer can indeed be much greater than 4.2 dB on channels with more severe responses. (See for instance Salvekar's "Marathon" in Problem 4.25.) The TEQ is not a DFE and does not maintain white noise nor the channel magnitude in the finite-length case.
The issue of bias can return in MMSE design and is somewhat more involved than in Chapter 3. It was removed as a part of Example 4.8.1. To calculate the correct unbiased distortion energy in general, the error can be written in terms of the bias factor \alpha and the unbiased error E_U(D) as

E(D) = B(D)\,D^{\Delta}\,X(D) - \alpha\left[B(D)\,D^{\Delta}\,X(D) + E_U(D)\right]   (4.342)
     = (1 - \alpha)\,X(D)\,B(D)\,D^{\Delta} - \alpha\,E_U(D) .   (4.343)
Bias removal then would occur by multiplying the equalizer output by 1/\alpha, which would more than likely be absorbed into the equalizer as in the example. The bias factor itself is obtained by first forming the convolution C(D) = W(D) \cdot P(D), which is the noise-free equalized output. The bias factor differs slightly, when B(D) \neq 1, from the simple SNR/(SNR + 1) of Chapter 3. The bias will then be the ratio

\alpha = c_{\Delta}/b_0 .   (4.344)

To compute r_{u,E}, (4.343) leads to

MMSE = (1 - \alpha)^2\,\mathcal{E}_x\,\|b\|^2 + \alpha^2\,r_{u,E}   (4.345)
     = (1 - \alpha)^2\,\mathcal{E}_x\,\|p\|^2 + \alpha^2\,r_{u,E}   (4.346)
     = \lambda_{min}\,\|p\|^2 .   (4.347)
Thus,

r_{u,E} = \frac{\|p\|^2\left[\lambda_{min} - (1 - \alpha)^2\,\mathcal{E}_x\right]}{\alpha^2}   (4.348)

provides the value for r_{u,E}. The unbiased SNR_{MFB} for the equalized channel, with \alpha included to remove bias, is then

SNR_{MFB} = \frac{\mathcal{E}_x\,\|p\|^2}{r_{u,E}} .   (4.349)
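The rearrangement from (4.345)-(4.347) to (4.348) can be checked with numbers; the inputs below are hypothetical, chosen only to exercise the identity:

```python
def unbiased_snr_mfb(lam_min, alpha, Ex, norm_p2):
    # r_uE from (4.348), then the unbiased SNR_MFB from (4.349)
    r_uE = norm_p2 * (lam_min - (1 - alpha) ** 2 * Ex) / alpha ** 2
    return Ex * norm_p2 / r_uE, r_uE

# hypothetical values: lam_min = .05, alpha = .9, Ex = 1, |p|^2 = 5.0
snr, r_uE = unbiased_snr_mfb(0.05, 0.9, 1.0, 5.0)
```

Substituting r_uE back into (4.346) recovers lam_min * |p|^2, confirming that (4.348) is just the solved form of (4.346)-(4.347).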
The  following second  example  shows  a  near  equivalence  of  a  DFE  and  an  equalized  DMT  system
when both use exactly the same bandwidth, a concept explored and explained further in Chapter 5.
EXAMPLE 4.8.2 (Full-band TEQ) A 7-tap FIR channel is shortened to \nu = 3 (4 taps) with an 11-tap equalizer in this example, and then DMT is applied. The channel is such that all 128 tones of the DMT system are used. This is compared against an MMSE-DFE with 20 total taps (14 feedforward, 6 feedback). The Matlab commands are inserted here so the reader can follow the design directly:
>>  [snr,w]=dfecolor(1,p,14,6,13,1,[.1  zeros(1,13)])
snr  =   16.8387
w  =   -0.0046   0.0056   0.0153   -0.0117   -0.0232   0.0431   0.0654   -0.0500   -0.0447
0.2263   0.2748   -0.0648   0.1642   -0.1479
feedback  is  -0.0008   -0.7908   0.0001   0.0473   0.0001   0.1078
------------------------------------------------------------------------------------------------
>>  p=[-.729  .81  -.9   2  .9   .81  .729  ];
>>  np=norm(p)^2
np  =   7.9951
>>  P=toeplitz([p(1)  ,  zeros(1,10)],[p,  zeros(1,10)]);
some  entries  in  P   are:
P  =   Columns  1   through  8
-0.7290   0.8100   -0.9000   2.0000   0.9000   0.8100   0.7290   0
0   -0.7290   0.8100   -0.9000   2.0000   0.9000   0.8100   0.7290
0   0   -0.7290   0.8100   -0.9000   2.0000   0.9000   0.8100
0   0   0   -0.7290   0.8100   -0.9000   2.0000   0.9000
0   0   0   0   -0.7290   0.8100   -0.9000   2.0000
0   0   0   0   0   -0.7290   0.8100   -0.9000
>>  ryy=P*P'+.1*eye(11);
The delay  after some experimentation was found to be  = 10 for an 11-tap TEQ on this
particular channel.   This example continues then in Matlab according to:
>>  rxy=[  zeros(4,10)  eye(4)  zeros(4,3)]*P'
rxy  =   Columns  1   through  8
0   0   0   0   0.7290   0.8100   0.9000   2.0000
0   0   0   0   0   0.7290   0.8100   0.9000
0   0   0   0   0   0   0.7290   0.8100
0   0   0   0   0   0   0   0.7290
Columns  9  through  11
-0.9000   0.8100   -0.7290
2.0000   -0.9000   0.8100
0.9000   2.0000   -0.9000
0.8100   0.9000   2.0000
>>  rle=eye(4)-rxy*inv(ryy)*rxy'
rle=0.0881   0.0068   -0.0786   -0.0694
0.0068   0.1000   -0.0045   -0.1353
-0.0786   -0.0045   0.1101   0.0464
-0.0694   -0.1353   0.0464   0.3666
>>  [v,d]=eig(rle)
v  =-0.2186   -0.5939   -0.7658   -0.1144
-0.3545   0.2811   -0.2449   0.8575
0.1787   0.7319   -0.5695   -0.3287
0.8914   -0.1806   -0.1710   0.3789
d  =   0.4467   0   0   0
0   0.1607   0   0
0   0   0.0164   0
0   0   0   0.0410
>>  b=sqrt(np)*v(:,3)
b  =   -2.1653   -0.6925   -1.6103   -0.4834
>>  w=b'*rxy*inv(ryy)
w  =   0.0101  0.0356  -0.0771  -0.1636  0.0718  0.1477  -0.4777  -0.7924  -0.0078  -0.2237  0.1549
The \alpha is .89836 and the SNR_{MFB} = 17.7868 dB, corresponding to a biased error energy of .1288 and an unbiased error energy of .1331. A TEQ program written by 2005 student Ryoulhee Kwak executes this procedure and appears at the web page. By running the program a few times, one can find that by using \nu = 3 and 14 taps in the equalizer with delay \Delta = 9, the SNR_{MFB} increases to 18.9 dB. By then running water-filling, one finds an SNR in the range of 15 to 16 dB for various TEQs of different lengths on this channel.
Instead, if no TEQ is used with FFT size 1024, then all tones are used and
>>  [gn,en_bar,bn_bar,Nstar,b_bar]=waterfill(b,19.2,1,1024,0);
>>  Nstar  =   1024
>>  b_bar  =   2.8004
>>  10*log10(2^(2*b_bar)  -1)
ans  =   16.8
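The conversion from b̄ to SNR in the last line can be checked outside Matlab as well (a small sketch in Python; a 0 dB gap, matching the call above, is assumed):

```python
import math

def snr_db_from_bbar(b_bar):
    # SNR = 2^(2*b_bar) - 1 at a 0 dB gap, expressed in dB
    return 10 * math.log10(2 ** (2 * b_bar) - 1)

print(round(snr_db_from_bbar(2.8004), 1))  # -> 16.8
```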
The performance is the same as the DFE system earlier.  Thus, this second example illustrates
that the TEQ is not wisely used and perhaps a larger symbol size is better.   Note the no-TEQ
DMT receiver has total complexity log2(1024) = 10 per sample, while the DFE has
complexity 20.   One might note that the transmitter of DMT also requires 10 per sample.
Both get the same performance for about the same total complexity (transmitter and receiver)
in the case that the entire set of tones is used (a curious result developed more in Chapter 5).
4.8.3   Zero-forcing  Approximations  of  the  TEQ
Several designers take an alternative approach to design of the TEQ that they claim is superior to Chow's
method.  This is incorrect.  The method does not consider MMSE and instead looks at the equalized
response W(D) · P(D) within a window of ν + 1 samples and outside that window in the other N − ν − 1
samples, which are called the "wall."  The heuristic explanation is that energy in the equalized response
in the wall should be at a minimum relative to the energy within the window.  First, this window/wall
approach tacitly assumes that the input power spectral density is white, S_x(D) = Ē_x (or else this
criterion makes no sense).  Furthermore, noise is clearly ignored in the approach.  For finite-length TEQ
(or any equalizer) design, zero-forcing solutions cannot exactly force a specific b.  The MMSE approach
to equalizer design becomes zero forcing for all lengths when the SNR parameter is set to an infinite
value.  In this case, the mean-square energy of the "wall" is then necessarily minimized relative to the
total energy of Ē_x · ‖p‖² in the MMSE-TEQ when the input is white.  As Ishai astutely noted, the
problem of maximizing window to "wall" energy is the same as maximizing total equalized pulse energy
to "wall" energy.  Thus, all these approaches (which eventually result in a generalized eigenvalue problem
that must be solved) are the same as the MMSE-TEQ when the designer assumes a white input spectrum
S_x(D) = Ē_x and sets σ² = 0.
In fact, one notes that Chow's method is at least as computationally efficient as even the best of the
other methods (introduced by Ilani) in that the matrix for which an eigenvector is determined is only
size (ν + 1) × (ν + 1), and indeed Chow's method need only invert the matrix R_yy once.  The eigenvector
must be determined for all reasonable values of Δ in all methods.
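The window/wall formulation can be sketched numerically.  A minimal illustration (assuming a white input and ignoring noise, as the text notes these methods do; the channel, lengths, and delay below are arbitrary examples) reduces to the generalized eigenvalue problem mentioned above:

```python
import numpy as np
from scipy.linalg import eigh

def zf_teq_window_wall(p, L, nu, delta):
    """Maximize window-to-wall energy of the equalized response (white input, no noise)."""
    n_out = L + len(p) - 1
    H = np.zeros((n_out, L))
    for i in range(L):
        H[i:i + len(p), i] = p          # H @ w = conv(w, p), the equalized response
    win = np.zeros(n_out, dtype=bool)
    win[delta:delta + nu + 1] = True    # the (nu + 1)-sample window
    B = H[win].T @ H[win]               # window-energy quadratic form
    A = H[~win].T @ H[~win]             # wall-energy quadratic form
    # Generalized eigenproblem B w = lambda A w; keep the largest-lambda eigenvector
    vals, vecs = eigh(B, A + 1e-9 * np.eye(L))
    return vecs[:, -1]

w = zf_teq_window_wall(np.array([1.0, 0.9, 0.81, 0.5]), L=8, nu=1, delta=2)
```

The small diagonal loading of the wall matrix only guards against singularity; the returned w maximizes the window-to-wall energy ratio over all L-tap equalizers.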
EXAMPLE 4.8.3 (Return to 1/(1 + .9D) channel example of Section 4.8.2)  A de-
signer might decide to use one of the various zero-forcing TEQ designs instead on the
1/(1 + .9D) channel.  Clearly if noise is ignored, then a best equalizer is w(D) = √5.2636 · (1 + .9D), which
flattens the channel so that even ν = 0 is sufficient.  In fact all zero-forcing equalizers are given
by the class W(D) = (a + bD) · (1 + .9D) where a² + b² = ‖p‖².  The first zero-forcing equalizer
has smallest enhanced noise given by .9527 relative to ‖p‖² = 5.2636 for a ratio of 7.4 dB,
about 3 dB worse than including the effect of the white noise.  A better choice of ZF-TEQ
(since there are many in this special case) would be W(D) = (1.65 − 1.59 · D) · (1 + .9D),²⁹
²⁹The author found this by minimizing output noise power over choices for a and b, which results in a quadratic equation
for b with solutions b = −1.59 or .626 (a = 2.21).
which instead has a ratio of 10 dB, still less than 10.4 dB. In fact, no higher ratio than 10
dB can be found for the ZF case.  Thus, all ZF-TEQs would have worse SNR. However,
the astute reader might observe that loading should be executed with the correct noise power
spectral density (which is not close to white in this example and indeed is very high-pass),
so the set of g_n might be better.  This observation, that it is really SNR_dmt that is important,
then leads to the next subsection.
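The 7.4 dB figure for the first ZF-TEQ above can be reproduced directly (a sketch; σ² = .1 and Ē_x = 1 as in the running example):

```python
import math

norm_p_sq = 5.2636                    # ||p||^2 for the 1/(1 + .9D) channel
sigma2, Ex = 0.1, 1.0                 # noise power and input energy of the running example

# First ZF-TEQ: w(D) = sqrt(||p||^2) * (1 + .9D), so ||w||^2 = ||p||^2 * (1 + .81)
w_sq = norm_p_sq * (1 + 0.81)
enhanced_noise = sigma2 * w_sq        # -> .9527, as quoted in the example
ratio_db = 10 * math.log10(norm_p_sq * Ex / enhanced_noise)
print(round(enhanced_noise, 4), round(ratio_db, 1))   # -> 0.9527 7.4
```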
4.8.4   Kwak  TEQ  Program
Former student Ryoulhee Kwak has provided the TEQ program at the web site that works for both FIR
and (all pole) FIR channel inputs.   The program is
function  [P,ryy,rxy,rle,v,d,w,b,mmse,SNRmfb_in_dB,c]=teq(p,L,mu,delta,sigma,Ex_bar,filtertype)
%   programmed  by   Ryoulhee  Kwak
%
%   filtertype  1=FIR  ,   2=  IIR(just  for  one   pole  filter)
%   p=channel  impulse  response
%   in  case  of  FIR,  p=   1+0.5D+2.3D^2->p=[1  0.5  2.3]
%   in  case  of  IIR,   p=1/(2.4-0.5D)->  p=[2.4  -0.5]
%   sigma  =>noise  power  in   linear  scale
%   delta  =>delay
%   L  =>   number  of  taps
%   c=>  w*p
%   v=>  eigenvectors
%   d=>  eigenvalues
%   mu  =>   for  b
if  filtertype==1%  FIR
norm_p=norm(p);
[temp  size_p]=size(p);
P=toeplitz([p(1),  zeros(1,L-1)],[p,zeros(1,L-1)]);
ryy=Ex_bar*P*P'+sigma*eye(L,L);
rxy=[zeros(mu+1,delta)  eye(mu+1)  zeros(mu+1,L+size_p-2-mu-delta)]*P'*Ex_bar;
rle=eye(mu+1)*Ex_bar-rxy*inv(ryy)*rxy';
[v   d]=eig(rle);
d_temp=diag(d);
[m,n  ]=   min(d_temp);
b=norm_p*v(:,n);
w=b'*rxy*inv(ryy);
conv(w,p);
mmse=b'*rle*b;
%by  using  easy  way  to  get   error  energy
c=conv(w,p)
alpa=c(delta+1)./b(1)
biased_error_energy=norm_p^2*(m-Ex_bar*(1-alpa)^2)
unbiased_error_energy=norm_p^2*(m-Ex_bar*(1-alpa)^2)/alpa^2
SNRmfb=norm_p^2*Ex_bar/unbiased_error_energy;
SNRmfb_in_dB=10*log10(SNRmfb)
else  %IIR
%in  case  of  IIR  filter  the  P  matrix  is  infinite  dimension  but  there  is  a  trick  to  get  exact  rxy,  ryy
norm_p=sqrt(1/(p(1)^2*(1-(p(2)/p(1))^2  )));
[temp  size_p]=size(p);
ptemp=[(-p(2)/p(1)).^(0:1:L-1)]/p(1);  %-1  factor!
P=toeplitz([ptemp(1)  zeros(1,L-1)],ptemp);
Ptemp=toeplitz(ptemp,ptemp);
ryy=Ex_bar*Ptemp*norm_p^2+sigma*eye(L,L);
rxy=[zeros(mu+1,delta)  eye(mu+1)  zeros(mu+1,L-1-mu-delta)]*P*Ex_bar;
rle=eye(mu+1)*Ex_bar-rxy*inv(ryy)*rxy';
[v   d]=eig(rle);
d_temp=diag(d);
[m,n  ]=   min(d_temp);
b=norm_p*v(:,n);
w=b'*rxy*inv(ryy);
c=conv(w,p);
sum(conv(w,p).^2)-norm(b)^2*(w(1)/b(1))^2;
mmse=b'*rle*b;
%by  using  easy  way  to  get  error  energy
alpa=c(delta+1)./b(1)
biased_error_energy=norm_p^2*(m-Ex_bar*(1-alpa)^2)
unbiased_error_energy=norm_p^2*(m-Ex_bar*(1-alpa)^2)/alpa^2
SNRmfb=norm_p^2*Ex_bar/unbiased_error_energy;
SNRmfb_in_dB=10*log10(SNRmfb);
end
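For readers working outside Matlab, the FIR branch of the computation above can be sketched in Python/numpy (a hypothetical port, not the original program; it is checked here only on a trivial one-tap channel, where the MMSE is Ē_x·σ²/(Ē_x + σ²)):

```python
import numpy as np

def mmse_teq(p, L, mu, delta, sigma2, Ex):
    """MMSE-TEQ for an FIR channel: returns target b, equalizer w, and the MMSE."""
    p = np.asarray(p, float)
    n_in = L + len(p) - 1
    # Convolution matrix P (L x n_in): the L equalizer-span outputs are P @ x
    P = np.zeros((L, n_in))
    for i in range(L):
        P[i, i:i + len(p)] = p
    Ryy = Ex * P @ P.T + sigma2 * np.eye(L)
    S = np.zeros((mu + 1, n_in))
    S[:, delta:delta + mu + 1] = np.eye(mu + 1)      # window selector at delay delta
    Rxy = S @ P.T * Ex
    Rle = Ex * np.eye(mu + 1) - Rxy @ np.linalg.inv(Ryy) @ Rxy.T
    vals, vecs = np.linalg.eigh(Rle)
    b = np.linalg.norm(p) * vecs[:, 0]               # eigenvector of the minimum eigenvalue
    w = b @ Rxy @ np.linalg.inv(Ryy)
    return b, w, float(b @ Rle @ b)
```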
4.8.5   Joint  Loading-TEQ  Design
Usually TEQs are designed (or trained) before loading in DMT or VC systems, and an assumption of
an initially flat input spectrum S_x(D) = Ē_x is then made.  For OFDM, this assumption is always true, but
when loading is used in DMT, then it will likely not be true once loading is later computed.  In fact, an
optimum design would maximize SNR_dmt over the TEQ settings as well as the Ē_n for each subchannel.
Al-Dhahir was the first to approach this problem and attempt solution.  A yet better solution recently
emanated from Evans (see the December 2001 IEEE Transactions on DSP). While Evans beautifully sets up
the problem, he notes at the end that it requires very difficult and advanced optimization procedures
to be exactly solved, and he further notes that such solution is not feasible in actual implementation.
Evans then reverts to the MMSE solution and shows it is often close to the MAX-SNR_dmt solution.
This subsection poses the following solution:
1.   Initially design the MMSE-TEQ for a flat or white input.
2.   Execute loading (water-filling or LC) and compute SNR.
3.   Redesign the MMSE-TEQ using the new R_xx.
4.   Re-load the channel and see if SNR_dmt has significantly improved.
(a)   If improved, return to step 3 with the latest R_xx from loading.
(b)   If not, use the TEQ with best SNR_dmt so far determined.
Clearly with dynamic loading, the movement of energy could cause the TEQ to be somewhat deficient.
This would be particularly true if narrow-band noise occurred after initial training.  The procedure
can then be repeated.  The TEQ change would need to occur on the same symbol that synchronizes the
change of energy and bit tables in the corresponding modems.
For those interested in extra credit, this method might be executed for the 1/(1 + .9D) example
above and compared with the MMSE approach as well as the method given by Evans.
Figure 4.36:  Kasturia's Block DFE.
4.8.6   Block  Decision  Feedback
The overhead of the extra ν dimensions of vector coding and DMT/OFDM is negligible for large N.
When large N leads to unacceptable delay in the transmission path, smaller N may have to be used.  In
this case, the extra ν samples between symbols (blocks of samples) may be eliminated by what is known
as Kasturia's Block Decision Feedback Equalizer.
Block Decision Feedback uses the decisions from the last block of samples to reconstruct and cancel
the interference from previous symbols.  The pulse response is known in the receiver so that the effect of
the last ν columns of P can be subtracted as in Figure 4.36.  Assuming previous decisions are correct,
the resultant vector channel description is

y = P̃ x + n ,   (4.350)

where P̃ is the N × N matrix containing the left-most N columns of P.  P̃ must be converted to
a minimum-phase (MMSE sense) channel by a conventional MMSE-DFE feedforward filter for best
performance.  Vector coding is applied to this matrix channel with SVD performed on the matrix P̃.
While simple in concept, the subtraction of previous symbols in practice can be complex to implement.
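The cancellation step can be sketched as follows (numpy; an illustrative reconstruction under the stated assumption of correct previous decisions, not Kasturia's full receiver, and with an assumed column ordering for P):

```python
import numpy as np

def block_dfe_cancel(y, P, x_prev_tail):
    """Cancel ISI from the previous symbol, assuming its decisions were correct.

    P is the N x (N + nu) channel matrix, written here (an assumed ordering)
    with the left-most N columns (P_tilde) acting on the current symbol x and
    the right-most nu columns acting on the tail of the previous symbol.
    """
    N = P.shape[0]
    P_tilde, P_tail = P[:, :N], P[:, N:]
    y_clean = y - P_tail @ x_prev_tail      # y_clean = P_tilde @ x + n, eq. (4.350)
    return y_clean, P_tilde
```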
4.8.7   Cheong's Cyclic Reconstruction and Per-tone Equalizers
K. W. Cheong in his 2000 dissertation and earlier results used the concept of cyclic reconstruction (first
introduced for echo cancellers by Ho and Cioffi) to simplify the concept of a full N × N block equalizer
that would be equivalent to a separate TEQ for each tone.³⁰  Van Acker and Moonen (and van de
Wiel and Pollet) later renamed this matrix equalization a "per-tone equalizer" and used the equivalent
of cyclic reconstruction to simplify the amount of computation (apparently unaware of Cheong's earlier
work and the equivalence to cyclic reconstruction).  The method requires a slightly higher complexity
than a TEQ in real-time, but does not require eigenvector computation.  This method actually then
leads to an easy adaptive implementation that maximizes product SNR.
The key observation in cyclic reconstruction is that the cyclic prefix could be "faked" to look longer if
at each sampling instant that is not already part of the cyclic prefix the value from the previous symbol
³⁰Clearly having a separate equalizer for each tone allows the individual SNRs to be maximized and then also the
product SNR to be maximized given a certain input energy distribution, essentially decoupling the loading problem from
the equalization problem.
Figure 4.37:  Cheong's Cyclic Reconstruction (sometimes called a per-tone equalizer).
were somehow replaced by the sample from the current symbol.  Essentially, the cyclic prefix becomes
infinite, so the FEQ would then be sufficient for all equalization on the resultant periodic system.
If the receiver saves the last symbol of samples and they were correct, then the ISI from the last
symbol could be subtracted and then an ISI equivalent to the repeated sample could be added.  Cheong
used this in a basic Block DFE structure to allow simplification and retain use of the FFTs (unlike
Kasturia's earlier block DFE).  At some further increase in complexity, it is possible (and obvious) that
this can be implemented also in the time domain using instead the channel outputs.  A truly cyclic
channel would have outputs that are also cyclic.  Any deviations from such cyclic behavior would be
completely represented by the differences between points separated by N samples.
The basic observation is that the receiver can compute a set of N differences

(y_{N+i−1} − y_{i−1}) ,   i = 1, ..., N .   (4.351)

Any deviation from perfect cyclic channels must be a linear combination of these differences on
a linear channel.  Thus, one implementation is shown in Figure 4.37.  The coefficients are adapted
according to the same algorithm as in (4.133) using the same error signal U_n, but with each FEQ coefficient
augmented by a set of N taps that have the time-domain differences.  This remains the complexity of
N × N.  However, differences far away from the cyclic prefix will have negligible contribution.  If the
(FIR) TEQ for any (or all) tones were L taps long, then clearly it has only L degrees of freedom and
could be affected by at most 2L differences.
Thus the complexity is essentially double that of an L-tap TEQ but also has the advantage of
essentially being a different equalizer for each tone output.  Moonen and Acker developed what they call
a "per-tone equalizer" well after the Kasturia and Cheong work, but did not realize the equivalent use
of cyclic reconstruction, thus the apparent different name.  However, the concept is clearly and obviously
equivalent when viewed from the perspective of cyclic reconstruction.  These 2L samples are the same
for all tones but do depend on the delay Δ chosen.  In a system with no delay, they would be the 2L
samples closest to the previous symbol.  With a delay Δ, the 2L − Δ samples closest to the previous
symbol and the Δ samples at the end of the symbol would be used.  Such a system requires collection of
the next symbol to compute the differences (or at least Δ output samples from the next symbol) and so
has between Δ samples and one symbol of additional delay required for implementation.
An adaptive implementation would use the ZF algorithm used for the FEQ on each tone with the
single complex coefficient now being replaced by 1 + 2L difference terms.
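The structure of one tone's output, a single FEQ coefficient plus taps on the time-domain differences of (4.351), can be sketched as (numpy; the tap values and sizes here are arbitrary placeholders, not a trained equalizer):

```python
import numpy as np

def per_tone_output(y_curr, y_next_head, feq, diff_taps, tone):
    """One tone's decision variable: FEQ tap plus taps on time-domain differences."""
    N = len(y_curr)
    Y = np.fft.fft(y_curr)
    # Differences between samples N apart, eq. (4.351); zero for a truly cyclic channel
    diffs = y_next_head - y_curr[:len(y_next_head)]
    return feq[tone] * Y[tone] + diff_taps[tone] @ diffs
```

For a perfectly cyclic channel the differences vanish and the output collapses to the ordinary single-coefficient FEQ.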
4.9   Windowing  Methods  for  DMT
While cyclostationary,³¹ DMT/OFDM signals are not deterministically periodic over all time simply
because the transmitted input message is not the same in each symbol period.  Deterministically for any
tone, the corresponding output signal for a given symbol period is

x_n(t) = X_n · e^{j (2π/N) · ((N+ν)/T) · n t} · w_T(t) ,   (4.352)

where w_T(t) is a window function.  In the DMT/OFDM partitioning of Section 4.6, the window function
was always

w_T(t) = 1 for t ∈ [0, T), and 0 for t ∉ [0, T) .   (4.353)
Letting

Δf ≜ (1 + ν/N) · (1/T) = 1/(N T_t) ,   (4.354)

the product in (4.352) corresponds to convolution in the frequency domain of the impulse X_n · δ(f − nΔf)
with W_T(f).  The Fourier transform of the rectangular window in (4.353) is

W_T(f) = sinc(f/Δf) · e^{−jπfT} ,   (4.355)
which has the magnitude illustrated in Figure 4.38.  While there is no interference between tones of DMT
because of the zeros in this function, at intermediate frequencies (f ≠ nΔf) there is non-zero energy.
Thus the DMT spectrum is not a "brick-wall" bandpass system of tones, but decays rather slowly with
frequency from any tone.  Such slow decay may be of concern if emissions from the present channel
into other transmission systems³² are of concern.  Even other DMT transmission systems that might
sense crosstalk from the current DMT system would need exact synchronization of sampling rate 1/T_t
to avert such crosstalk.  A slight difference in DMT sampling clocks, or just another modulation method,
on the other transmission system might lead to high interference.  For this reason, some DMT/OFDM
modulators use a different window than in (4.353) to reduce the sidelobes with respect to those in
Figure 4.38.
Non-rectangular windowing of the transmit signal in DMT could re-introduce interference among
the tones, so must be carefully introduced.  Figure 4.39 illustrates two methods for introducing windows
without reintroduction of intra-symbol interference through the use of extra cyclic prefix and suffix.  The
cyclic suffix used for "Zippered" duplexing in Section 4.6 is also useful in windowing.  Extra prefix samples
beyond those needed for ISI (that is, larger than the minimum ν required) and extra suffix samples
(beyond either 0 with no Zippering, or larger than the minimum Δ required with Zippering) are used
for the shaped portion of the window.  The minimum N + ν + Δ samples within the symbol remain
unwindowed (or multiplied by 1, implying no change) so that DMT or OFDM partitioning works as
designed when the receiver FFT processes only N of these samples.  The remaining samples are shaped
in a way that hopefully reduces the sidelobes of the window function, and thus makes the frequency
response of the modulated signal fall more sharply.  This sharper fall is achieved by having as many
continuous derivatives as is possible in the window function.  The windows from the extra prefix of the
current symbol and the extra suffix of the last symbol can overlap without change in performance if the
sum of the two windows in the overlapping region equals 1.  Such overlap of course reduces the increase
in excess bandwidth introduced by the window.
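The overlap condition, the two windows summing to 1 in the overlapped region, is easy to check numerically for a raised-cosine taper (a sketch; L here is an arbitrary taper half-length in samples):

```python
import numpy as np

L = 8                                         # taper half-length in samples (arbitrary)
k = np.arange(L + 1)
falling = 0.5 * (1 + np.cos(np.pi * k / L))   # suffix taper of the current symbol: 1 -> 0
rising = 0.5 * (1 - np.cos(np.pi * k / L))    # prefix taper of the next symbol: 0 -> 1
# The overlapped region contributes falling + rising = 1, so performance is unchanged
assert np.allclose(falling + rising, 1.0)
```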
The most common choice of window in practice is a raised-cosine window of 2L + 1 samples of extra
cyclic extension (extra prefix plus extra suffix).  This function is given in the time domain by

w_{T,rcr}(t) = (1/2) · [1 + cos(π t/(L T_t))]   t ∈ (−L T_t, L T_t)   (4.356)
³¹DMT/OFDM signals are stationary at symbol intervals, but cyclic over a symbol period.
³²That may share the same channel or unfortunately sense radiation from it.
Figure 4.38:  Rectangular window Fourier transform versus f/Δf.
Figure 4.39:  Illustration of extra prefix and suffix and non-overlapped and overlapped windows.
where time t = 0 is in the middle of the extra extension.  The actual implementation occurs with samples
w_k = w_{T,rcr}(k T_t) for k = −L, ..., 0, ..., L.  Letting

β = L/(N + ν + Δ) ,   (4.357)

the Fourier transform of this window becomes

W_{T,rcr}(f) = sinc(f/Δf) · cos(πβ f/Δf) / [1 − (2β f/Δf)²] .   (4.358)
Figure 4.40 illustrates the sidelobes of the raised-cosine function for 5% (blue) and 25% (red) excess
bandwidth along with those of W_T(f) from Figure 4.38 (this time in decibels).  Emissions into other
systems can be significantly reduced for between 5% and 25% excess bandwidth.  The FMT systems of
Section 4.6 offer even greater reduction (or can be combined) if required.
4.9.1   Receiver  Windowing
Also of interest are any non-white noises that may have autocorrelation of length greater than T_t.
Extremely narrow-band interference noise, sometimes called "RF interference," is an example of such
a noise.  DMT block length N may not be selected sufficiently large and the TEQ of Section 4.8 may
not adapt quickly enough to notch or reduce such noise, or the TEQ may not be sufficiently long
to reduce the noise level.  The same extra prefix and suffix interval in the receiver can again be used
to apply a window (perhaps the same as used in the transmitter).  Windowing at the receiver again
corresponds to frequency-domain convolution with the transform of the window function.  Low sidelobes
can reduce any narrow-band noise and improve all the g_n (and particularly those close to the narrowband
noise).  Whether used for transmitter, receiver, or both, Figure 4.40 shows that the first few sidelobes
are about the same no matter how much excess bandwidth (extra cyclic extension) is used.  However, for
somewhere between 5% and 25% excess bandwidth, 20 dB and more rejection of noise energy coupling
through sidelobes is possible, which may be of considerable benefit in certain applications.  Of course,
this extra bandwidth is the same as any excess bandwidth used by the transmitter (so excess bandwidth
does not increase twice for transmitter and receiver windowing).
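The far-sidelobe rejection can be observed numerically by comparing zero-padded transforms of a rectangular window and one with raised-cosine tapers (a rough sketch; the window sizes and the "far" region are arbitrary illustrative choices):

```python
import numpy as np

N, L, pad = 64, 8, 8192
rect = np.ones(N)
taper = 0.5 * (1 - np.cos(np.pi * np.arange(L) / L))      # raised-cosine ramp 0 -> ~1
rc = np.concatenate([taper, np.ones(N), taper[::-1]])     # tapered window, length N + 2L
W_rect = np.abs(np.fft.fft(rect, pad)); W_rect /= W_rect[0]
W_rc = np.abs(np.fft.fft(rc, pad)); W_rc /= W_rc[0]
far = slice(16 * pad // N, pad // 2)                      # beyond ~16 tone spacings
print(20 * np.log10(W_rect[far].max()), 20 * np.log10(W_rc[far].max()))
```

The first sidelobes of the two windows are comparable, as the text notes; the gain appears farther from the tone.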
Figure 4.40:  Illustration of transforms of rectangular window (green) and raised-cosine with 5% (blue)
and 25% (red) excess bandwidth versus (f − nΔf)/Δf.
4.10   Peak-to-Average Ratio Reduction
The peak-to-average ratio of an N-dimensional constellation was defined in Chapter 1 as the ratio
of a maximum-magnitude constellation point to the energy per dimension.  That definition is useful for
N = 1 and N = 2 dimensional constellations.  For the higher dimensionality of DMT and OFDM signals
(or vector coding or any other partitioning), the maximum time-domain value is of practical interest:

x_max = max_{k = 0, ..., N+ν−1} |x_k| .   (4.359)

This value determines the range of conversion devices (DACs and ADCs) used at the interfaces to the
analog channel.  The number of bits needed in the conversion device often determines its cost and also
its power consumption.  Thus, smaller is usually better (especially at high speeds).  A higher ratio
x_max²/Ē_x leads to more bits in such a device.  Also, subsequent analog circuits (amplifiers, filters, or
transformers) may exhibit increased nonlinearity as x_max increases.  Finally, the power consumption of
linear amplifiers depends on the peak power, not the average power.  In the analog portion of the channel
between conversion devices, x_max should actually be redefined to be

x_max = max_{t ∈ [0,T)} |x(t)| ,   (4.360)

which can exceed the discrete-time x_max.  High-performance transmission systems necessarily increase
the peak voltages in theory, so the PAR for the transmitted signal x(t) (and/or x_k) can become important
to designers.
This section investigates this peak value and shows that additional digital signal processing can
render the nominally (in theory) large peak inconsequential in practice.
Subsection 4.10.1 begins with some fundamentals on the probability of peaks and their relationship to
data rate, showing that indeed very little data rate need be lost to eliminate large peaks in transmission.
Subsection 4.10.2 then progresses to describe the most widely used and simplest PAR-reduction method,
known as "tone reservation" (developed independently by former 379 students and their colleagues!),
where a few tones are used to carry "peak annihilating" signals instead of data.  This method essentially
requires no coordination between transmitter and receiver and thus is often preferred.  The loss of
tones is somewhat objectionable, so for more complexity and some receiver-transmitter understanding,
the equally effective "tone-injection" method appears in Subsection 4.10.3.  This method causes no
data rate loss and instead cleverly inserts signals invisible to the data connection, but that annihilate
peaks.  Subsection 4.10.4 discusses the most complicated nonlinear-equalization methods that take the
perspective of allowing clipping of signals to occur and then essentially remove the clips at the receiver
through an ML detector that includes the clips.  All the approaches presented in this section reduce
the probability of nonlinear distortion to a level that can be ignored in practice.³³
A series of coded approaches have also been studied by various authors, where points in the N-
dimensional constellation that would lead to large time-domain x_max are prohibited from occurring by
design.  Unfortunately, these approaches, while elegant mathematically, are impractical in that they lose
far too much data rate for effective performance, well above theoretical minimums.  This book will not
consider them further.
4.10.1   PAR  fundamentals
This book chooses to focus on real signals x_k so that X_{N−i} = X_i* for each symbol.  For complex signals,
the theory developed needs to be independently applied to the real dimension and to the complex
dimension.³⁴
³³That is a level at which outer forward error-correction codes, see Chapters 10 and 11, will remove any residual errors
that occur at extremely infrequent intervals.
³⁴In which case, the signal processing is independently applied to the even part of X, X_{e,i} = .5 · (X_i + X*_{N−i}), that
generates the real part of x_k, and then again to the odd part of X, X_{o,i} = .5 · (X_i − X*_{N−i}), that generates the imaginary
part of x_k.
First, this subsection considers the discrete-time signal x_k and generalizes to over-sampled versions
that approximate x(t).  The time-domain signal x_k is again

x_k = Σ_{n=0}^{N−1} X_n · e^{j (2π/N) n k} ,   (4.361)

which is a sum of a large number of random variables.  Loosely speaking, by the basic Central Limit
Theorem of probability, such a sum is Gaussian in distribution and so has very small probability of very
large values.  For the Gaussian distribution, a value with 14.2 dB of peak-to-average ratio (Q(14.2 dB))
has a probability of occurrence of 10⁻⁷, and a PAR of 12.5 dB has occurrence 10⁻⁵.  When such a peak
occurs, with high probability large "clip noise" occurs if the conversion devices or other analog circuits
cannot handle such a large value.  So, with probabilities on the order of the probability of bit-error,
entire DMT/OFDM symbols may have many bit errors internally unless the design of such
circuits accommodates very high PAR. Typically, designers who use no compensation set the PAR at 17
dB to make the occurrence of a peak clip less likely than the life-time of the system (or so small that it
is below other error-inducing effects).
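The quoted orders of magnitude follow from the Gaussian tail function (a sketch; Q(x) = ½ erfc(x/√2), with the PAR converted from dB to an amplitude threshold in standard deviations):

```python
import math

def q_func(x):
    return 0.5 * math.erfc(x / math.sqrt(2))

def peak_prob(par_db):
    # PAR in dB -> amplitude threshold in standard deviations of the Gaussian
    return q_func(10 ** (par_db / 20))

print(peak_prob(14.2), peak_prob(12.5))   # roughly 1e-7 and 1e-5
```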
One might correctly note that the same high-peak problem occurs in any system that uses long
digital filters (i.e., see the transmit-optimizing filters for QAM and PAM systems in Chapter 5): each
output sample of the filter equals a sum of input random variables.  For effective designs, the
filter is long and so the Gaussian nature of actual transmitted signal values is also present.  Furthermore,
aside from filters or summing effects, best coding methods (particularly those involving shaping) provide
transmit signal and constellation distributions that approximate Gaussian (see Chapter 8).  Thus, one
can find any well-designed transmission system having the same high-PAR problem.  However, solutions
to this problem are presently known only for the DMT/OFDM systems (which, with the solutions,
actually then have a lower PAR than other systems).
At first glance, it might seem prospects for high performance in practice are thus dim unless the
designer can afford the higher dynamic range and power consumption of conversion and analog circuits.
However, let us assume a PAR of say 10 dB is acceptable (even sinusoids in practice require 3-4 dB,
so this is say one bit of range³⁵ more than a non-information-bearing signal would require).  Such a
peak in a Gaussian distribution occurs with probability 10⁻³.  Let us suppose that internal to the data
transmission stream, a small amount of overhead data rate was allocated to capturing the size of the clip
that occurs in analog circuits and sending a quantized version of that size to the receiver.  No more than
20 bits per clip would be required to convey a very accurate description, and perhaps just a few bits if
designs were highly efficient.  Those twenty bits would represent on average only 2% of the data rate.
Clearly not much is lost for imposing maximums on the transmitted range of quantities, at least from
this trivial fundamental perspective.  Such an approach might have significant delay since the overhead
bit stream would be very low-rate with respect to other clocks in the design.  Nonetheless, it fundamentally
indicates that significant improvement is possible for a small loss in overall data rate.
This is exactly the observation used in the three following approaches of this section.
4.10.2   Tone  Reservation
Tone reservation allows the addition of a "peak annihilator" c to the transmit signal x on each sym-
bol.  Thus, x + c is cyclically extended and transmitted.  The peak annihilator attempts to add the
negative of any peak so that the signal is reduced in amplitude at the sampling instant of the
peak.  The peak-annihilation vector c is constructed by frequency-domain inputs on a few tones in the set
I = {i_1, i_2, ..., i_r, N − i_r, ..., N − i_1}.  Typically r ≪ N/2.  The determination of this set is deferred
to the end of this subsection.
Thus,

x + c = Q* · [X + C]   (4.362)

and

C_n = 0 for n ∉ I, and C_n is selected for n ∈ I .   (4.363)

³⁵Conversion devices basically provide 6 dB of dynamic range increase for each additional bit used to quantize quantities.
Figure 4.41:  Probability of peak above specified PAR level for N = 128 OFDM/DMT systems with 5%
and 20% of tones reserved.  The "extra FFT" curve can be ignored and represents a double-pass of the
transmitter IDFT where the phase of some signals has been selectively altered on the second pass to
reduce PAR.
Thus, X_n ≠ 0 only when n ∉ I, leaving N − 2r tones available for data transmission and the others reserved
for selection and optimization of the peak annihilator.
The problem can be formulated as

min_C max_k |x_k + (Q*C)_k|   (4.364)

with the constraints that C_n = 0, n ∉ I, and ‖x + Q*C‖² ≤ N · Ē_x.  This problem can be recognized
with some work as equivalent to a linear-programming problem for which algorithms, although complex,
exist for solution.  This subsection shows the performance of such exact solution to see potential benefit,
but rapidly progresses to an easily implemented algorithm that approximates optimum and is used in
practice.  Figure 4.41 illustrates the reduction in probability of a peak exceeding various PAR levels.  For
reservation of 5% of the tones, a 6 dB reduction (savings of one bit) is possible.  For 20% reservation
(perhaps an upper limit on acceptable bandwidth loss), a 10 dB reduction is possible.  Indeed, the PAR
achieved for this OFDM/DMT system is well below those of systems that perform worse.  However,
the complexity of the solution used would be burdensome.  Thus, the alternative method of minimizing
squared clip noise is subsequently derived.
The alternative method makes use of the function

clip_A(x_k) = x_k for |x_k| < A, and A · sgn(x_k) for |x_k| ≥ A ,   (4.365)

where A > 0 and x_k and A are presumed real.  The clip noise for a symbol vector x is then

σ²_x = ‖x − clip_A(x)‖² = Σ_{k=0}^{N−1} [x_k − clip_A(x_k)]² .   (4.366)
With the peak annihilation in use,

σ²_{x+c} = ‖x + Q*C − clip_A(x + Q*C)‖² .   (4.367)

The gradient of the clip noise can be determined when a peak annihilator is used:

∇ σ²_{x+c} = 2 · Σ_{k: |x_k+c_k| > A} sgn(x_k + c_k) · [|x_k + c_k| − A] · Q*_I q_k ,   (4.368)

where Q*_I is the N × 2r matrix of the columns of Q* with indices in I.  The N × 1 vectors

p_k = Q*_I q_k   (4.369)

can be precomputed as they are a function only of the IFFT matrix and the indices of the reserved
tones.³⁶  These vectors are circular shifts of a single vector.  They will have a peak in the k-th position,
and thus changing the sign of this peak and adding it to the original x to be transmitted reduces the
PAR. Such an operation could occur for each and every k such that |x_k + c_k| > A.  That is essentially
what the gradient above computes: a direction (and magnitude) for reducing the clip noise in those
dimensions where it occurs.
A steepest-descent gradient algorithm for estimating the best c then sums a (small) scaled sum of successively computed gradients. Defining
\[
\epsilon_k = \mathrm{sgn}(x_k + c_k) \cdot \left[ |x_k + c_k| - A \right] \qquad \forall \, k \,:\, |x_k + c_k| > A \qquad (4.370)
\]
as a type of successive error signal, the gradient algorithm with a small step size \mu > 0 becomes
\[
c_{j+1} = c_j - \mu \sum_{k \,:\, |x_k + c_k| > A} \epsilon_k(j) \, p_k \ , \qquad (4.371)
\]
where j is an algorithm index reflecting how many iterations or gradients have been evaluated. This algorithm converges to close to the same result as the optimum linear-programming solution. Figure 4.42 shows the results of successive iterations of this gradient algorithm.
After approximately 10 iterations, the PAR has been reduced by 6 dB. After 40 iterations, the algorithm is within .5 dB of the best performance with 5% of tones used. These calculations were for N = 128 and represent results from averaging over a variety of data vectors (and of course peaks only occur for a small fraction of those random data vectors).
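A toy version of this steepest-descent loop is easy to simulate. The sketch below is an illustration under assumed parameters (an arbitrary reserved set, clip level, step size, and iteration count, with a real-valued symbol for simplicity); it applies the error signal (4.370) and update (4.371) using circular shifts of a single precomputed kernel:

```python
import numpy as np

# Toy steepest-descent PAR reduction with tone reservation. The reserved set,
# clip level A, step size mu, and iteration count are illustrative assumptions.
rng = np.random.default_rng(0)
N, n_iter, mu, A = 64, 40, 0.5, 2.0
reserved = np.array([7, 19, 29])            # a small reserved set of tones

X = rng.choice([-1.0, 1.0], size=N) + 0j    # data subsymbols (real, for simplicity)
X[0] = X[N // 2] = 0                        # no data on DC or Nyquist
X[reserved] = X[N - reserved] = 0           # reserved tones carry no data
x = np.fft.ifft(X).real * np.sqrt(N)        # real part: a real-valued symbol sketch

# Kernel whose circular shifts play the role of the p_k vectors: the IFFT of an
# indicator on the reserved tones, normalized to peak at 1 in position 0.
kernel = np.zeros(N)
kernel[reserved] = kernel[N - reserved] = 1.0
p = np.fft.ifft(kernel).real
p = p / p[0]

c = np.zeros(N)                             # time-domain peak-annihilation signal
for _ in range(n_iter):
    y = x + c
    for k in np.flatnonzero(np.abs(y) > A):
        eps = np.sign(y[k]) * (np.abs(y[k]) - A)   # error signal, as in (4.370)
        c = c - mu * eps * np.roll(p, k)           # gradient step, as in (4.371)

print(np.max(np.abs(x)), np.max(np.abs(x + c)))    # peak before vs. after
```

Because c is a combination of circular shifts of the kernel, its spectrum stays confined to the reserved tones, so the data-carrying tones are untouched and the receiver simply ignores the reserved indices.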
For most applications, the indices in \mathcal{I} can be randomly chosen among those tones that have non-zero energy allocated in loading (using tones that have zero energy at the channel output usually leads to a peak being reproduced at the output of the channel even if the input had no such peak, which means the ADC will saturate in the receiver). Tellado has studied methods for optimizing the set \mathcal{I} but came to the conclusion that simple random selection of the indices to get 5% of the used tones is sufficient. (And if the resultant vector to be circularly shifted does not have a peak in it, a second random selection is then made.)
This method, largely due to Tellado, but also studied at length by Gatherer and Polley, is the predominant method used for PAR reduction in practice today. It requires no coordination between transmitter and receiver (the transmitter simply forces the receiver to off-load the tones it wants by sending "garbage" data on those tones, which forces swaps away from them). Thus, there is no need to specify the method in a DMT standard. However, in OFDM standards the set of used tones may need agreement beforehand. Indeed, since peaks do not occur often, the tones used for pilots in OFDM may also carry the very infrequent peak-reduction signals without loss of performance of the (well-designed) receiver PLLs.
^{36} These terms arise in the gradient vector because of the restriction to zero of the non-reserved tones and would be deleted from the gradient altogether if there were no restrictions on C or c.
Figure  4.42:   Probability  of   peak  above  specied  PAR  level   for   successive   iterations   of   the  gradient
algorithm.
4.10.3   Tone  Injection
5% or more tone reservation for PAR is considerably in excess of the minimums required by theory to reduce PAR substantially. Tellado, and independently Kschischang, Narula, and Eyuboglu of Motorola, found means of adding peak-annihilation signals to the existing data-carrying tones.
In tone injection, no tones are reserved for only peak-reduction signals. Instead, signals are added to the data symbols already present on some tones in a way that is transparent to the receiver decision process. The basic concept is illustrated in Figure 4.43. There, a selected constellation point called A can be replaced by any one of 8 equivalent points if the 16QAM constellation shown were repeated in all directions. Each of these 8 points has greater energy and increases the transmitted energy for that particular tone if one of them is selected. The basic idea of tone injection is to search over points like the 8 shown for several tones to find an added signal that will reduce the peak transmitted. Because peaks do not occur often and because the extra energy only occurs on a few tones, the penalty of tone injection is a very small increase in transmit power, typically .1 dB or less on average. A knowledgeable receiver design would simply decode the original point or any of the 8 possible replacement points as the same bit combination. The offset in either the horizontal or vertical direction is 2\sqrt{M} \, d units.
Thus, peaks in the time domain are traded for peaks in the frequency domain, where they have little effect on the transmitted signal energy, and in particular a very large power-reducing effect on the consumed power of the analog driver circuits, whose consumed power often depends on the peak signal transmitted.
More formally, the time-domain DMT or OFDM signal (avoiding DC and Nyquist, which are simple extensions if desired, and letting R_n = \Re\{X_n\} and I_n = \Im\{X_n\}) is
\[
x(t) = \frac{2}{\sqrt{N}} \sum_{n=1}^{N/2-1} \left[ R_n \cos\!\left( \frac{2\pi}{N T_t} \, n t \right) - I_n \sin\!\left( \frac{2\pi}{N T_t} \, n t \right) \right] \qquad (4.372)
\]
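The cosine/sine form of (4.372), sampled at t = k T_t, coincides with an orthonormal IDFT of a Hermitian-symmetric tone vector (DC and Nyquist zeroed). The short Python check below illustrates this on made-up tone values; the 2/\sqrt{N} scaling is an assumption tied to the orthonormal IDFT convention:

```python
import cmath
import math

N = 8
data = {1: 1 + 1j, 2: -1 + 1j, 3: 1 - 1j}    # X_n = R_n + j I_n on tones 1..N/2-1

X = [0j] * N
for n, val in data.items():
    X[n], X[N - n] = val, val.conjugate()    # Hermitian symmetry; DC/Nyquist zero

def x_sum(k):
    """Sample of the cosine/sine sum of (4.372) at t = k*T_t."""
    return (2 / math.sqrt(N)) * sum(
        v.real * math.cos(2 * math.pi * n * k / N)
        - v.imag * math.sin(2 * math.pi * n * k / N)
        for n, v in data.items())

def x_idft(k):
    """Sample from the orthonormal IDFT: (1/sqrt(N)) sum_n X_n e^{j 2 pi n k / N}."""
    return (sum(X[n] * cmath.exp(2j * math.pi * n * k / N) for n in range(N))
            / math.sqrt(N)).real

assert all(abs(x_sum(k) - x_idft(k)) < 1e-12 for k in range(N))
```

The agreement follows because each tone n pairs with its conjugate at N - n, and X_n e^{j\theta} + X_n^* e^{-j\theta} = 2(R_n \cos\theta - I_n \sin\theta).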
Figure 4.43:  Illustration of constellation equivalent points for 16QAM and Tone Injection.
Figure 4.44:  Illustration of successive  iterations of tone injection.
over one symbol period, ignoring the cyclic prefix. x(t) may be sampled at sampling period T_t = T/(N + \nu) or at any other higher sampling rate; the samples are denoted x_k = x(kT_t) here. Over a selected set of tones n \in \mathcal{I}, x_k can be augmented by injecting the peak-annihilation signal
\[
\tilde{x}_k = x_k \pm \frac{2\sqrt{M} \, d}{\sqrt{N}} \cos\!\left( \frac{2\pi}{N T_t} \, n k T_t \right) \pm \frac{2\sqrt{M} \, d}{\sqrt{N}} \sin\!\left( \frac{2\pi}{N T_t} \, n k T_t \right) \qquad n \in \mathcal{I} \ . \qquad (4.373)
\]
It is often not necessary to add both the cos and sin terms, as one is often sufficient to reduce the peak dramatically. Algorithms for peak injection then try to find a small set of n indices at any particular symbol for which an injected peak signal is opposite in sign and roughly equal in magnitude to the largest peak found in the time-domain symbol. An exhaustive search would be prohibitively complex. Instead, typical algorithms look for indices n that correspond to large points on the outer ranges of the constellation that the encoder selected on tone n_0, and for which also the cos or sin is near maximum at the given peak location k_0. Then, roughly,
\[
| \tilde{x}_{k_0} | = | x_{k_0} | - \frac{2\sqrt{M} \, d}{\sqrt{N}} = | x_{k_0} | - \sqrt{\frac{2}{N}} \cdot \sqrt{\frac{6 \cdot 2^{b_{n_0}}}{2^{b_{n_0}} - 1}} \ . \qquad (4.374)
\]
On each such iteration of the algorithm, a peak is reduced at location k_i, i = 0, ..., until no more reduction is possible or a certain PAR level has been achieved. Figure 4.44 illustrates successive steps of such tone injection applied to an OFDM system with 16QAM on each tone. Note that about 6 dB of PAR reduction is obtained if enough iterations are used. The transmit power increase was about .1 dB.
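The modulo-equivalence of the replacement points is simple to illustrate. The sketch below (function names and the unit half-spacing d are assumptions for illustration) enumerates the 8 equivalent points of Figure 4.43 for a 16QAM point and shows that a knowledgeable receiver folds each of them back to the original decision:

```python
import itertools

M, d = 16, 1.0               # 16QAM with half-spacing d (points at +-d, +-3d)
offset = 2 * (M ** 0.5) * d  # constellation-replica spacing: 2*sqrt(M)*d = 8d here

def equivalents(point):
    """The 8 replacement points around `point` in the neighboring replicas."""
    re, im = point
    return [(re + i * offset, im + j * offset)
            for i, j in itertools.product((-1, 0, 1), repeat=2) if (i, j) != (0, 0)]

def decode(point):
    """A knowledgeable receiver maps any replica back to the base constellation."""
    fold = lambda u: (u + offset / 2) % offset - offset / 2
    return (fold(point[0]), fold(point[1]))

A = (3 * d, d)               # an outer 16QAM point, like point A of Figure 4.43
for p in equivalents(A):
    assert decode(p) == A    # every replacement decodes to the same bit pattern
```

The transmitter is free to pick whichever of the 8 replicas best cancels the time-domain peak; the modulo fold at the receiver makes the substitution invisible to the decision device.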
4.10.4   Nonlinear  Compensation  of  PAR
Tellado also studied the use of receivers that essentially store, in a look-up table, knowledge of the nonlinear mapping caused by peaks that exceed the linear range of the channel. He then proceeded to implement a maximum-likelihood receiver that searches for the most likely transmitted signal. This method is very complex, but it has the advantage of neither bandwidth nor power loss, and it is proprietary to receivers, so it need not be standardized, as would be the case for at least the silenced tones in tone reservation and the modulo-reception rules for tone injection.
Exercises  -  Chapter 4
4.1  Short Questions
a. (1 pt) Why is the X_0 subsymbol in Figure 4.2 chosen to be real?
b. (1 pt) Why is the X_{\bar N} subsymbol in Figure 4.2 chosen to be one-dimensional?
c. (3 pts) Given ideal sinc transmit filters \phi_n(t), SNR_MFB = 10 dB, a design-target arg. of the Q-function of 9 dB, Ē_x = 1, and N = 2\bar N = 8. As in the example, the last sub-channel is silenced. Also, attempt a design with N = 2\bar N = 16. The bit rate of your new design will go down if correctly designed. Why?
4.3  GAP Analysis
a. (1 pt) While the SNR gap may be fixed for often-used constellations, this gap is a function of two other parameters. What are these two parameters?
b. (2 pts) Based on your answer to part (a), how can the gap be reduced?
c. (2 pts) For an AWGN channel with SNR = 25 dB and QAM transmission, P̄_e = 10^{-6}, and b = 4. What is the margin for this transmission system?
d. (2 pts) Let b = 6 for part (c); how much coding gain is required to ensure P̄_e = 10^{-6}?
4.4  Geometric SNR
a. (1 pt) For the channel in Problem 4.2, calculate the SNR_{m,u} using the exact formula and the approximate one, SNR_geo (for a gap, use Γ = 8.8 dB).
b. (1 pt) Are these two values closer than for the channel 1 + .9D^{-1}? Explain why.
c. (3 pts) Compare the difference of the SNR_MFB to the exact SNR_{m,u} for the 1 + .5D^{-1} and 1 + .9D^{-1} channels. Which one is closer? Why?
4.5  Water-Filling Optimization
Repeat Problem 4.2 using water-filling optimization. The given spec is H(f) = 1 + .5e^{-j2\pi f}, SNR_MFB = 10 dB, Γ = 1 (0 dB). (Note that all the subchannels may be used in this problem.)
a. (2 pts) Calculate the optimal distribution of energy for the sub-channels and the maximum bit rate assuming that the gap Γ = 1 (0 dB).
b. (1 pt) Calculate the gap for PAM/QAM that produces an arg. of the Q-function equal to 9 dB. (The gap for b̄ ≥ 1 is the difference between the SNR derived from capacity and the argument of the Q-function for a particular probability of error.)
c. (2 pts) Calculate the optimal distribution of energy for the sub-channels and the maximum bit rate using the gap found in (b).
4.6  Margin Maximization
For the channel with H(f) = 1 + 0.5e^{-j2\pi f}, as in Problem 4.2, maximize the margin using water-filling. (All the subchannels may be used in this problem.) Let N = 8 for this problem.
a. (2 pts) Is transmission of uncoded QAM/PAM at P_e = 10^{-7} at data rate 1 possible? (i.e., is the margin ≥ 0?) If so, what will the margin be?
b. (1 pt) For data rate 1, what gap provides zero margin? For uncoded PAM/QAM, what P̄_e corresponds to zero margin?
c. (2 pts) What is the margin if Γ = 1 (0 dB)?
d. (1 pt) What is the margin if Γ = the margin found in part (c) (in dB)?
4.7  Creating a DMT Water-lling Program
Attach the appropriate program for this problem. On plots versus N, you may elect to plot only for even N values in this program. The sampling period is presumed to be T_t = 1.
a. (4 pts) Write a program to do DMT-RA water-filling on an arbitrary FIR channel with the pulse response given as a row vector (ex. h = [1 1 1 -1] for the 1 + D + D^2 - D^3 or 1 + D^{-1} + D^{-2} - D^{-3} channel). \nu will be determined from the length of your FIR channel-input vector. The input parameters should be N_0/2, Ē_x, the number of subchannels, and the gap Γ. (Hint: try using the flow charts in Figures 4.7 - 4.9.) What are b̄ and the energy distributions for RA loading on the 1 + .9D^{-1} channel? (N = 8, Γ = 1 (0 dB), Ē_x = 1, and N_0/2 = .181)
b. (2 pts) Plot b̄ vs. N for the channel 1 + D - D^2 - D^3. (Γ = 1 (0 dB), Ē_x = 1, and N_0/2 = .1)
c. (2 pts) Modify your program (i.e., write another program) to do MA water-filling on the same FIR channel. What is the margin and the energy distribution for MA loading on the 1 + .9D^{-1} channel? (N = 8, Γ = 1 (0 dB), b̄ = 1, Ē_x = 1, and N_0/2 = .181)
d. (2 pts) Plot margin vs. N for the channel 1 + D - D^2 - D^3 with b̄ = 1. (Γ = 1 (0 dB), Ē_x = 1, and N_0/2 = .1)
4.8  Levin-Campello Algorithms
Please show work. This problem executes the Levin-Campello loading algorithm for the 1 + 0.5D^{-1} channel with PAM/QAM, P̄_e = 10^{-6}, SNR_MFB = 10 dB, N = 8, and Ē_x = 1. Use β = 1.
a. (2 pts) Create a table of e(n) vs. subchannel n. (You may truncate the table sufficiently to answer part (b) below.)
b. (3 pts) Use the EF algorithm to make b̄ = 1.
c. (1 pt) Use the ET algorithm to find the largest b̄.
d. (1 pt) If the design were to reduce b by 2 bits from part (c), use the EF and BT algorithms to maximize the margin. What is this maximum margin?
4.9  Creating a DMT Levin-Campello Program
Attach the appropriate program for this problem. On plots versus N, you may elect to plot only for even N values in this program. The sampling period is presumed to be T_t = 1.
a. (4 pts) Write a program to do RA LC (EF and ET) on an arbitrary FIR channel with the pulse response given as a row vector (ex. h = [1 1 1 -1] for the 1 + D + D^2 - D^3 or 1 + D^{-1} + D^{-2} - D^{-3} channel). The input parameters should be N_0/2, Ē_x, the number of subchannels, and the gap Γ. What are b̄ and the energy distributions for RA loading on the 1 + .9D^{-1} channel? (N = 8, Γ = 1 (0 dB), Ē_x = 1, and N_0/2 = .181)
b. (2 pts) Plot b̄ vs. N for the channel 1 + D - D^2 - D^3. (Γ = 1 (0 dB), Ē_x = 1, and N_0/2 = .1)
c. (2 pts) Modify your program (i.e., write another program) to do MA LC (EF and BT) on the same FIR channel. What is the margin and the energy distribution for MA loading on the 1 + .9D^{-1} channel? (N = 8, Γ = 1 (0 dB), b̄ = 1, Ē_x = 1, and N_0/2 = .181)
d. (2 pts) Plot margin vs. N for the channel 1 + D - D^2 - D^3 with b̄ = 1. (Γ = 1 (0 dB), Ē_x = 1, and N_0/2 = .1)
4.10  Sensitivity Analysis
This problem looks at the sensitivity of E_n and P_{e,n} (P_e for each sub-channel) to variations in the values of g_n for water-filling.
a. (1 pt) Define the sensitivity S_{E_{ij}} = (\partial E_i / E_i)/(\partial g_j / g_j). Show that
\[
S_{E_{ij}} = \frac{\delta_{ij} - 1/N^*}{g_j E_i} \ ,
\]
and consequently that
\[
\frac{(\delta_{ij} - 1/N^*)/g_{MAX}}{\bar E / N^* + (1/g_{harm} - 1/g_{MAX})} \leq S_{E_{ij}} \leq \frac{(\delta_{ij} - 1/N^*)/g_{MIN}}{\bar E / N^* + (1/g_{harm} - 1/g_{MIN})} \ ,
\]
where g_{harm} is the harmonic mean of the g's.
(Hint: Replace E_i by its closed-form expression and manipulate the expression.)
d. (1 pt) Show that \partial A_n / \partial g_n \approx A_n / g_n, where A_n is the argument of the Q-function for the n-th sub-channel, assuming that the bit rate b_n is held constant. Interpret the result.
(Hint: Work with the closed-form expression for g_n E_n instead.)
e. (2 pts) Find the sub-channel that has the worst sensitivity in terms of P_{e,n} for the 1 + 0.9D^{-1} and 1 + 0.5D^{-1} channels. Use Γ = 3 dB, N = 16, SNR_MFB = 10 dB for multitone. (Assume \Delta A_n / \Delta g_n \approx \partial A_n / \partial g_n.)
4.11  Complexity of Loading
This problem uses 100 parallel channels (N = 100) and evaluates different algorithm complexities.
To compute a water-filling energy/bit distribution, one must solve the water-filling equations with the additional constraint of positive energies. An operation is defined as either an add, multiply, or logarithm. The word complexity denotes the number of operations.
For parts (a) and (b), this problem is primarily interested in the complexity as an order of N and N^* (i.e., is it O(N^2)?). Also, comparison of part (a) with part (b) provides an idea of how much less complex Chow's algorithm is, so detailed complexities are also of interest, although an exact number is not necessary. Please use a set of reasonable assumptions (such as sorting takes O(N log N)). Chow's algorithm here refers to Chow's on/off primer. For (c), assume that we start with b = [0 0 0 ... 0], then do E-tight. The complexity would depend on the total number of bits b and the granularity β, so you can express your answer in terms of b and β. It is difficult to compare (c) with (a) or (b), as they really solve different problems. Nevertheless, some reasonable estimates would be helpful.
a. (2 pts) Formulate an efficient algorithm that computes the water-filling energy/bit distribution. Compute the number of operations for water-filling. How does this complexity depend on N^*?
b. (2 pts) What is the complexity for Chow's algorithm?
c. (2 pts) What is the complexity for the LC algorithm?
d. (1 pt) Compare the different complexities calculated in (a), (b), (c).
4.12  Dynamic Loading
An ISI channel with AWGN has a multitone transmission system that uses MA LC loading; the channel changes from 1 + D - D^2 - D^3 to 1 + D - .1D^2 - .1D^3. The energy per dimension is Ē_x = 2 and b̄ = 1 with N_0/2 = .1. Use β = 1 and a gap corresponding to uncoded PAM/QAM with P_e = 10^{-6}.
a. (2 pts) Find the number of bit-swaps that occur because of this change when N = 8 and N = 12. (You may assume infinite-length basis functions and that the SNR is approximated by using the Fourier transform value in the center of each band, as in the examples in Sections 4.1-4.3.)
b. (4 pts) Find the information that is transmitted from the receiver to the transmitter for each of the swaps in part a. Specifically, find the tone indices and the gain-factor changes associated with the add and delete bit positions.
4.13  VC for the 1 + 0.5D^{-1}
For the 1 + .5D^{-1} channel, N = 8, Ē_x = 1, N_0/2 = .125, T_t = 1, and Γ = 3 dB.
a. (2 pts) Solve for the VC subchannel gains, g_n.
b. (2 pts) Find the optimal RA water-filling bit distribution. What is the bit rate?
c. (2 pts) Find the optimal MA water-filling bit distribution for b̄ = 1.25. What is the margin?
4.14  DMT for the 1 + 0.5D^{-1}
For the 1 + .5D^{-1} channel, N = 8, Ē_x = 1, N_0/2 = .125, T_t = 1, and Γ = 3 dB.
a. (2 pts) Solve for the DMT gains, P_n, and the optimal water-filling RA bit distribution. What is the bit rate?
b. (2 pts) Compare the gains P_n to the gains λ_n of vector coding for the same channel. Which one is better in terms of the ratio of largest to smallest bit assigned? Why is that?
c. (3 pts) Plot the Fourier spectra (i.e., the fft with appropriate zero-padding) of the columns of M for VC and for DMT in the cases N = 8, 16, 32. What is the change of pattern in the spectra as N increases?
4.15  Loading - 15 pts - Midterm 2000
A DMT design is used on the channel with discrete-time response P(D) = 1 - D^2 with additive white Gaussian noise that has psd (discrete-time variance) σ^2 = .181, and the transmit energy per dimension is Ē_x = 1. Loading will be executed on this channel for N = 16 and \nu = 2. (Hint: think carefully whether anything about this channel looks familiar and might save you considerable time in executing the computations - for instance, what do you know about even and odd sample times at the output?)
a. Sketch the magnitude of the Fourier transform of the channel, |P(e^{j\omega T})|, and find the channel SNRs, g_n, at all frequencies of interest to DMT loading. (3 pts)
b. Execute water-fill rate-adaptive loading for this channel with a 0 dB gap. (3 pts)
c. Round the distribution that you obtained in part b so that the granularity is β = 1 bit per two dimensions on complex subchannels and one bit per dimension on one-dimensional subchannels, and find the new data rate. (1 pt)
d. Now use the LC algorithm to compute an RA data rate for β = 1. Be sure to show your incremental energy table. (4 pts)
e. Find an MA bit distribution with β = 1 for transmission with an 8.8 dB gap at a data rate corresponding to b̄ = 1, and the corresponding margin. (4 pts)
4.16  Generalized Nyquist
a. (2 pts) For what N ≥ 1 does multitone transmission satisfy the generalized Nyquist criterion at symbol rate 1/T on any linear ISI additive white Gaussian noise channel?
b. (2 pts) What could the receiver do to attempt to correct the situations (that is, the N values) for which the GNC in part (a) is not satisfied?
c. (2 pts) How does the N necessary for a channel depend on the channel characteristics?
4.17  Periodic Channels
The simplest possible case of finding eigenfunctions is for the periodic channel, for which this problem seeks to prove various interesting facts. These facts will be useful for the investigation of Discrete Multitone transmission. r(t) is the channel autocorrelation function.
a. (2 pts) Suppose r(t) = r(t + T); show that \phi_n(t) = e^{j(2\pi/T)nt} are eigenfunctions of the channel autocorrelation function. (Hint: express r(t) in a Fourier series.) By eigenfunction, this problem means that if \phi(t), t \in [0, T), is used for a channel with periodic r(t), the output y(t), t \in [0, T), will be a scaled version of \phi(t). The scaling factor is the eigenvalue. Note that here, both the input and output signals are restricted to the interval [0, T).
b. (1 pt) What are the associated eigenvalues?
c. (2 pts) If the channel were not periodic, but had finite length (\nu), how could the designer create a new set of functions \tilde\phi_n(t) from \phi_n(t) such that part of the output signal looks periodic? (Hint: think of the cyclic prefix in DMT.)
Parts (a) and (b) show the eigenfunctions are independent of the channel, which is a very desirable property. However, this only happens with a periodic channel. The question is, for r(t) non-periodic, can the designer modify the input so that the output appears as if the channel were periodic? More precisely, let r_r(t) denote the autocorrelation of a periodic channel, and r(t) denote the non-periodic version (i.e., r(t) = r_r(t) for t \in [0, T), and r(t) = 0 outside of [0, T)). The relation \phi(t) * r_r(t) = k\phi(t) is true, where k is the eigenvalue. Can the designer modify \phi(t) so that \tilde\phi(t) * r(t) = k\phi(t), for t \in [0, T)? If so, the transmission system can then use the channel-independent eigenfunctions. By "part of the output signal looks periodic," we mean the output signal looks as if it were created by a periodic channel. Hint: think guard period and extensions.
d. (1 pt) If the receiver only looks at the periodic part of the output signal, what would the associated eigenvalues be?
e. If a transmission system uses the channel-independent eigenfunctions developed in this problem, and thus provides the appearance of a periodic channel, how much more bandwidth does the design have to use, or in other words, how much faster does the sampling rate have to be?
4.18  Vector Coding
This problem concerns channel partitioning and Vector Coding.
a. (2 pts) Consider the following vector channel: the system uses a matrix transmit filter, M, and a matrix receiver filter, F^*, such that FF^* = MM^* = I; find the λ's. Assume N_0/2 = .125, Ē_x = 1, and N = 8.
b. (3 pts) Using the values of the λ's, calculate the optimal energy and bit distribution using a loading algorithm with Γ = 3 dB.
c. (1 pt) What is the SNR_vc for this vector-coded system?
4.19  Theoretical questions  on Vector Coding
a. (1 pt) Show that the noise vector N in vector coding is indeed white and Gaussian with variance N_0/2.
b. (3 pts) Calculate R_{yy} and show that |R_{yy}| = \prod_{n=1}^{N} \left( N_0/2 + E_n |λ_n|^2 \right).
c. (2 pts) Verify that
\[
1 + SNR_{vc} = \left( \frac{|R_{yy}|}{|R_{nn}|} \right)^{1/(N+\nu)}
\]
assuming that the gap is 0 dB.
d. Prove that
\[
F^* = \mathrm{diag}\left\{ 1 + \frac{1}{SNR_1}, \ 1 + \frac{1}{SNR_2}, \ \ldots, \ 1 + \frac{1}{SNR_N} \right\} \cdot R_{Xy} \, R_{yy}^{-1} \ .
\]
4.20  Grasping Vector Coding and DMT
Assume the channel is 1 + aD^{-1}. This problem varies a and N to determine their effects on F, Λ, and M.
a. (1 pt) For N = 4 and a = 0.9, compute F, Λ, and M. Plot the modulus of the spectra of the eigenvectors of M; that is, take the columns of M as time sequences and calculate their abs(fft). Use at least 32 points for the ffts. Turn in this plot. What shape do these filters have?
b. (2 pts) Repeat part (a) for N = 4 and a = 0.1. Do you notice any differences in the resulting filters' ffts compared to (a)? What explains this behavior? (Hint: Part (c) might be helpful.)
c. (2 pts) Set a = 0.9 and N = 8. Calling λ_i the diagonal elements of the Λ matrix, calculate λ_0, (λ_1 + λ_2)/2, (λ_3 + λ_4)/2, (λ_5 + λ_6)/2 and compare to the values of H_n in Example 4.1.1. Explain the similarity.
d. For all the examples tested, the matrix F seems to have some special structure. Describe it. How can you use this to reduce the number of operations in the receiver?
4.21  Partitioning - 2002 Midterm
A wireless transmission system is shown in Figure 4.45. Uncoded QAM with gap 8.8 dB at P_e = 10^{-6} is used on each of 3 transmit antennae with the same carrier frequency but independent input data streams. Each has symbol rate 100 kHz, and the energy per dimension at that rate is unity (Ē_x = 1). There are also 3 receive antennae. All 9 paths shown are AWGN channels with independent noise of two-sided psd .01 on each. The gains are as shown. The other signals may be considered as noise for each of the 3 main signal paths from transmit antenna n to receive antenna n, or all 3 signals may be jointly transmitted and received. This problem investigates the difference in approaches.
Figure 4.45:  Wireless MIMO channel.
a.   (2 pts) If the other two signals are considered to be additive Gaussian noise, what is the possible
 = R
UU
a. (5 pts) Find W in terms of λ_n, E_n, σ^2, and F. Look at your expression for W. What is the equalizer W doing to each subchannel? Why is this the same as minimizing MSE on each subchannel individually?
b. (1 pt) Show how the VC receiver W could be implemented adaptively, without explicit knowledge of P or F.
σ^2 = .5, N = 10, I(D) = .3D^4 + .1D^5, W(D) = 1 + .9D. Using ffts can make this very simple. Please present your values.
g. (3 pts) Now, change your DMT program so that it takes in the (sampled) error spectrum, which we can get from the program in (f), as an argument. The DMT program will now do water-filling. Find b̄ for Ē_x = 1, N = 10, P(D) = 1 + .5D, and S_{ee}(e^{j\omega})|_{\omega = 2\pi k/N} = |P[k]|^2, where S_{ee}(e^{j\omega}) is the power spectrum of the error sequence and P[k] is the fft of the pulse response. (I.e., what is b̄ if the PSD of the error is the same as the PSD of the pulse response at \omega = 2\pi k/N?)
h. (2 pts) Putting (f) and (g) together: for the following parameters, what is the b̄ in the DMT system? Choose the same setup as in Example 10.6.1. Now, with delay Δ = 0, let L = 4 and \nu = 3. Sweep N until you reach asymptotic behavior. Plot b̄ vs. N. Also, do the same with the unequalized channel, remembering to include the error (i.e., we do not have a TEQ, but we still have the same \nu for the cyclic prefix!).
i. (1 pt) Do (h) with Ē_x = 1000. What does this tell you about the TEQ?
4.26  DMT Spectra
This problem concerns the spectrum of a DMT signal. The transmit signal will be assumed to be periodic with period T.
a. (2 pts) Show that the frequency spectrum of this periodic transmit DMT signal is a sum of impulses.
b. (3 pts) Show that multiplying the periodic transmit signal x(t) by a window function w(t) that has the value 1 for 0 ≤ t ≤ T and is zero elsewhere generates the non-periodic DMT signal studied in Section 4.4. Compute the spectrum of this new signal from the spectrum of the original periodic signal and the spectrum of the window.
c. (1 pt) Compare the spectrum of part (b) for the two situations \nu > 0 and \nu = 0.
d. (2 pts) Suppose a duplex channel has signals traveling in opposite directions on the same media. The designer wishes to separate the two directions by allocating different frequency bands to the two DMT signals and separating them by lowpass and highpass filtering. Can you see a problem with such a duplexing approach? If yes, what is the problem?
4.27  Isaksson's Zipper
For the EPR6 channel (1 + D)^4 (1 - D):
a. (1 pt) Find the minimum-length cyclic prefix to avoid inter-tone interference in DMT.
b. (2 pts) Find the minimal length of the cyclic suffix, in terms of both the channel length \nu and the channel end-to-end delay in sampling periods, to ensure that both ends have symbols aligned, for both the case of no timing advance and with timing advance.
c. (2 pts) Do the receiver or transmitter need any lowpass, bandpass, or highpass analog filters to separate the two directions of transmission (assuming infinite-precision conversion devices)? Why or why not?
d. (2 pts) Suppose two different systems have at least one side co-located (i.e., the transmitter and receiver for one end of each of the channels are in the same box). The two signals are added together because the channel wires touch at the co-located end. Could Zipper have an advantage in this situation? If so, what?
4.28  Kasturia's Block DFE
Subsection 4.6.7 describes the so-called Block DFE for channel partitioning. This problem investigates the improvement of this guardband-less transmission method for the 1 + .9D^{-1} channel of the notes. Use all parameter values as always for this channel, except that we will assume this channel is causal and 1 + .9D.
a. (2 pts) Find the 8 x 8 matrix \tilde P for N = 8 with the BDFE, and its associated SVD for vector coding.
b. (2 pts) Using water-filling for loading, what is the SNR for this system of parallel channels? Compare to VC and DMT.
c. (2 pts) Draw a picture of the receiver for the BDFE in this case and indicate where the extra complexity enters.
d. (2 pts) Kasturia found a way in 1988 to derive a Tomlinson-like precoder for this channel. Explain the essential features of such a precoder and why it might be difficult to implement. Can you state why the BDFE might be undesirable for partitioning?
4.29  TEQ Design for the 1/(1 + 0.5D) Channel
For the 1/(1 + .5D) channel, with white input Ē_x = 1, AWGN with N_0/2 = .04, and T_t = 1: force B(D) to have squared norm 4/3 with a 3-tap TEQ.
a. (4 pts) Find the best B(D) with the MMSE design for \nu = 1 and \nu = 2.
b. (2 pts) Find the corresponding MMSEs in part a.
c. (2 pts) Find the corresponding TEQs (W(D)) for part a.
d. (2 pts) Compute the unbiased channel for each W(D) and the corresponding distortion energy per sample.
e. (4 pts) Using Γ = 3 dB, find SNR_dmt for this equalized channel with N = 8. Repeat for large N to find the limiting value of SNR_dmt for the \nu = 1 case. Find also the largest possible margin for b̄ = 1.