Application: V.34 Transmitter and Receiver Implementation On The TMS320C50 DSP
Application: V.34 Transmitter and Receiver Implementation On The TMS320C50 DSP
This application report consists of one of the entries in a contest, The 1995 TI DSP Solutions
Challenge. The report was designed, prepared, and tested by university students who are not
employees of, or otherwise associated with, Texas Instruments. The user is solely responsible for
verifying this application prior to implementation or use in products or systems.
                                           SPRA159
                                           June 1997
Texas Instruments (TI) reserves the right to make changes to its products or to discontinue any
semiconductor product or service without notice, and advises its customers to obtain the latest
version of relevant information to verify, before placing orders, that the information being relied
on is current.
TI warrants performance of its semiconductor products and related software to the specifications
applicable at the time of sale in accordance with TI’s standard warranty. Testing and other quality
control techniques are utilized to the extent TI deems necessary to support this warranty.
Specific testing of all parameters of each device is not necessarily performed, except those
mandated by government requirements.
Certain applications using semiconductor products may involve potential risks of death,
personal injury, or severe property or environmental damage (“Critical Applications”).
Inclusion of TI products in such applications is understood to be fully at the risk of the customer.
Use of TI products in such applications requires the written approval of an appropriate TI officer.
Questions concerning potential risk applications should be directed to TI through a local SC
sales office.
In order to minimize risks associated with the customer’s applications, adequate design and
operating safeguards should be provided by the customer to minimize inherent or procedural
hazards.
                                                            List of Figures
1         Block Diagram of the Transmitter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2         Mapping Frame Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
3         Scrambler Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
4         240-Point Quarter Superconstellation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
5         16-State Convolutional Encoder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
6         Block Diagram of Receiver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
7         Symbol Clock Recovery Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
8         Equalizer Adaptation-Loop Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
9         Fast Equalizer Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
VIII–1    Constellation Shaping With Shell Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-3
VIII–2    Plot of Function in Summation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-15
VIII–3    Concentric Shaping Rings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-22
IX–3      The 2D Partition Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-2
iv         SPRA159
  V.34 Transmitter and Receiver Implementation on the
                    TMS320C50 DSP
                                            ABSTRACT
     This application report presents the design of an efficient V.34 transmitter and
     receiver pair. The algorithms behind the advanced encoding and decoding schemes
     of the V.34 recommendation are described, and the assembly language functions
     that implement these algorithms are referenced. The entire assembly language
     source code of the project is provided with full documentation of the details of the
     implementation. C source code from V.34 modem simulations is also provided. It is
     found that the TMS320C50 digital signal processor (DSP) is exceptionally
     well-suited to the task of encoding and decoding V.34 modem signals.
1 Introduction
     The      International   Telecommunications         Union    (ITU)   modem
     recommendation V.34 represents the state of the art in modern modem
     design [1]. Through the use of advanced coding techniques, modems
     conforming to the V.34 standard can achieve data communication rates
     previously thought unattainable on standard general-switched telephone
     networks (GSTNs). Such modems are rapidly becoming the method of
     choice for transmitting data quickly and reliably over standard telephone
     lines. Because of this popularity, the practical design and implementation of
     a V.34 class modem using conventional digital signal processing hardware
     is of prime importance.
     This project submission details the implementation of a V.34
     transmitter-and-receiver pair using standard TMS320C50 evaluation
     modules (EVMs). The project was designed using two unmodified EVMs on
     identical Hewlett Packard Vectra 486 PCs. The project consists of two
     components: the V.34 transmitter and the V.34 receiver.
     The transmitter has two parts: an EVM-compatible object file called tran.out
     and a PC-based front-end called transmit.exe.To run the transmitter, one
     must first load and run the object file on the EVM, and then run the PC
     executable file on the host PC.
     The receiver has two analogous parts: recv.out and receive.exe. The
     procedure for running them is the same as for the transmitter. To
     demonstrate their operation, the output of the EVM running the transmitter
     should be connected to the input of the EVM running the receiver.
          The programs for both the transmitter and receiver are written entirely in
          TMS320C50 assembly language. They implement a large subset of the ITU
          recommendation V.34. The transmitter operates at 9.6-, 14.4-, and
          19.2-kbps data rates only. However, this speed limitation is imposed only by
          the choice of a 2400-baud symbol rate. The routines developed in this
          project are fully general and can be used to transmit at up to the full speed
          of the V.34 specification (if the maximum 3200-baud symbol rate is utilized).
          The PC front-end programs are written in C and provide a rudimentary user
          interface to the DSP programs. In particular, they display the transmitted and
          received point constellations on the PC graphics screen. These displays are
          especially useful for debugging purposes, but they also illustrate the
          operation of the transmitter-receiver pair nicely. Data typed at the terminal
          also can be transmitted over the data channel once a connection has been
          made.
          The bulk of this paper documents the algorithms used in the two programs.
          The assembly language routines that implement the particular algorithms
          are referenced throughout the text. The source files contain explanations of
          the ’C50 implementation particulars. Therefore, these source files should be
          examined in addition to the text of the project submission.
2         SPRA159
                                                                                                                Introduction
01010101
 Data Stream
{Qi,j,k,1,Qi,j,k,2,Qi,j,k,3, , Qi,j,k,q}
                                                                                                      Trellis
                                                                                                     Encoder
Modulate Precode
2 V.34 Transmitter
          The transmitter consists of four logical units that correspond to the stages
          through which the data flows as it is encoded for transmission. A block
          diagram of the basic V.34 transmitter is shown in Figure 1. The first of these
          units, called parse, accepts a stream of binary input data, scrambles it, and
          then partitions these scrambled bits into different groups to be passed to the
          next unit. The second logical unit, point-select, uses the parsed bits to select
          signal points from a constellation of 2-dimensional (2D) points that has been
          specified for use in V.34. The third logical unit, precode, applies a precoding
          filter to the signal points to compensate for the noise-whitening filter present
          in the V.34 receiver. This unit also contains the trellis encoder, connected in
          a feedback configuration, which ensures that the transmitted points
          correspond to a proper trellis sequence. The final unit, modulate, performs
          standard quadrature amplitude modulation (QAM) of the signal points to
          construct the final output waveform.
4         SPRA159
                                                                            V.34 Transmitter
Output Stream
16D: j = 0 1 2 3
4D: k = 0 1
2D Symbols
2.2    Parse
             The parse logical unit serves two major purposes in the transmitter. First, it
             scrambles the input data to eliminate long, repetitive sequences of bits.
             Such sequences could cause a periodic output signal, eliciting loss of
             symbol clock tracking, or unstable behavior in the adaptive filters of the V.34
             receiver. Second, it groups the scrambled bits into discrete packets for
             further processing by the point-select stage of the transmitter.
Input x(n)
              +         D1           D2               D18            D19              D23
Output y(n)
6            SPRA159
                                                                                                  V.34 Transmitter
        (Si,1,Si,2,...,Si,K)
        (I1i,0,I2i,0,I3i,0),(Qi,0,0,1,Qi,0,0,2,...,Qi,0,0,q),(Qi,0,1,1,Qi,0,1,2,...,Qi,0,1,q),
        (I1i,1,I2i,1,I3i,1),(Qi,1,0,1,Qi,1,0,2,...,Qi(,1,0,q),(Qi,1,1,1,Qi,1,1,2,...,Qi,1,1,q),
        (I1i,2,I2i,2,I3i,2),(Qi,2,0,1,Qi,2,0,2,...,Qi,2,0,q),(Qi,2,1,1,Qi,2,1,2,...,Qi,2,1,q),
        (I1i,3,I2i,3,I3i,3),(Qi,3,0,1,Qi,3,0,2,...,Qi,3,0,q),(Qi,3,1,1,Qi,3,1,2,...,Qi,3,1,q).
        The parsing operation is carried out by the parsedata function found in the
        file parse.asm. This function of the transmitter differs for each set of
        constants K and q. Therefore, the actual parsing of data is carried out by one
        of three sub-functions: parsedata96, parsedata144, or parsedata192. The
        choice of which parsing function to use is determined at the beginning of the
        execution of the program and is dynamically executed by using the bacc
        assembly language instruction.
2.3   Point-Select
        The unique method of 2D point selection used by the V.34 transmitter is one
        of its main advancements over previous generations of modems. It is
        because of these advancements that data-transmission rates twice those
        of older modems are now achievable.
        The V.34 transmitter selects 2D points from a subset of the 2Z2+(1,1) lattice.
        This subset is called the superconstellation, and it consists of 960 total
        points. A 240-point subset of the superconstellation, on the 4Z2+(1,1) lattice,
        is shown in Figure 4. The full superconstellation can be obtained from this
        subset by rotating it by 0_, 90_, 180_, and 270_ degrees. Each combination
        of symbol rate and data rate uses a particular subset of the
        superconstellation; rarely is the full superconstellation utilized. In this report,
        these subsets of the superconstellation are referred to as transmission
        constellations.
                                                 234   206   185   173   164   162   170   181   197   220
                   29                                                                                                                29
                                           226   193   165   146   133   123   121   125   137   154   179   207
                   25                                                                                                                25
                                     229   189   156   131   110   96    87    83    92    100   117   140   172   208
                   21                                                                                                                21
                                           205   176   150   130   114   107   105   109   120   136   161   191   227
                   -23                                                                                                               -23
                                                 215   184   169   153   145   143   151   159   178   202   231
                   -27                                                                                                               -27
                                                                               239
                   -35                                                                                                               -35
8         SPRA159
                                                                                   V.34 Transmitter
      The V.34 point-selection scheme has three parts that correspond to the
      three sets of bits generated in the parse unit. First, the shell mapper uses
      the K S bits to generate a sequence of eight rings. Second, the eight groups
      of q uncoded Q bits are used to select a point within each ring from the
      one-quarter superconstellation (see Figure 4). Third, the four groups of I bits
      (along with output bits U0 from the trellis encoder) are used by the differential
      encoder to rotate the four pairs of 2D points. The 90_ rotations applied by
      the differential encoder result in points from the full superconstellation.
          The easiest and fastest way to implement shell mapping is by table lookup.
          However, the memory needed to store an 8 × 228-word lookup table far
          exceeds that of most modem systems (as well as the 64K of SRAM on the
          ’C50 evaluation module that was used for this project). In light of this fact,
          a divide-and-conquer approach, which combines smaller lookup tables with
          some computational effort, is used. The ITU recommendation V.34 provides
          the actual algorithm that was used in this implementation of a transmitter
          (see Appendix A). A very brief description of how the algorithm works
          follows.
          Assume that the input to the shell mapper is a K-bit number R0. Three lookup
          tables are used by the algorithm. The table g2(p) is the number of ring
          sequences of length two with weight p. Likewise, table g4(p) is the number
          of ring sequences of length four with weight p. Finally, the table z8(p) is the
          number of ring sequences of length eight with weight less than p. The first
          step in the algorithm is to determine the value p such that z8(p) < R0 <
          z8(p+1). This value of p is the weight of the sequence of eight rings
          corresponding to R0, and the difference R0–z8(p) is its offset into the shell
          of equal-weight sequences. Using the value of the offset, the algorithm then
          uses the g4 table to determine the weights and offsets of the two
          sub-sequences of length four, which make up the final sequence. In a similar
          fashion, the g2 table is used to find the weights of the four sub-sequences
          of length two that constitute the final output sequence. Given the weight of
          a length-two sequence, the two rings of the sequence can be determined by
          a simple conditional statement. After finding the two rings in each of the four
          sub-sequences, the final sequence is found by concatenation. A more
          complete description of the algorithm is beyond the scope of this paper; refer
          to Appendix A for a more detailed explanation.
          The shell mapping operation is accomplished by the assembly language
          function called shell_map in the source file named shell.asm. It accepts an
          input in the accumulator and returns a sequence of eight shells in the array
          mjk. Although the shell mapping algorithm remains invariant for any number
          of shells, the lookup tables change with the number of shells. In this
          implementation, the 9.6-kbps data rate requires M=6 shells, while the 14.4-
          and 19.2-kbps data rates require M=12 shells. Two sets of tables are stored
          to accommodate both configurations. The function shell_map dynamically
          loads the addresses of all tables, so it can function correctly with any number
          of shells (provided that the proper tables are pre-computed).
10        SPRA159
                                                                                    V.34 Transmitter
2.3.2 Mapper
       Mapping the eight sets of uncoded Q bits to a point within a shell is a
       straightforward operation. The points in Figure 4 are each labeled with a
       number ranging from 0 to 239. They are numbered in order of increasing
       power, and these numbers are the point indices into a lookup table. To select
       a point within shell m with a given set of q Q bits = {Q1,Q2,...,Qq}, the following
       equation is used:
       In essence, the Q bits are taken as a binary number, and they are added to
       the index of the first point within the shell. This addition results in the index
       of a point within the quarter superconstellation. This point can then be
       retrieved from the lookup table for further processing by the differential
       encoder.
       Assume that the input to the differential encoder is {I3, I2, I1, U0}, and that the
       encoder has a pair of initial points (i.e., one 4D symbol) to encode. Then, by
       considering the bit pair (I3,I2) to be a 2-digit binary number, the differential
       encoder calculates the modulo-4 sum
                            Zm   + [(I , I ) ) Z
                                      3   2           m–1]   mod 4,
          where Zm±1 is the previously calculated value of Zm. The 2-bit number Zm
          can assume the values (0,1,2,3) and represents a rotation of Zm*90o
          clockwise. The first point of the 4D pair then is rotated by the amount
          specified by Zm. The second point in the 4D pair is rotated in a similar
          manner, with the rotation factor computed as
                            Wm   + [(Z ) (I , U )] mod 4,
                                          m       1     0
          where (I1,U0) is considered a 2-digit binary number. The second point then
          is rotated Wm*90o clockwise.
          The functions of the differential encoder are performed by three assembly
          language routines:
          • Compute_rotation_factorZ
          • Compute_rotation_factorW
          • Rotate90_cw
          These routines are found in the source file rotate.asm and provide
          straightforward implementations of the operations described previously.
2.4    Precode
          The precode unit consists of the combined nonlinear precoder and trellis
          encoder. The innovative combination of the two is another feature unique to
          the V.34 modem. By including the trellis encoder in a feedback loop (as
          shown in Figure 1), it is possible to compensate for the noise-whitening
          prediction-error filter of the V.34 receiver without destroying the shaping gain
          provided by the shell mapper.
12        SPRA159
                                                                                   V.34 Transmitter
       The V.34 transmitter contains a local replica of this filter, the inverse of which
       is used to precode the transmitted signal points. This precoding has the
       effect of pre-emphasizing the points in a certain way so that they will be
       properly decoded at the receiver after the noise-whitening process. The
       downside of this precoding is that it destroys the trellis sequence (see next
       section) that is transmitted. To remedy this situation, V.34 provides for a
       correction bit to be computed by the nonlinear precoder to preserve the
       validity of the trellis sequence. This correction bit, C0, is computed once for
       every 4D symbol. It is then exclusive-ORed with the output Y0 of the trellis
       encoder to form the bit U0, one of the input bits to the differential encoder.
       The functions of the nonlinear precoder are performed by a variety of short
       assembly routines in the transmitter. The filtering process is done by a pair
       of routines: precoder_filter_real and precoder_filter_imag. The strict
       rounding procedures dictated by the V.34 recommendation are carried out
       by two functions: round_precoder_output and quantize. The function
       quantize is actually a wrapper for three other functions, quantize96,
       quantize144, and quantize192, each of which executes slightly different
       rounding techniques for the different speeds of the transmitter. The
       computation of the correction bit is done by the routine compute_C0. These
       routines are found in the source files pre_filt.asm, quantize.asm, and
       conv_enc.asm.
          There are three 4D trellis encoders specified for use with V.34: a 16-state
          code, a 32-state code, and a 64-state code. In this transmitter, however, only
          the 16-state 4D code invented by L-F Wei is implemented. The heart of the
          trellis encoder is the 16-state convolutional encoder shown in Figure 5. In
          a typical configuration, the convolutional encoder’s two inputs would be
          taken from the differentially encoded I bits. However, when configured as in
          the V.34 transmitter, the inputs are computed from the output points after the
          nonlinear precoding has been applied. The algorithm for computing these
          inputs to the convolutional encoder from the precoded output points is given
          in Appendix A. This algorithm yields the inputs for all three types of encoders
          for V.34. Modifying the transmitter to use one of the other codes would
          require only inserting a new state table to implement the convolutional
          encoder.
                         Y2(m)               Y2(m)                 Y1(m)
                                                                                     Y0(m)
                   D2               D2                   D2                   D2
14        SPRA159
                                                                                 V.34 Transmitter
2.5   Modulate
       The QAM of the transmitter is done in an efficient manner. The
       digital-to-analog (D/A) converter on the EVM is configured to operate at
       9600 samples per second. With a symbol rate of 2400 symbols/sec, this
       results in four samples per symbol. A window of eight 2D symbols is stored
       at any one time, so the modulator uses a baseband-shaping filter
       32-samples wide whose output is modulated up to the passband around the
       carrier frequency of 1800 Hz. The shaping filter used is a raised cosine filter
       with excess bandwidth factor equal to 0.12. However, to save computation,
       two passband-shaping filters are used: one for the in-phase component, and
       the other for the quadrature component, both of which have been
       pre-modulated by the carrier frequency. The 2D symbols must be modulated
       into the passband before application of the filters. Because of the particular
       choice of symbol rate and carrier frequency (2400 symbols/second and
       1800 Hz), this modulation of the symbols simplifies to a rotation by a multiple
       of 90o. This operation is performed by a short-jump table routine.
       To save further computation, the passband-shaping filters are implemented
       as four banks of 8-tap interpolation filters. Instead of
       256 multiply-and-accumulate instructions per output symbol (32 taps ×
       4 samples × 2 channels), only 64 (8 taps × 4 samples × 2 channels) are
       required.
       The QAM in the transmitter is performed by two functions: QAM_mod and
       QAM_wait, located in the qam_mod.asm source file. QAM_mod performs
       the actual filter calculations using repeated mac and macd instructions,
       while QAM_wait simply pauses the execution of the program so that the
       interrupt-driven output procedure can “sync up” with the main transmitter
       routines. Since the ’C50 DSP is idle over 95% of the time that the transmitter
       is executing, the QAM_wait function is essential for synchronizing the
       program to the symbol rate.
Demodulate
     Analog
     Signal
                   Passband                 x(n)       Prediction     y( n)
                                 Demod
                   Equalizer                           Error Filter
Decode
                                                                                        Output
                                           y(n)                       u( n)              Bits
                                Viterbi                 Inverse               Inverse
                               Decoder                 Precoder                 Map
16        SPRA159
                                                                                      V.34 Receiver
3 V.34 Receiver
        The goal of the V.34 receiver is to sample the sequence of 2D points that was
        transmitted by the V.34 transmitter and to perform the inverse of the
        transmitter’s encoder operations on this sequence. This goal is simple in
        theory but difficult to achieve in practice because the transmitted QAM signal
        is distorted by a variety of nonlinearities during its journey to the receiver. In
        typical modem connections, these distortions result from traversing
        numerous consumer phone lines and switching stations. For the
        transmitter-receiver pair of this project, the distortions arise as the
        transmitted signal travels through a coaxial cable, into an oscilloscope, and
        finally to the receiver. Clearly, the distortions experienced by this signal are
        (presumably) less than those found on a standard phone line, but they are
        significant nonetheless.
        The ideal situation for any modem receiver is to sample the exact
        transmitted analog signal at the precise symbol rate. By doing so, the
        sequence of 2D symbols can be retrieved exactly. There are two primary
        problems that prevent the receiver from doing this directly: first, the sampling
        rate of the receiver may not match exactly the symbol rate of the transmitter;
        and second, the received signal may not have the same “shape” as the
        transmitted signal (i.e., even if the sampling rate were exact, the sampled
        points would not be exactly those transmitted). The first problem is caused
        by small, but significant differences in the environment of the transmitter and
        the receiver. Temperature differences, clock crystal differences, and
        numerous other factors contribute to slight mismatches in the symbol rates
        between two modems. The second problem is due to the channel distortions
        mentioned above. These variations are unpredictable characteristics of the
        channel through which the signal is transmitted.
3.2   Demodulate
         Given an out-of-sync and distorted analog signal, it is the responsibility of
         the demodulate unit to produce solid, accurate data for the decoder unit.
         This is accomplished in two separate steps. First, the demodulate unit must
         compensate for any errors in the signal’s phase. This is accomplished by the
         symbol clock recovery section of the demodulator. This section of the
         demodulator continuously modifies the sampling rate of the analog interface
         chip (AIC) to ensure that the samples taken by the analog-to-digital (A/D)
         converter will be as accurate as possible. The second problem, that of
         channel distortion, is handled in the adaptive equalizer section of the
         demodulator. This section maintains an array of coefficients which, when
         convolved with the analog input signal, enables the demodulator to undo the
         effects of channel distortion. Between these two tools, the demodulator is
         able to transform an analog input into digital points for use by later parts of
         the receiver.
18       SPRA159
                                                                                   V.34 Receiver
                                          Hu(w)
                                                               Gu(1/7200 + t)
       r(t)               Inpbuffer
                A/D                                        *                        Im(Gu * Gl)
                                                               Gl(1/7200 + t)
                                                                                            v(n) - Timing
                                                                                             Phase Error
                                          Hl(w)
                                                                                     AC & DC
                                                                                    Filter / Hard
                                                                                      Limiter
                                                                                            v(n)
                                                                                +
                 Threshold
                                      +           Delay (1 Baud)
                 Detector
                    dB
                    dh
                         + dhr 1    dhi1
                                                       NJ
                            dB ) j * dB + –2 * E e(nT) * inpbuffer                  ǒ ǓNj
                                                                                    nT–m T
                                                                                         3
          where e(nT) is the passband error, inpbuffer is the 144-word buffer where
          the sampled received data are stored, n is the time instant, T is one over the
          baud rate, and m runs from 0 to 143.
20        SPRA159
                                                                                     V.34 Receiver
    In the equalize routine, found in the file eq.asm, the real filter hr1 is initialized
    as a unit impulse function, so that its coefficients are zero everywhere except
    for the center filter element, which is a one. The imaginary filter, hi1, is
    initialized to be the Hilbert transform of hr1. First, the data in the input buffer
    (inpbuffer) is convolved with real and imaginary filter taps. The convolved
    output, represented in the program by two arrays sigR and sigI such that the
    output is equal to (sigR + j*sigI), is then demodulated. This is accomplished
    by performing a complex multiplication with the angle supplied by the carrier
    tracking loop. The demodulated output (modR + j*modI) is then quantized
    to the nearest ideal constellation point (cnR+j*cnI) in the slicer routine. Next,
    the baseband error (diffR+ j*diffI) is computed by finding the difference
    between the actual and the ideal points. Finally, this error is remodulated,
    by again using the output from the carrier tracking loop to form the passband
    error (errorR +j*errorI).
    The routine eq_adapt, also found in eq.asm, uses the computed passband
    error to update the filter taps (hr1 + j*hi1) in double precision. A block
    diagram of the tap update process is shown in Figure 8. During any given
    baud, only 36 real and imaginary filter taps of the 144 total real and imaginary
    filter taps are updated. This is to ensure that the routine never runs for more
    than one-third of a baud (2777 cycles). After testing the modem, it was noted
    that by the time the initial training period is finished, the taps have become
    very stable, and the updating can be done even less frequently.
Input       144-Tap
                                         X                                         Decode
           Adaptive                                        Slicer
                                                                                    Unit
           Equalizer
  144                      144
 Real                      Imaginary                          -
          Tap Update                    Carrier
                                       Tracking
                                         Loop
                  Inpbuffer
                                                 fr                 hr1
                                   144 Point                                  Spectral
                                                 fi   Unscrambler   hi1
                                  Complex FFT                                 Division
                                                                             fr          fi
            Real Equalizer
                Coefficients
                                                                             144-Point
                                 Circular Copy        Unscramble2
                                                                           Complex IFFT
         Imaginary Equalizer
                Coefficients
22       SPRA159
                                                                                      V.34 Receiver
                                               48) * D(i)Ȁ
          C(i)   + D(i mod 48) ) D(48B()i mod
                                2          i mod 48) ) D(96 ) i mod 48)
                                                         2                        2
       where i runs from 0 to 143, and B(i) is the FFT of the periodic sequence
       transmitted by the transmitter. Next, the FFT routine is used to compute the
       inverse Fourier transform of C to get the equalizer coefficients in time
       domain. Then, unscramble2 is called to unscramble the FFT output, fr. The
       output is placed into gr, which is copied into hr1 in a circular fashion, placing
       the maximum value from gr into the center of hr1. Finally, unscramble2 is
       called again on fi, and the result, gr, is copied circularly into hi1 such that it
       is rotated by the same amount. Through this process, the fast equalizer is
       able to completely fill all real and imaginary coefficients in a single pass.
3.3   Decoder
       The output of the demodulate unit is a sequence of noise-corrupted 2D
       points. The first step of the decoder unit is to decide to which ideal
       constellation points these distorted points correspond. This task is
       accomplished by the Viterbi decoder. Next, groups of eight points (i.e., one
       mapping frame) are passed through the inverse precoder and the inverse
       mapper to be decoded into the output data stream.
         The Viterbi algorithm is initiated once every time the receiver has
         demodulated two noise-corrupted 2D symbols (i.e., once every 4D symbol
         period). It is assumed that the receiver has been operating for some time,
         and that the Viterbi table has already been filled with valid data. The first step
         in the algorithm is to quantize the noise-corrupted points to the closest point
         in each of the eight possible 4D subsets. The squared errors for each of
         these eight subsets, called the branch errors, are recorded. Next, the
         algorithm iterates through each of the 16 state entries in the Viterbi table that
         corresponds to the current 4D symbol. For each state, it seeks to update the
         cumulative path metric, described in the following paragraph.
         Each state has a previous cumulative path metric and a current cumulative
         path metric (which is currently being computed). These metrics are the total
         sum of all the individual branch errors that make up the trellis path
         terminating at that state. A small cumulative path metric indicates that the
         branches composing the path are very likely to be the correct ones. A large
         path metric indicates just the opposite. Since the path branches correspond
         directly to certain 4D symbols, finding the path with the smallest cumulative
         path metric will result in the most likely sequence of points.
         To compute the current cumulative path metric for a single state, the Viterbi
         algorithm checks the four possible state transitions into that state. Each of
         these transitions, or branches, has a branch error associated with it that was
         calculated at the beginning of the algorithm. The current cumulative path
         metric is chosen to be the minimum of the sums of the four previous path
         metrics and their associated branch errors. The Viterbi algorithm updates
         the current path metrics for all 16 states in this manner.
24       SPRA159
                                                                                   V.34 Receiver
      Once the current cumulative path metrics have been computed, the
      algorithm can determine a 4D output symbol. It does this by following the
      trellis path starting at the state with the smallest cumulative path metric back
      until it reaches the oldest state in the table. The 4D symbol located at this
      state then is output for further processing by the later stages of the decode
      unit.
      The assembly language routines for implementing the Viterbi algorithm are
      found in the source file viterbi.asm. Quantizing to the 4D subsets and
      calculating the branch errors are accomplished by the routines quantize4,
      calculate_errors, and calc_branch_errors. Updating the cumulative path
      metrics is done in two passes, eight states each, by the routines
      update_path_metrics and upm_finish. Tracing back the most likely trellis
      path is done by the function trace_back.
                      (I 3, I 2)   + [Z   m–1–Z m]mod 4,
                      (I 1, I 0)   + [Z   m–W m] mod 4,
to determine the bits (I3, I2, I1) for each received 4D point.
         The Q bits and the ring index of each point are found through the use of an
         inverse constellation point lookup table. This two-dimensional table is
         indexed by the x and y coordinates of a constellation point, and its entries
         contain the point indices of the point table in the transmitter. After consulting
         the lookup table, the lower q bits of the index are masked off to obtain the
         q Q bits. The remaining high-order bits after the first q constitute the ring
         index of the point, and they are stored in an array until a block of eight ring
         indices has been obtained. When eight ring indices have been stored, the
         end of a mapping frame has been reached, and the unshell-mapper can be
         called to compute the final K S bits of the original bit sequence. At this point,
         the eight groups of Q bits have been retrieved.
         Upon decoding the S, I, and Q bits, the inverse mapper reverses the
         operation of the parser to combine these groups of bits into one final output
         array. This array is then unscrambled using a descrambler algorithm with the
         same generator polynomial as that found in the transmitter. If all went well,
         the unscrambled output is the same sequence of bits that was transmitted.
         The inverse mapper routines are found in the source files invmap.asm and
         unshell.asm. The simple decoding of the differential encoder is done using
         the routine get_binary_subset_label to determine the rotation factor of the
         points and the routine compute_ibits to retrieve the I bits. The function
         find_point_index returns the index of the unrotated 2D constellation point as
         well as its uncoded Q bits and ring index. The routine unshell_map performs
         the inverse shell mapping algorithm to determine the S bits of the mapping
         frame.
26       SPRA159
                                                                                       Summary
4 Summary
    This project is implemented on two TMS320C50 EVM systems. It consists
    of two parts: a V.34 transmitter and a V.34 receiver. One-way communication
    links at speeds of 9.6, 14.4, and 19.2 kbps can be established, and the
    performance of the ’C50 DSP has been evaluated.
    Four distinct operational units comprise the V.34 transmitter. The parse unit
    transforms the input data stream into a form usable by the rest of the
    transmitter. The point-select unit chooses a sequence of constellation points
    in a manner that minimizes transmitted signal power and maximizes the
    probability of correct decoding. The precode unit precodes the transmitted
    points to compensate for a prediction error filter in the receiver and ensures
    that these precoded points still conform to a valid trellis sequence. The final
    unit, modulate, uses QAM to construct the final output waveform.
    The V.34 receiver has been similarly divided into distinct units: the
    demodulate front-end and the decode back-end. The demodulate unit is
    responsible for correctly sampling the input analog waveform at the symbol
    instants. It contains functions for symbol clock recovery as well as adaptive
    channel equalization. The decode unit performs the inverse of the
    transmitter’s encoding operations. It uses the Viterbi algorithm to determine
    the most likely trellis sequence.
    The operations required to perform the encoding and decoding functions for
    V.34 are ideally suited to fast DSPs such as the ’C50. By implementing the
    algorithms entirely in assembly language, it is possible to exploit the unique
    hardware features of the ’C50. Simulations of the transmitter program
    indicate that the DSP is idle (i.e., waiting for interrupts to occur) over 95%
    of the execution time. Although the task of the receiver is more complex
    computationally, even the most conservative estimates show that the DSP
    is working less than 42% of the time. These numbers demonstrate that the
    next phase of the implementation of a V.34 modem, the combination of the
    transmitter and receiver functions onto one DSP, is easily achievable on a
    single ’C50 DSP. Additional features of V.34, such as channel separation by
    echo cancellation techniques, could be accommodated with the excess
    processor time.
References
         1. ITU Study Group 14, “Draft Recommendation V.34 for a Modem
            Operating at Data Signalling Rates of up to 28800 bit/s for Use on the
            General Switched Telephone Network and on Leased Point-to-Point
            2-Wire Telephone-Type Circuits”, Document 57-E, June 1994.
         2. S. A. Tretter, “Fundamentals of Trellis Shaping and Precoding”,
            unpublished notes.
         3. R. Laroia, N. Farvardin, and S. A. Tretter, “On Optimal Shaping of
            Multidimensional Constellations”, IEEE Trans. Inform. Theory,
            Vol. IT-40, pp. 1044-1056, July 1994.
         4. L. F. Wei, “Trellis Coded Modulation with Multidimensional
            Constellations”, IEEE Trans. Inform. Theory, vol. IT-33, pp. 483-501,
            July 1987.
         5. S. A. Tretter, Communication System Design Using DSP Algorithms:
            With Laboratory Experiments for the TMS320C30, Plenum Publishing
            Corp., 1995.
         6. R. D. Gitlin, J. H. Hayes, S. B. Weinstein, Data Communication
            Principles, Plenum Publishing Corp, 1992.
         7. P. R. Chevillat, D. M. Maiwald, and G. Ungerboeck, “Rapid Training of
            a Voiceband Data-Modem Receiver Employing an Equalizer with
            Fractional-T Spaced Coefficients”, IEEE Trans. Commun., vol. COM-35,
            pp. 869±876, Sept. 1987.
         8. v.34 transmitter and receiver code at:
            ftp://ftp.ti.com/pub/tms320bbs/c5xfiles/apprep34.exe
28       SPRA159
                                                       Constellation Shaping by Shell Mapping
A-2       SPRA159
                                                                    Constellation Shaping by Shell Mapping
A. System Description
      The block diagram of the transmitter for a system using shell mapping for
      constellation shaping is shown in Figure VIII–1. The diagram shows the Wei
      16-state 4D channel code but can easily be generalized to use any channel
      code. As described in Section IV, the transmitted 2D constellation is a subset
      of z2+(1/2,1/2) and this constellation is partitioned into four 2D subsets A, B,
      C, and D. The Wei encoder takes in 3 bits every 4D symbol and generates
      4 output bits per 4D symbol. The first and second pairs of output bits specify
      the pair of 2D symbols that form the 4D subset selected by the Wei encoder.
               K Bits per                                      Blocks of N
            Shaping Block of                                  Ring Indices
             N 2D Symbols                                    (r1, r2, . . . , rN)
                                  Shell Mapping Ring
                                  Sequence Algorithm
        L   + 4M 2 + M 2 )
                      u          u 2                                                             (VIII–1)
      The ring closest to the origin should contain the 2u+2 constellation points of
      least energy. The next ring should contain the next 2u+2 points in order of
      energy, etc.
           2K    vM   N
                          or 2 KńN         vM                                                       (VIII–2)
           [2
                   ) )
                N(u 2) K
                           ]
                               ń
                               1 N
                                     +2 ) ) ń
                                           u 2 K N                                                  (VIII–4)
           CER s    + M 2 ) ń2 ) ) ń + M 2
                                     u 2   u 2 K N     ń
                                                     –K N                                           (VIII–5)
M v 1.5 2 ń K N (VIII–6)
Combining (VIII–2) and (VIII–6), we see that M must be limited to the range
          8.    G.D. Forney and L-F. Wei, “Multidimensional Constellations, Part I,” IEEE J. SAC,
                Vol. 7, No. 6, August 1989, p. 887.
A-4       SPRA159
                                                                Constellation Shaping by Shell Mapping
       L   +M 2 ) +M 2
                    u 2           n–p                                                      (VIII–10)
The number of uncoded bits must be positive, so from (VIII–9) we find that
       u +n*2*pw0
       or 0 v p v n–2                                                                       (VIII–11)
and n w 2
                                            v
      M to increase and results in greater complexity. For a block size of N=8,
      Motorola recommends using p        3.
B. Weights of Ring Blocks, the Basic Concept of Shell Mapping, and Some
   Data Tables Required for Efficient Implementations
      Suppose that the rings are labelled from 0 to M–1 with ring 0 being closest
      to and ring M–1 furthest from the origin. Let a block of ring indices be denoted
      by
       r   + [r , r , @@@ , r ]
                1   2        N
                                                                                           (VIII–12)
          with r iŮ               AAA
                   {0, 1, , M–1} for i =1, ... , N. Each ring must be assigned an integer
          weight W1 (ri ) where the subscript 1 indicates that the argument is a
          one-dimensional vector. For example, assuming that the constellation point
          coordinates are integers, the ring weight could be the average squared
          distance of points in the ring from the origin. Motorola7 suggests using the
          weight function w1(i) = i for i = 0, ... , M–1. Justification is given in
          Appendix VIII–A. The ring weights must form a nondecreasing sequence,
          that is,
           w N( r )   +   ȍ+
                          N
                          i 1
                                w 1(r i)                                                      (VIII–14)
          The basic idea behind shell mapping is that the MN possible ring blocks are
          arranged in an ordered list with lower weight blocks appearing closer to the
          beginning of the list. There may be many blocks with the same weight and
          these are called a shell. We will see how to lexicographically order the blocks
          in a shell. The block at the beginning of the list will be labelled or indexed by
          0 and the one at the end by MN – 1. The 2K blocks closest to the beginning
          of the list which are also the 2K lowest weight blocks are used as the shell
          sequences. Encoding is performed by using the binary K-tuples of shaping
          bits as the indexes of the list elements. Decoding is performed, basically, by
          observing that given a ring block, its index is the number of blocks below it
          on the list. We will see how to achieve reasonable table storage
          requirements by a “divide and conquer” approach where the N dimensional
          problem is divided into two N/2 dimensional problems, etc. The problem,
          then, is to efficiently assign shaping K-tuples to ring blocks and vice versa.
                                                                               + NJr AAA , r Nj
          To perform shell mapping, the number of vectors with a given weight must
          be known. Let Mp(j) be the number of p-tuples of ring indices r
                                                 ǒ Ǔ ) AAA ) w ǒr Ǔ. The weight generating
                                                                                    1,    p
                           ȍ
          function is defined to be
          For example, with the Motorola weight function w1(i)=i, the number of
          1-tuples of weight j is M1(j) = 1 for j = 0, ... , M–1 and is 0 otherwise. The
          corresponding generating function is
A-6       SPRA159
                                                                Constellation Shaping by Shell Mapping
 G 1 (z)   +
                k
                    ȍ+
                    M–1
                          0
                              zk    + 11–z–z
                                           M                                               (VIII–16)
M 2p (j)   +   ȍ
               +
                j
               k 0
                         M p(k) M p(j–k)                                                   (VIII–17)
 G 2p (z)      +G         2
                          p   (z)                                                          (VIII–18)
 C N (j)   +        ȍ+
                     j
                    k 0
                              M N (k)                                                      (VIII–19)
          Binary shaping K-tuples can be assigned to ring blocks in many ways. The
          method we will examine is based on a splitting algorithm that successively
          divides blocks into pairs of half the length until blocks of length 1 are
          reached. Therefore, we will assume that the original block length is a power
          of 2, that is, N = 2q.
          At each step, the i-tuples of rings will be ordered according to rules which
          will be given shortly. The decoding, ordering, or indexing function on i-tuples
          will be called Di(·).
          The ordering of 1-tuples is easy. According to (VIII–13), the ring weights
          must form a nondecreasing sequence. Therefore, 1-tuples will be listed in
          numerical order. That is, the one-dimensional decoding function is
          This is the starting point for building higher dimensional decoding functions.
          As a matter of notation, let an i-tuple of ring indexes be
           r [i ]    + ƪr , ..., r ƫ
                           1          i                                                   (VIII–22)
           r
               [i]
               1
                       + [r , AAA ,
                               1          r iń2] and r
                                                         [i]
                                                         2
                                                               + [r ń ) , AAA , r ]
                                                                     i 2 1       i        (VIII–23)
          We will now see how to define a decoding function for i-tuples in terms of one
          for i/2-tuples. Assume Di/2(·) is known. First, order i-tuples of rings according
          to the following rules for the three possible cases:
A-8       SPRA159
                                                                      Constellation Shaping by Shell Mapping
Case 3: Equal Weight i-tuples. Indexes of 1st Halves are the same
              ǒ Ǔ               ǒ Ǔ
When w i u [i ] = w i v [i ] and u 1 [i ] = v 1 [i ] , then list u [i ] below v [i ]                  if
        [i]
D iń2 (u )
        2
                t      [i]
               D iń2 (v )
                       2 .
     ǒ Ǔ+
  D i v [i]     {number of i-tuples below v [i]}
              + {number of i-tuples u [i] with w i(u [i])            t w (v )}    [i]
                           ƪ ǒ Ǔ ƫ) ǒ Ǔ
              and u [i] below v [i]}
              +C      i     w i v [i ] – 1          N i v [i ]
The quantities Di(·) for i < N do not have to be computed. Also, Ci(·) is needed
only for i=N. Finally, we will see in the next paragraph how to recursively
compute Ni(·) for i = 2, 4, ... , N using Ni/2(·) and Mi/2(·). Then DN(·) can be
computed by (VIII–25) for i=N.
The vectors counted in Ni(v[i]) can be partitioned into the following three
types:
                 NJŤ                                                                                                                 Nj
           Consider the set
                  u [i]     w i (u [i])                + w (v )   i
                                                                            [i]
                                                                                  Ɛ                 [i ]
                                                                                            w iń2 (u )
                                                                                                    1
                                                                                                             t wń   i 2
                                                                                                                             [i]
                                                                                                                           (v )
                                                                                                                             1
                                                                                                                                                                       (VIII–26)
                                       ȍ+
                                    [i]
                            w iń2 (v1 ) –1
k 0
       NJŤ
           Consider the set
           According to (VIII–24), the number of choices for u1[i] is Ni/2(v1[i]). The total
           weight must be wi(v[i]), so the number of choices for u2[i] is Mi/2[wi(v[i]) –
           wi/2(v1[i])]. Thus the number of type 2 vectors is
                 a2     + Nń      i 2
                                               [i]
                                               1
                                                                             [i]
                                             (v ) M iń2 [w i (v [i])–w iń2 (v )]
                                                                             1
                                                                                                                                                                       (VIII–29)
           NJ                                                                                                            ǒ Ǔ t ǒ ǓNj
           Type 3: 1st Halves Identical
           The number of vectors in the set
                                             ǒǓ
           is
                 a3     + Nń      i 2             v
                                                      [i]
                                                      2
                                                                                                                                                                       (VIII–31)
                          ǒǓ                                                                  ǒǓ
           Notice, also, that the weight of v [2i] is
                 w iń2      v
                                 [i]
                                 2
                                             +w             i
                                                                ǒv Ǔ * w ń
                                                                      [i]
                                                                                       i 2      v
                                                                                                    [i]
                                                                                                    1
                                                                                                                                                                       (VIII–32)
A-10      SPRA159
                                                                                      Constellation Shaping by Shell Mapping
                              ȍ+
                                  [i]
                                                                   ƪ ǒ Ǔ* ƫ
                          w iń2 (v1 )–1
            ǒ Ǔ+
                                      ǒ Ǔ ƪ ǒ Ǔƫ ǒ Ǔ
        N i v [i]                            M iń2 ( k ) M iń2 w i v [i]          k
                              k 0                                                                                (VIII–33)
                     )Nń    i 2        v
                                           [i]
                                           1
                                                 M iń2     W iń2       v
                                                                           [i]
                                                                           2
                                                                                 )Nń  i 2   v
                                                                                                [i]
                                                                                                2
        w 1 (r )    +r     + 0, AAA , M * 1
                              for r                                                                              (VIII–34)
                          NJ
       weight is
            v1     [2]
                         + ƪr ƫ   1     and v 2     [2]
                                                           + ƪr ƫ  2
        N2 vǒ Ǔ+
               [2]             ȍ+
                              ǒ Ǔ*1
                           w1 r1
                                           ƪ ǒ Ǔ* ƫ
                                             M 1 ( k ) M 1 w 2 v [2]              k
                                                                                                                 (VIII–36)
                                  ǒ Ǔ ƪ ǒ Ǔ * ǒ Ǔƫ ) ǒ Ǔ
                              k 0
)N 1 r 1 M 1 w 2 v [2] w1 r1 N1 r2
          Since there is a single 1-tuple of each weight, N1(r) = 0, so the last line of
          (VIII–36) is zero and
             N2 v ǒ Ǔ+
                    [2]         ȍ
                                r1
                                 +
                                k 0
                                     *1
                                              ǒǓ
                                          M1 k M1 r1 ǒ ) r * kǓ +
                                                              2
                                                                              ȍ
                                                                              r1
                                                                               +
                                                                              k 0
                                                                                   *1
                                                                                            ǒ ) r * kǓ
                                                                                        M1 r1    2
                                                                                                                 (VIII–37)
                                          NJ
          From Figure VIII–2 it can be seen that this sum is
             N2   ǒƪ ƫǓ +
                     r 1, r 2
                                              r1
                                              M    *1*r   2
                                                              for 0
                                                              for M
                                                                      vr )r vM*1
                                                                      vr )r vr )M*2
                                                                          1
                                                                          1
                                                                                        2
                                                                                        2    1
                                                                                                                 (VIII–38)
         0                           0                        M1(0) = 1
         1                           1                        M1(1) = 1
         2                           2                        M1(2) = 1
D2(r) r
         0                           00                       M2(0) = 1
   ––––––––––––           –––––––––––
         1                           01                       M2(1) = 2
         2                           10
   ––––––––––––           –––––––––––
         3                           02                       M2(2) = 3
         4                           11
         5                           20
   ––––––––––––           –––––––––––
         6                           12                       M2(3) = 2
         7                           21
   ––––––––––––           –––––––––––
         8                           22                       M2(4) = 1
D4(r) r
         0                       0000                         M4(0) = 1
   ––––––––––––           –––––––––––
         1                       0001                         M4(1) = 4                              C4(0) = 1
         2                       0010
         3                       0100
         4                       1000
A-12      SPRA159
                                                         Constellation Shaping by Shell Mapping
––––––––––––   –––––––––––
     5            0002                 M4(2) = 10                        C4(1) = 5
     6            0011
     7            0020
     8            0101
     9            0110
    10            1001
     11           1010
    12            0200
    13            1100
    14            2000
––––––––––––   –––––––––––
    15            0012                 M4(3) = 16                        C4(2) = 15
    16            0021
    17            0102
    18            0111
    19            0120
    20            1002
    21            1011
    22            1020
    23            0201
    24            0210
    25            1101
    26            1110
    27            2001
    28            2010
    29            1200
    30            2100
––––––––––––   –––––––––––
    31            0022                 M4(4) = 19                        C4(3) = 31
    32            0112
    33            0121
    34            1012
    35            1021
    36            0202
    37            0211
    38            0220
    39            1102
    40            1111
        41               1120
        42               2002
        43               2011
        44               2020
        45               1201
        46               1210
        47               2101
        48               2110
        49               2200
   ––––––––––––       –––––––––––
        50               0122            M4(5) = 16   C4(4) = 50
        51               1022
        52               0212
        53               0221
        54                1112
        55               1121
        56               2012
        57               2021
        58               1202
        59               1211
        60               1220
        61               2102
        62                2111
        63               2120
        64               2201
        65               2210
   ––––––––––––       –––––––––––
        66               0222            M4(6) = 10   C4(5) = 66
        67               1122
        68               2022
        69               1212
        70               1221
        71               2112
        72               2121
        73               2202
        74               2211
        75               2220
   ––––––––––––       –––––––––––
        76               1222            M4(7) = 4    C4(6) = 76
A-14      SPRA159
                                                                      Constellation Shaping by Shell Mapping
    77             2122
    78             2212
    79             2221
––––––––––––   –––––––––––
    80             2222                         M4(8) = 1                              C4(7) = 80
––––––––––––   –––––––––––
                                                                                       C4(8) = 81
M1 (r1 + r2 – k)
                r1 + r2 – (M – 1)               0           r1 – 1           r1 + r2
                                                    k
(a) 0 ≤ r1 + r2 ≤ M – 1
M1 (r1 + r2 – k)
                       0    r1 + r2 – (M – 1)               r1 – 1           r1 + r2
                                                    k
(b) M ≤ r1 + r2 ≤ r1 + M – 2
                ǒ Ǔ + C ƪ w ǒv Ǔ * ƫ ) N ǒv Ǔ
            D N v [N]        N   N
                                         [N]
                                               1   N
                                                       [N]                            (VIII–39)
A-16      SPRA159
                                                                                   Constellation Shaping by Shell Mapping
Remember that NN(v[N]) is the number of blocks with the same weight as v[N]
that are below it on the list. Also, MN(wN(v[N])) is the number of blocks with
the weight of v[N]. Therefore,
 0   vN   N
               ǒv Ǔ t M ƪw ǒv Ǔƫ
                [N]
                                 N          N
                                                 [N]                                                          (VIII–40)
Also,
         ƪ ǒ Ǔƫ + ƪ ǒ Ǔ * Ǔ ) ƪ ǒ Ǔƫ
 C N w N v [N]                   C N w N v [N]               1             M N w N v [N]                      (VIII–41)
Thus,
         ƪ ǒ Ǔ ƫ v ǒ Ǔ t ƪ ǒ Ǔƫ
 C N w N v [N] –1                       D N v [N ]               C N w N v [N ]                               (VIII–42)
This shows that the index DN will always fall in an interval bracketed by a pair
of successive CN’s.
Encoding Step 1: Compute the Weight of v[ N]
Based on (VIII–42), the weight of v[ N] is
         ǒ Ǔ+
 w N v [N]
                      max
                       n n       NJŤ     CN (n       * 1) v D           N
                                                                           ǒv ǓNj
                                                                             [N]                              (VIII–43)
         ǒ Ǔ + D ǒv Ǔ * C ƪ w ǒv Ǔ * ƫ
 N N v [N]               N
                                  [N]
                                                N        N
                                                                 [N]
                                                                             1                                (VIII–44)
According to (VIII–33),
 N ǒv
                ȡ
                ȧ ǒ Ǔ*
           Ǔ + ȥȧ ȍ ń
                      w Nń2 v 1[N]      1
                                                                    ƪ ǒ Ǔ   ƫ ȣ
                                                                          * ȧȦȧ
                Ȣ +                                               ń
                                                                                             Ȥ
         [N]
                                            MN          (k ) MN            w N v [N]     k                    (VIII–45)
     N                                              2                  2
             )NJ ń ǒ Ǔ ń ƪ                                    ǒ Ǔ ƫNj
                             k 0
NN 2 v [1N ] M N 2 W Nń 2 v ( N )
             ) NJ ń ǒ ǓNj
                      NN     2    v 2[N]
               Of the weight wN(v[N]) N-tuples, the number with the same weight as v[N] in
                                                               +
                          ƪ ǒ Ǔƫ ƪ ǒ Ǔƫ
               the first half, that is, w Nń2(u [N]
                                                1
                                                   ) w Nń2(v [i]
                                                             1
                                                                ), is
                                          [N]                                       [N ]
                M Nń 2      w Nń 2    v              M Nń 2       w Nń 2        v                                            (VIII–46)
                                          1                                         2
           0   vN ń N 2
                          ǒ Ǔ          ƪ ǒ Ǔƫ ) ǒ Ǔ t ƪ ǒ Ǔƫ ƪ ǒ Ǔƫ
                           v [1N] M Nń2 w Nń2 v [2N]           N Nń2 v [2n]         M Nń2 w Nń2 v 1[N]   M Nń2 w N v 2[N]    (VIII–47)
       NJ                             ƪ ǒ Ǔƫ ) ǒ ǓNj * ƪ ǒ Ǔƫ ƪ ǒ Ǔƫ
               Rearranging this last inequality gives
                  ǒ Ǔ
           N Nń2 v 1[N] M Nń2 w Nń2 v [2N]                 N Nń2 v [2N]         M Nń2 w Nń2 v 1[N]       M Nń2 w N v 2[N]    (VIII–48)
                   ȡ
                + ȧȥ        ǒ Ǔ*
                                           ń   ǒ Ǔ*
                                               ȍ+
                                      w N 2 v [N]
                                                                                      ƪ ǒ Ǔ * ƫȣȧȦȧ
                   ȧȢ
                                                       1
                                              1
                     NN      v [N]                         M Nń2 ( k ) MNń2 wN v [N]                 k
                                                                                                         Ȥ
                           ƪ ǒ Ǔ ƫ ƪ ǒ Ǔƫ t
                                               k 0
                                               NJ Ťȍ
               From the inequality (VIII–48), we see that the weight of the first half of v[N]
                                                                                                                         Nj
               must be
                          ǒ Ǔ + maxn
                W Nń2 v 1[N]                    n
                                                     n–1
                                                      +
                                                     k 0
                                                                     ƪƫ              ƪ ǒ Ǔ * k ƫ v N ǒv Ǔ
                                                            M N ń2 k M N ń2 w N ń2 v [ N ]                    N
                                                                                                                   [N]       (VIII–49)
                          ǒ Ǔ+
                w Nń2 v [2N]           w N v [N]ǒ Ǔ*w ń        N 2
                                                                     ǒ Ǔ
                                                                       v [1N]                                                (VIII–50)
A-18        SPRA159
                                                                                    Constellation Shaping by Shell Mapping
                 ǒ Ǔ ń ƪ ń ǒ Ǔƫ ) ń ǒ Ǔ
Now the following partial offset d can be computed:
                                  ǒ Ǔ*
      +N   N
                ǒv Ǔ *
                 [N]              ȍ+
                           w Nń2 v 1[N]
                                  k 0
                                             1
Encoding Step 5: Compute the Offsets of the 1st and 2nd Halves
Remember that N Nń2 (v 2[N]) is the number of weight w Nń2 (v [2N]) N/2-tuples
below v [2N]. The total number of N/2-tuples with weight w Nń2 (v [2N]) is
M Nń2 [w Nń2 (v [2N])], so
 0   vN ń N 2
                ǒ Ǔ t ń ƪ ń ǒ Ǔƫ
                 v [2N]      MN        2   wN     2     v [2N]                                                 (VIII–52)
Using the same reasoning, we see that the following inequality must also be
true:
 0   vN ń N 2
                ǒ Ǔ t ń ƪ ń ǒ Ǔƫ
                 v [1N]      MN        2   wN     2     v [1N]                                                 (VIII–53)
Using the Euclidean division algorithm, N Nń2 (v [2N]) and N Nń2 (v 1[N]) can be
found by dividing d by M Nń2 [w Nń2 (v 2[N])]. N Nń2 (v 1[N]) is the remainder and
N Nń2 (v [2N]) is the quotient.
          ǒǓ                                       ƪ ǒ ǓƫȣȦȤ
Thus, to complete step 5, compute the following:
                                  ȡȥ
                       + int dńM ń
                                   Ȣ
               [N]                                                         [N]
 N Nń 2    v                                     N 2     w Nń 2       v                                        (VIII–54)
               1                                                           2
          ǒǓ                                      ƪ ǒ Ǔƫ ǒ Ǔ
and
 N Nń 2    v
               [N]
               2
                       + d*M ń             N 2        w Nń 2     v
                                                                     [N]
                                                                     2
                                                                                 N N ń2    v
                                                                                               [N]
                                                                                               1
                                                                                                               (VIII–55)
                      NJ
          can be found from (VIII–38). Let v = [r1,r2] so w2(v) = r1 + r2. Then from the
          first part of Example 1, it follows that
                 +    N 2 (v )                                                   for w 2 (v )vM*1
            r1        W 2 (v )            )N       2   (v )   * (M *          1) for M 1   * t W (v )      2
                                                                                                                      (VIII–56)
r2 +w 2 (v ) *r 1
                 ǒ Ǔ+                                  ǒ Ǔ+                                ǒ Ǔ+
          Thus
          w 2 r [4]
                1
                                    2 and w 2 r [4]
                                                1
                                                                               ǒ Ǔ*
                                                                           w 4 r [4]   w 2 r [4]
                                                                                             1
                                                                                                           2   *2+0
          Step 4
          The partial offset is d = 8 – 7 = 1
A-20      SPRA159
                                                                                            Constellation Shaping by Shell Mapping
Step 5
Find Offsets of 1st and 2nd Halves
     ǒ ǒ ǓǓ +
M2 w2 r2            [4]
                                M 2 (0)        +1
N2 r1ǒ Ǔ + NJ d ń M ǒ w ǒ r ǓǓ +
          [4]
                          int         2        2        2
                                                             [4]
                                                                                  1
N2   ǒr Ǔ + d * M ǒw ǒr ǓǓ N ǒr Ǔ +
      2
          [4]
                                  2       2        2
                                                       [4]
                                                                    2         1
                                                                                  [4]
                                                                                            1   *1     1   +0
Using (VIII–56) we find that
r1   +N         2
                    ǒr Ǔ +
                     1
                          [4]
                                 1,       r2   +w            2
                                                                   ǒr Ǔ * r
                                                                    1
                                                                        [4]
                                                                                        1   +2*1+1
r3   +N         2
                    ǒr Ǔ +
                     2
                          [4]
                                 0,       r4   +w            2
                                                                   ǒr Ǔ * r
                                                                    2
                                                                        [4]
                                                                                        3   +0*0+0
So the encoded ring block is r[4] = [ 1100 ]
Appendix VIII-A
          Justification for using the Motorola weight function is given in this appendix.
          Suppose that the rings are formed from M concentric circles as shown in
          Figure VIII–3.
                                                                  RM – 1   RM = R
                                                        R1
                                                             R2
M–1
A0 + pR 21 + pR ń M 2 (VIII–57)
so
             2
            R1   +R ńM 2                                                                (VIII–58)
A-22      SPRA159
                                                     Constellation Shaping by Shell Mapping
 Ak   + pR 2K ) 1 + (k ) 1) A + (k ) 1) p R 21
                                   0
                                                                                (VIII–59)
so
      ) 1 + (k ) 1) R 1 + (k ) 1) R ń M
  2                   2                     2                                   (VIII–60)
 RK
           Pk    +   ŕ
                     R k) 1
                     Rk
                              r 2 dA
                                  Ak
                                       +    ŕ ǒ
                                           R k) 1
                                           Rk
                                                    r2
                                                         p Rk
                                                               2
                                                                   2pr dr
                                                                   )1
                                                                              2
                                                                           – Rk    Ǔ   (VIII–61)
Rk ) 1 ) Rk
                       ǒ                        Ǔ
                      )
                           4               4               2                   2
                          Rk
                        1 – Rk
                 + 2            +      2
                  2 Rk ) 1 – Rk
                              2
            Pk   + RM (k ) 0.5)
                       2
                                                                                       (VIII–62)
                                                ǒ                          Ǔ
          The average power in a ring block r= [r1,r2,...,rN ] is
                     +ȍ                +                   )ȍ
                           N                                    N
            P(r )               P rk       R2       0.5N              rk               (VIII–63)
                      +   k 1
                                           M
                                                            +  k 1
          Thus, any ring sequence with the same sum has the same average power.
          This justifies using the sum of the ring values as a weight function.
A-24      SPRA159
                                            Appendix B.
       Selected excerpts from Dr. Steven A. Tretter’s “Fundamentals of Trellis Shaping and
       Precoding” on the subject of obtaining the inputs to the convolutional encoder. Used
       with permission.
A.30   A Method for Determining the Binary Subset Label From the
       Coordinates of a 2D Point
       The combined precoding and trellis coding scheme for V.34 will be
       presented in Chapter X. Finding the input to the trellis encoder involves
       finding the subset binary labels for 2D constellation points. A simple method
       for finding these labels will now be presented.
       Let (x,y) be a 2D constellation point. First translate this point to a lattice point
       by forming
According to (IX–8)
       The sum of the components of any point in 2RZ2 is some even number 2n.
       Thus,
               x0   ) y + 2n ) 2ƪJ ę ǒJ @ J Ǔƫ ) 2J ) J
                          0                  2       0       1   1     0                  (IX–11)
                     J0   + modǒx ) y , 2 Ǔ
                                        0        0                                        (IX–13)
J0 = 0 J0 = 1
J1 = 0 J1 = 1 J1 = 0 J1 = 1
J2 = 0 J2 = 1 J2 = 0 J2 = 1 J2 = 0 J2 = 1 J2 = 0 J2 = 1
 a                     A                c                                C                   b                      B           d                     D
 2 RZ2        2 RZ2 + (2, 0)         2 RZ2 + (1, 1)           2 RZ2 + (3, 1)            2 RZ2 + (2, 1)      2 RZ2 + (0, 1)   2 RZ2 + (1, 2)    2 RZ2 + (3, 2)
                            + 2n ) 2ƪ J ę ǒJ                                       Ǔƫ ) J
            Therefore, it follows from (IX–10) that
x0 1 2 0 ·J 1 1 (IX–14)
                      J1   + modǒx , 2 Ǔ + lsb of x
                                            0                                           0                                                     (IX–15)
            Once J0 and J1 have been determined, their effects can be subtracted out
            to form
                 ǒx , y Ǔ + ǒx , y Ǔ – J (1,2 1) – J (0, 1)
                      1      1
                                             0        0              1                         0
                                     ) ƪJ ę ǒJ                     Ǔƫ (2, 0) + RZ ) ƪJ ę ǒJ ·J Ǔƫ (1, 0)
                                                                                                                                              (IX–16)
                 Ů
                          2RZ 2             2            0    ·J 1
                                                                                                       2
                                2                              2     0  1
The sum of the coordinates of any point in RZ 2 must be some even number
                       ) y + 2n ) ƪJ ę ǒJ                                           Ǔƫ
            2n3. Therefore,
x1 1 3 2 0 ·J 1 (IX–17)
so
B-2         SPRA159
  ƪ ę ǒ Ǔƫ + ǒ
      J2      J 0 ·J 1       mod x 1   ) y , 2Ǔ
                                            1                                   (IX–18)
and
  J2       + modǒx ) y , 2 Ǔ ę ǒJ
                         1   1          0   ·J 1Ǔ                               (IX–19)