Tarver 2019
Abstract—Digital predistortion is the process of using digital signal processing to correct nonlinearities caused by the analog RF front-end of a wireless transmitter. These nonlinearities contribute to adjacent channel leakage, degrade the error vector magnitude of transmitted signals, and often force the transmitter to reduce its transmission power into a more linear but less power-efficient region of the device. Most predistortion techniques are based on polynomial models with an indirect learning architecture, which has been shown to be overly sensitive to noise. In this work, we use neural network based predistortion with a novel neural network training method that avoids the indirect learning architecture and that shows significant improvements in both the adjacent channel leakage ratio and error vector magnitude. Moreover, we show that, by using a neural network based predistorter, we are able to achieve a 42% reduction in latency and a 9.6% increase in throughput on an FPGA accelerator with 15% fewer multiplications per sample when compared to a similarly performing memory-polynomial implementation.

Index Terms—Digital predistortion, neural networks, FPGA.

I. INTRODUCTION

Efficiently correcting nonlinearities in power amplifiers (PAs) through digital predistortion (DPD) is critical for enabling next-generation mobile broadband, where there may be multiple radio frequency (RF) transmit (TX) chains arranged to form a massive multiple-input multiple-output (MIMO) system [1], as well as new waveforms with bandwidths on the order of 100 MHz in the case of mmWave communications [2]. Traditional DPDs use variations of the Volterra series [3], such as memory polynomials [4, 5]. These models consist of sums of various-order polynomials and finite impulse response (FIR) filters to model the nonlinearities and the memory effects in a PA, respectively.

To learn the values of the parameters in a polynomial-based model, an indirect learning architecture (ILA) is typically used in conjunction with some variation of a least-squares (LS) fit of the data to the model [5]. In an ILA, a postinverse model of the predistorter is fitted based on the output of the PA [6, 7]. After learning the postinverter, the coefficients are copied to the predistorter. Although this simplifies the learning of DPD coefficients, it has been shown to converge to a biased solution due to noise in the PA output [8, 9]. Moreover, the LS problem is often poorly conditioned [4]. In [10], a mobile graphics processing unit (GPU) was used to implement the polynomial DPD with I/Q imbalance correction from [4]. This GPU implementation used floating point and was able to avoid the challenges associated with the dynamic range requirements for memory polynomials. When implemented on an FPGA, a memory polynomial can be challenging due to the bit-widths that are necessary to perform the high-order exponentiation in fixed-point precision [11].

The overall DPD challenge has strong similarities to the problems encountered in in-band full-duplex (IBFD) communications [12–14], where a transceiver simultaneously transmits and receives on the same frequency, increasing the spectral efficiency of the communication system. However, this requires (among other techniques) digitally removing the significant self-interference from the received signal, which consists not only of the intended transmission but also of the nonlinearities added by the imperfections in the transmit chain, including the PA. In [15], the author used neural networks (NNs) to perform the self-interference cancellation and found that they could achieve similar performance to polynomial-based self-interference cancellation. This work was later extended to create both FPGA and ASIC implementations of the NN-based self-interference canceller [16]. It was found that, due to the regular structure of the NN and the lower bit-width requirements, it can be implemented to have both a higher throughput and a lower resource utilization.

Inspired by the full-duplex NN work and the known problems of polynomial-based predistortion with an ILA, we recently proposed in [17] to use NNs for the forward DPD application. NNs are a natural choice for such an application as they are able to approximate any nonlinear function [18], making them a reasonable candidate for predistortion. The idea of using various NNs for predistortion has been explored in many works [19, 20]. However, the training method is unclear in [19], and their implementations require over ten thousand parameters. In [20], the training of the NN is done using an ILA, which can subject the learned predistorter to the same problems seen with all ILAs.

Contribution: In our previous work [17], we avoided the standard ILA and we improved the overall performance by

The work of C. Tarver and J. R. Cavallaro was supported in part by the U.S. NSF under grants ECCS-1408370, CNS-1717218, and CNS-1827940, for the "PAWR Platform POWDER-RENEW: A Platform for Open Wireless Data-driven Experimental Research with Massive MIMO Capabilities." The work of A. Balatsoukas-Stimming was supported by the Swiss NSF project PZ00P2 179686.
  Authorized licensed use limited to: CMU Libraries - library.cmich.edu. Downloaded on July 08,2020 at 03:53:10 UTC from IEEE Xplore. Restrictions apply.
[Figure 1. Architecture of the NN DPD system. The signal processing is done in the digital baseband and focuses on PA effects. The DAC, up/downconverters, and ADC are not shown in this figure, though their impairments are also captured.]

[Figure 2. General structure of the DPD and PA neural networks. There are two input and output neurons for the real and imaginary parts of the signal, N neurons per hidden layer, and K hidden layers. The inputs are directly added to the output neurons so that the hidden layers concentrate on the nonlinear portion of the signal.]
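The Fig. 2 structure can be made concrete with a short sketch. This is a minimal NumPy illustration of the forward pass, assuming one complex sample at a time; the weight shapes and random values are illustrative only, not the paper's trained model.

```python
import numpy as np

def relu(v):
    # ReLU(x) = max(0, x); in hardware this reduces to a single multiplexer.
    return np.maximum(0.0, v)

def nn_dpd_forward(x, weights, biases):
    """Forward pass of the Fig. 2 predistorter for one complex sample x.

    weights/biases parameterize K hidden layers of N ReLU neurons plus a
    linear output layer with 2 neurons (Re/Im). The linear bypass adds the
    raw input to the output so the hidden layers only model the nonlinearity.
    """
    v = np.array([x.real, x.imag])        # split complex sample onto 2 real neurons
    h = v
    for W, b in zip(weights[:-1], biases[:-1]):
        h = relu(W @ h + b)               # hidden layers with ReLU activation
    out = weights[-1] @ h + biases[-1]    # linear output layer (2 neurons)
    out = out + v                         # linear bypass: input added to the output
    return out[0] + 1j * out[1]

# Toy instantiation (random weights, illustrative only): K = 1, N = 6.
rng = np.random.default_rng(0)
N = 6
weights = [rng.standard_normal((N, 2)) * 0.1, rng.standard_normal((2, N)) * 0.1]
biases = [rng.standard_normal(N) * 0.1, np.zeros(2)]
x_hat = nn_dpd_forward(0.5 + 0.2j, weights, biases)
```

Note how the bypass guarantees that an all-zero network is exactly the identity, so training only has to learn the (small) nonlinear correction.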
using a novel training algorithm where we first modeled the PA with a NN and then backpropagated through it to train a DPD NN. We extend that work here to show that not only do we improve performance when compared to polynomial-based DPD, but we do so with reduced implementation complexity. Furthermore, to realize the gains of the NN DPD, we design a custom FPGA accelerator for the task and compare it to our own polynomial DPD accelerator.

Outline: The rest of the paper is organized as follows. In Section II, we give an overview of our DPD architecture and methods. In Section III, we compare performance/complexity tradeoffs for the DPD NN to polynomial-based predistorters. In Section IV, we compare FPGA implementations for memory polynomial and NN predistortion. Finally, in Section V we conclude the paper.

II. NEURAL NETWORK DPD ALGORITHM OVERVIEW

For the NN DPD system, we seek to place a NN-based predistorter inline with the PA so that the cascade of the two is a linear system, as shown in Fig. 1. However, to train a NN, it is necessary to have training data, and in this scenario the ideal NN output is unknown; only the ideal PA output is known. To overcome this problem, we train a PA NN model to emulate the PA. We then backpropagate the mean squared error (MSE) through the PA NN model to update the parameters in the NN DPD [17].

A. Neural Network Architecture

We use a feed-forward NN that is fully connected with K hidden layers and N neurons per hidden layer. The nonlinear activation applied in the hidden layers is chosen to be a rectified linear unit (ReLU), shown in (1), which can easily be implemented with a single multiplexer in hardware.

ReLU(x) = max(0, x)    (1)

The input and output data of the predistorter are complex-valued, while NNs typically operate on real-valued data. To accommodate this, we split the real and imaginary parts of each time-domain input sample, x(n), onto separate neurons.

Although PA-induced nonlinearities are present in the transmitted signal, the relationship between the input and output data is still mostly linear. Although a NN can, in principle, learn this relationship given training data, this turns out to be difficult in practice [15]. As such, we implement a linear bypass in our NN that directly passes the inputs to the output neurons, where they are added to the output from the final hidden layer, as can be seen in Fig. 2. This way, the NN focuses entirely on the nonlinear portion of the signal.

B. Training

This work primarily focuses on the implementation and running complexity of the DPD application, which consists of inference on a pre-trained NN. The training is assumed to run offline; once the model is learned, significant updates will not be necessary, and occasional offline re-training to account for long-term variations would be sufficient.

In [17], we first use input/output data of the PA to train a NN to model the PA behavior. We then connect a second DPD NN to the PA NN model and treat the combined DPD NN and PA NN as one large NN. However, during the second training phase, we only update the weights corresponding to the DPD NN. We then connect the DPD NN to the real PA and use it to predistort for the actual device.

The process of predistorting can excite a different region of the PA than when predistortion is not used. To account for this, it is not uncommon in other DPD methods to have multiple training iterations. A similar idea is adopted in [17] and in this work. Once training of the PA and the DPD is performed, we retransmit through the actual PA while using the DPD NN. Using the new batch of input/output data, we can then update the PA NN model and in turn refine the DPD NN. An example of the iterative training procedure is shown in Fig. 3, where the MSE training loss is shown for the PA NN model and for the combined DPD-PA over two training iterations.

III. COMPLEXITY COMPARISON

To evaluate the NN-based predistortion, we present the formulation of both a memory polynomial and the NN. We then derive expressions for the number of multiplications as a function of the number of parameters in the models. In most implementations, multiplications are considered to be more expensive, as they typically have higher latency and require more area and power. Additions typically have a minor impact
\[
\hat{x}(n) = \sum_{\substack{p=1,\,m=0 \\ p\ \text{odd}}}^{P,\,M} \alpha_{p,m}\, x(n-m)\,\lvert x(n-m)\rvert^{p-1} \;+\; \sum_{\substack{q=1,\,l=0 \\ q\ \text{odd}}}^{Q,\,L} \beta_{q,l}\, x^{*}(n-l)\,\lvert x^{*}(n-l)\rvert^{q-1} \;+\; c \tag{2}
\]
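As a concrete reference for (2), the model can be evaluated in a few lines. This is a sketch, not the paper's implementation: the dictionary-based coefficient layout and the use of `np.roll` with zeroed wrap-around to realize the delays x(n − m) are our own illustrative choices.

```python
import numpy as np

def memory_polynomial(x, alpha, beta, c=0.0):
    """Evaluate the memory polynomial of (2) on a complex baseband signal x.

    alpha[(p, m)] is the coefficient for odd order p and delay m; beta[(q, l)]
    weights the conjugate branch (the I/Q-imbalance terms); c is the constant.
    """
    x = np.asarray(x, dtype=complex)
    x_hat = np.full(len(x), c, dtype=complex)
    for (p, m), a in alpha.items():
        xd = np.roll(x, m)          # delayed copy x(n - m)
        xd[:m] = 0                  # zero the samples that wrapped around
        x_hat += a * xd * np.abs(xd) ** (p - 1)
    for (q, l), b in beta.items():
        xd = np.roll(x, l)
        xd[:l] = 0
        xc = np.conj(xd)            # conjugate branch x*(n - l)
        x_hat += b * xc * np.abs(xc) ** (q - 1)
    return x_hat

# Example: a 3rd-order memoryless nonlinearity plus a small conjugate term.
x = np.exp(1j * np.linspace(0, np.pi, 8))
alpha = {(1, 0): 1.0, (3, 0): -0.05}
beta = {(1, 0): 0.01}
y = memory_polynomial(x, alpha, beta)
```

With only the linear term α₁,₀ = 1 the model reduces to the identity, which makes the odd-order basis terms easy to sanity-check in isolation.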
[Figure 4. ACLR (dB) vs. number of multiplications for NN DPD (shown with diamonds) with up to K = 2 hidden layers and memory polynomial (shown with circles) with up to M = 4 memory taps. This represents the out-of-band performance of the predistorter. The stars represent design points that we implement in FPGA in the next section.]

[Figure 5. EVM (%) vs. number of real multiplications for NN DPD (shown with diamonds) with up to K = 2 hidden layers and memory polynomial (shown with circles) with up to M = 4 memory taps. This represents the in-band performance of the predistorter. The stars represent design points that we implement in FPGA in the next section.]
for the feedback on the ADC and 16 bits for the DAC. Using their MATLAB API, we test the NN predistorter using a 10 MHz OFDM signal. This signal has random data on 600 subcarriers spaced apart by 15 kHz and is similar to LTE

[Figure 6. PSD (dB) of the transmitted signal; legend: No DPD, P = 9, N = 20.]
domain, ŝ is the corresponding received vector after passing through the PA, and ‖·‖ represents the ℓ2 norm.

In Fig. 5, we see the EVM versus the number of multiplications for each of the predistorters. As the number of multiplications increases, the EVM decreases, as expected. The memoryless polynomial DPD is able to achieve a low EVM for the smallest number of multiplications. However, the complexity is only slightly higher for the NN-based DPD, which is able to achieve an overall better performance than all other examined polynomial DPDs.

3) Spectrum Comparison: The spectra for both the memory polynomial and the NN DPDs are shown in Fig. 6. Here, both predistorters have the same running complexity of 80 multiplications per time-domain input sample. However, the NN is able to provide an additional 2.8 dB of suppression at ±20 MHz.

[Figure 7. General structure of the NN FPGA implementation.]
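The in-band metric above can be computed directly from the ℓ2-norm notation: with s the intended symbol vector and ŝ the received one, a common EVM definition is 100·‖s − ŝ‖/‖s‖. This is a minimal sketch consistent with that notation; whether the paper normalizes by ‖s‖ or by an RMS reference constellation power is an assumption here.

```python
import numpy as np

def evm_percent(s, s_hat):
    """EVM in percent: 100 * ||s - s_hat|| / ||s||, with ||.|| the l2 norm
    over the vector of received constellation points."""
    s = np.asarray(s, dtype=complex)
    s_hat = np.asarray(s_hat, dtype=complex)
    return 100.0 * np.linalg.norm(s - s_hat) / np.linalg.norm(s)

# A perfectly linearized PA gives 0% EVM; a uniform 1% amplitude error gives 1%.
s = np.array([1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j])
```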
IV. FPGA ARCHITECTURE OVERVIEW

In this section, we compare a NN DPD accelerator with a memory polynomial based implementation. We implement both designs in Xilinx System Generator and target the Zynq UltraScale+ RFSoC ZCU1285 evaluation board. For the sake of this architecture comparison, we implement each to be fully parallelized and pipelined so as to compare the highest-throughput implementations of each. Based on the previous analysis, we implement both with 16-bit fixed-point precision throughout.

We synthesize FPGA designs targeting two separate ACLRs. First, we target an ACLR of approximately -31.4 dB. This target is achieved with a NN with N = 6 neurons and K = 1 hidden layer and a 7th-order memoryless polynomial. Second, we target a more aggressive ACLR below -32 dB. This is done with a NN with N = 14 neurons and K = 1 hidden layer. A memory polynomial with M = 2 and P = 11 is also used to achieve this.

[Figure 8. Example structure of a PE for the ith neuron in hidden layer 1.]

to that parameter. These registers output to the corresponding multiplier or adder.

An example neuron PE is shown in Fig. 8. Each PE is implemented with a sufficient number of multipliers for performing the multiplication of the weights by the inputs in parallel. The results from each multiplier are added together along with the bias and passed to the ReLU activation function, which is implemented with a single multiplexer.
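The mux-based ReLU in the PE can be illustrated at the bit level: in two's complement, the sign bit of the accumulator directly drives a 2:1 multiplexer that selects either zero or the accumulator value. This is a behavioral sketch of that idea; the 16-bit word width matches the paper, but the binary-point position in `to_fixed` is an assumption.

```python
def relu_mux(acc, width=16):
    """ReLU as a multiplexer: the MSB (sign bit) of the two's-complement word
    selects 0 for negative values and passes the value through otherwise."""
    mask = (1 << width) - 1
    sign = (acc >> (width - 1)) & 1   # MSB of the two's-complement word
    return 0 if sign else acc & mask

def to_fixed(val, width=16, frac_bits=12):
    """Quantize a real value to a two's-complement fixed-point word
    (hypothetical Q3.12 format, for illustration only)."""
    q = int(round(val * (1 << frac_bits)))
    return q & ((1 << width) - 1)

# Negative inputs are zeroed; non-negative inputs pass through unchanged.
neg = to_fixed(-0.5)   # sign bit set in two's complement
pos = to_fixed(0.5)
```

No comparator or multiplication is needed, which is why the activation is essentially free in hardware relative to the weight multipliers.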
[Figure 9. General structure of the high-throughput, low-latency, memory polynomial FPGA implementation.]

Table I
COMPARISON OF PERFORMANCE AND FPGA UTILIZATION

                        ACLR: -31.4 dB          ACLR: -32 dB
Metric                  N=6, K=1   P=7, M=1     N=14, K=1   P=11, M=2
Num. of Params.         32         8            72          24
LUT                     379        539          688         1424
LUTRAM                  16         120          16          224
FF                      538        991          1170        2730
DSP                     24         27           56          66
Worst Neg. Slack (ns)   8.72       8.68         8.49        8.34
Max. Freq. (MHz)        783        756          661         603
Max. T/P (MS/s)         783        756          661         603
Latency (CC)            12         21           14          26

numerous advantages over the memory polynomial. Specifically, for the target of an ACLR less than -32 dB, the NN requires 48% of the lookup tables (LUTs), 42% of the flip-flops (FFs), and achieves a 15% reduction in the number of digital signal processors (DSPs). In terms of timing, there is a 9.6% increase in throughput with a 46% decrease in latency. These reductions in utilization occur while also seeing improved ACLR.

V. CONCLUSIONS

In this paper, we explored the complexity/performance tradeoffs for a novel, NN-based DPD and found that the NN could outperform memory polynomials, offering overall unrivaled ACLR and EVM performance. Furthermore, we implemented each on an FPGA and found that the regular matrix-multiply structure in the NN-based predistorter led to a lower-latency design with less hardware utilization when compared to a similarly performing polynomial-based DPD.

This work opens up many avenues for future work. It can be extended to also compare performance/complexity tradeoffs for more devices with a wider variety of signals, including different bandwidths and multiple component carriers. It is also possible to include memory cells such as recurrent neural networks (RNNs) in the NN to account for memory effects. The NN is naturally well suited for a GPU implementation, which would be interesting in software-defined radio (SDR) systems. The NN complexity could also be further reduced with pruning, and the accuracy could potentially be improved with retraining after quantization and pruning.

REFERENCES

[1] E. G. Larsson, O. Edfors, F. Tufvesson, and T. L. Marzetta, "Massive MIMO for next generation wireless systems," IEEE Commun. Mag., vol. 52, no. 2, pp. 186–195, Feb. 2014.
[2] W. Roh et al., "Millimeter-wave beamforming as an enabling technology for 5G cellular communications: Theoretical feasibility and prototype results," IEEE Commun. Mag., vol. 52, no. 2, pp. 106–113, Feb. 2014.
[3] A. Zhu, M. Wren, and T. J. Brazil, "An efficient Volterra-based behavioral model for wideband RF power amplifiers," in IEEE MTT-S Int. Microw. Symp. Digest, vol. 2, June 2003, pp. 787–790.
[4] L. Anttila, P. Handel, and M. Valkama, "Joint mitigation of power amplifier and I/Q modulator impairments in broadband direct-conversion transmitters," IEEE Trans. Microw. Theory Techn., vol. 58, no. 4, pp. 730–739, Apr. 2010.
[5] A. Katz, J. Wood, and D. Chokola, "The evolution of PA linearization: From classic feedforward and feedback through analog and digital predistortion," IEEE Microw. Mag., vol. 17, no. 2, pp. 32–40, Feb. 2016.
[6] A. Balatsoukas-Stimming, A. C. M. Austin, P. Belanovic, and A. Burg, "Baseband and RF hardware impairments in full-duplex wireless systems: Experimental characterisation and suppression," EURASIP J. on Wireless Commun. and Networking, vol. 2015, no. 142, 2015.
[7] D. Korpi, L. Anttila, and M. Valkama, "Nonlinear self-interference cancellation in MIMO full-duplex transceivers under crosstalk," EURASIP J. on Wireless Commun. and Networking, vol. 2017, no. 1, p. 24, Feb. 2017.
[8] D. Zhou and V. E. DeBrunner, "Novel adaptive nonlinear predistorters based on the direct learning algorithm," IEEE Trans. Signal Process., vol. 55, no. 1, pp. 120–133, Jan. 2007.
[9] R. N. Braithwaite, "A comparison of indirect learning and closed loop estimators used in digital predistortion of power amplifiers," in IEEE MTT-S Int. Microw. Symp., May 2015, pp. 1–4.
[10] K. Li et al., "Mobile GPU accelerated digital predistortion on a software-defined mobile transmitter," in IEEE Global Conf. on Signal and Inform. Process. (GlobalSIP), Dec. 2015, pp. 756–760.
[11] M. Younes, O. Hammi, A. Kwan, and F. M. Ghannouchi, "An accurate complexity-reduced 'PLUME' model for behavioral modeling and digital predistortion of RF power amplifiers," IEEE Trans. Ind. Electron., vol. 58, no. 4, pp. 1397–1405, Apr. 2011.
[12] M. Jain et al., "Practical, real-time, full duplex wireless," in Proc. Int. Conf. on Mobile Comput. and Netw. ACM, 2011, pp. 301–312.
[13] M. Duarte, C. Dick, and A. Sabharwal, "Experiment-driven characterization of full-duplex wireless systems," IEEE Trans. Wireless Commun., vol. 11, no. 12, pp. 4296–4307, Dec. 2012.
[14] D. Bharadia, E. McMilin, and S. Katti, "Full duplex radios," in ACM SIGCOMM, 2013, pp. 375–386.
[15] A. Balatsoukas-Stimming, "Non-linear digital self-interference cancellation for in-band full-duplex radios using neural networks," in IEEE Int. Workshop on Signal Process. Advances in Wireless Commun. (SPAWC), June 2018, pp. 1–5.
[16] Y. Kurzo, A. Burg, and A. Balatsoukas-Stimming, "Design and implementation of a neural network aided self-interference cancellation scheme for full-duplex radios," in Asilomar Conf. on Signals, Systems, and Comput., Oct. 2018, pp. 589–593.
[17] C. Tarver, L. Jiang, A. Sefidi, and J. Cavallaro, "Neural network DPD via backpropagation through a neural network model of the PA," in Asilomar Conf. on Signals, Systems, and Comput., (to appear).
[18] K. Hornik, "Approximation capabilities of multilayer feedforward networks," Neural Networks, vol. 4, no. 2, pp. 251–257, 1991. [Online]. Available: http://www.sciencedirect.com/science/article/pii/089360809190009T
[19] R. Hongyo, Y. Egashira, T. M. Hone, and K. Yamaguchi, "Deep neural network-based digital predistorter for Doherty power amplifiers," IEEE Microw. and Wireless Compon. Lett., vol. 29, no. 2, pp. 146–148, Feb. 2019.
[20] M. Rawat and F. M. Ghannouchi, "Distributed spatiotemporal neural network for nonlinear dynamic transmitter modeling and adaptive digital predistortion," IEEE Trans. Instrum. Meas., vol. 61, no. 3, pp. 595–608, Mar. 2012.
[21] "RF WebLab." [Online]. Available: http://dpdcompetition.com/rfweblab/