Fir Da

This paper presents a low-complexity pipelined adaptive FIR filter designed using Distributed Arithmetic (DA) architecture, aimed at reducing area and power consumption for signal processing applications. The proposed design utilizes compressor adders and a pipelined structure to achieve a 30% reduction in area and 25% reduction in power compared to existing architectures. The implementation is coded in Verilog HDL and synthesized using SAED 90 nm technology, demonstrating suitability for applications such as adaptive decision feedback equalizers and hearing aids.


Microprocessors and Microsystems 93 (2022) 104577

Microprocessors and Microsystems



Low power low area VLSI implementation of adaptive FIR filter using DA for
decision feed back equalizer
K. Vijetha ∗, Rajendra Naik B.
University College of Engineering Osmania University, Hyderabad, Telangana, India

ARTICLE INFO

Keywords: Pipelined; LMS adaptive FIR filter; Distributed Arithmetic; Compressor adder

ABSTRACT

In this paper, a low-complexity pipelined adaptive FIR filter is designed using a Distributed Arithmetic (DA) architecture for signal processing applications. Adaptive filters generally occupy more area and consume more power because they use memories for partial product (PP) generation. To overcome this, we use pipelining to reduce the number of registers in the filter and, to reduce the area further, compressor adders replace conventional adders in the adaptive filter architecture. Together, these two techniques reduce the area and power consumption of the adaptive filter. The proposed design is coded in Verilog HDL and synthesized with the Synopsys Design Compiler tool in SAED 90 nm technology to obtain the area, power, minimum sampling period, maximum sampling frequency, area delay product (ADP), and power delay product (PDP). The proposed adaptive filter makes higher order filters easier to design and implement, and its complexity is lower than that of existing designs. In the synthesis results, the proposed design occupies 30% less area than the existing architecture based on two memories, and consumes 25% less power than block-based adaptive filters. Its ADP and PDP are also lower than those of existing architectures. The proposed design is well suited to signal processing applications such as adaptive decision feedback equalizers for removing signal noise and inter-symbol interference, hearing aids, ECG signal analysis, and software defined radio.

1. Introduction

In signal processing applications such as echo cancellation [1], software defined radio, and digital communications, an adaptive filter is essential because only a limited amount of a priori knowledge of the signal properties is available. Fig. 1 depicts the fundamental schematic of an adaptive filter. It consists of multipliers, delay elements, and adders. Of these, the multipliers account for most of the area and power consumption. The linear combiner, which forms the filter output y, sits at the centre of the architecture. The adaptive filter changes its coefficients to bring them closer to their ideal values. The filter coefficients can be updated with a variety of adaptive techniques [2]. Adaptive filters can be of FIR or IIR type. Because of their inherent stability and ease of computation, adaptive FIR filters are preferred over IIR filters. Several adaptation algorithms exist; the three main ones are the least mean square (LMS), recursive, and normalized adaptive designs. Because it adjusts the filter coefficients quickly, the least mean square method is a better choice. Multipliers, shifters, and adders are commonly used in any filter. Multipliers take up more space and use more energy.

Multiplier architectures are replaced by multiplier-less architectures in adaptive FIR filters to save area and power [3]. A number of multiplier-less adaptive FIR filter topologies are available to increase speed and reduce power. The Distributed Arithmetic model [4,5] is a well-known multiplier-free design for more efficient adaptive filter implementation. Croisier [6] was the first to propose the DA concept, and it was further developed by Peled [7]. Zohar [8] worked out its mathematical treatment. White [9] proposed the formulation of basic DA. NagaJyothi [10] discussed various types of DA-based FIR filters. Allred et al. suggested two adaptive DA-based FIR filters based on LUTs [11]. Guo suggested an adaptive FIR filter using DA [12]; however, it is only appropriate for lower order FIR filters. Jyothi presented a DA-based adaptive FIR filter without memory [13]. A traditional multiplier consists of two major steps. The key topics of that study were multiplier-less architectures, their implications, and solutions for better adaptive FIR filter topologies [14]. A tutorial survey on DA has been proposed by GN

∗ Corresponding author.
E-mail address: vijethakura@gmail.com (K. Vijetha).

https://doi.org/10.1016/j.micpro.2022.104577
Received 5 November 2021; Received in revised form 26 April 2022; Accepted 3 June 2022
Available online 11 June 2022
0141-9331/© 2022 Elsevier B.V. All rights reserved.
K. Vijetha and Rajendra Naik B. Microprocessors and Microsystems 93 (2022) 104577

Fig. 1. Basic adaptive filter.

Fig. 2. DA based 4-tap adaptive filter.

Jyothi, which is useful for DSP applications. It discusses many multiplier-less LUT-based and LUT-less DA architectures, and explains why a multiplier-less architecture is more useful than a multiplier-based one.

Decision feedback equalizers, hearing aids, channel equalizers, noise and echo cancellation, and beam shaping are all examples of signal processing applications that use adaptive filters [15].

2. Literature survey

Allred et al. [16,17] developed an LMS adaptive filter design using DA. The authors used two separate LUTs, one for the filter output and one for the coefficient update. The architecture is made up of three sections. The DA filter is the first module, and it performs the filtering operation on the current data set and the LUT's existing weights.

Guo and DeBrunner [18,19] proposed a useful approach that stores the sums of delayed and scaled signals in the LUT and uses the coefficients as addresses. With this technique, the updating of the LUT and the processing of the coefficients can be done at the same time. They also devised a method for storing data in the LUT in offset binary coding (OBC) format rather than binary.

Meher et al. [20] developed a DA-based adaptive filter in which carry-save accumulation replaces the adder-shift accumulation of binary DA. It uses two separate clock periods: the CSA runs on one clock and all other circuits in the architecture run on a second clock. The use of two separate clocks reduces power consumption. Compared with a traditional DA-based design, this architecture halves the number of adders. It also allows the LUT to be updated in parallel, and filtering and coefficient update to run in parallel. Baghel and Shaik [21] developed a DA-based solution for implementing a block LMS adaptive filter on an FPGA.

For the block LMS (BLMS) adaptive filter, Mohanty et al. [22] developed a DA-based structure that uses a set of LUTs, input samples, and coefficient updates. They developed a new LUT update mechanism that does not move the LUT content: left-shifting the weight vectors over an array of LUTs achieves the column-wise right shift of content from one LUT to the next [22]. This method saves both time and power, but LUT optimization is impossible with this strategy. Mohanty et al. also offer a LUT-optimized design for the DA-based BLMS adaptive filter.

Tang et al. described an FIR filter that uses a Booth-encoding multiplier and a Wallace tree adder to decrease area and power. Chen and Chiueh discussed a CSD-based reconfigurable FIR filter whose coefficients can be changed during simulation. Other multiplier-less conversion-based methods are common sub-expression elimination (CSE), the programmable shift method, and the constant shift method. In these methods the filter coefficient inputs are expressed in binary form. However, they are useful only for lower order filters [23].


Fig. 3. Inner product block.

Fig. 4. Partial Product outputs of DA tables.

Prakash [24] proposed a new OBC DA-based FIR filter technique. There are two LUTs in the design. The offset binary input samples are kept in one LUT, while the modified filter coefficients are stored in the other. With two LUTs, the input samples and filter coefficients can be updated in parallel. As a result, the design has a high throughput. Jiang et al. [25] described an energy-efficient FIR adaptive filter based on an approximate DA design. To decrease the number of partial products (PP) in the filter, a radix-8 Booth multiplier is used. To speed up the design, a Wallace tree adder is employed instead of a regular adder.


Fig. 5. Weight increment block.

Fig. 6. Proposed DA table for partial product generation.

In signal processing applications where only a rudimentary understanding of the signal properties is available, such as noise cancellation and channel equalization, an adaptive filter is needed.

Berberidis et al. [26] developed a new block adaptive DFE, which is mathematically identical to the traditional LMS-based sample-by-sample DFE but requires significantly less computational effort. Parhi


Fig. 7. Proposed compressor adder.

[27] developed an adaptive DFE based on a multiplexer loop, where look-ahead was again used to improve performance. Lin et al. [28] offer a blind ADFE that uses parallel equalization blocks for both the FFF and the FBF and consumes a lot of power. In [29,30], the authors described DFEs with time-frequency equalization using MSK signal analysis.

In this paper, we propose a novel pipelined DA-based adaptive FIR filter for an adaptive DFE to reduce area and power consumption. To reduce the area further, we propose a compressor adder in place of the existing adder.

The paper is organized as follows. Section 2 presents the literature survey on adaptive filters. Section 3 covers the mathematical formulation of an adaptive FIR filter using Distributed Arithmetic and describes the existing adaptive filter design. Section 4 describes the proposed adaptive FIR filter. Section 5 discusses the synthesis results, including area, power, ADP, and PDP. Section 6 applies the proposed filter to a DFE. Finally, Section 7 concludes the paper.

3. Adaptive filter algorithm

From Fig. 1, in every clock cycle the adaptive filter updates its coefficients and computes the filter output and the error value. The error value is then used in the coefficient update.

The input data vector X(k) is

\[ X(k) = [x(k), \ldots, x(k-K+1)]^{T} \tag{1} \]

where T denotes the transpose. The output signal Y(k) is

\[ Y(k) = h(k) X^{T}(k) \tag{2} \]

where h(k) is the filter coefficient vector. The weight update of the LMS algorithm is

\[ h(k+1) = \mu e(k) X(k) + h(k) \tag{3} \]

Eq. (3) gives the increment of the filter coefficients, where $\mu$ is the step size and e(k) is the error signal:

\[ e(k) = D(k) - Y(k) \tag{4} \]

where D(k) is the desired signal and X(k) is the input signal.

The inner product equation is

\[ y = \sum_{k=1}^{K} h_k x_k \tag{5} \]

where $h_k$ is a fixed coefficient and $x_k$ is the input signal with K input words. The 2's complement form of $x_k$ is

\[ x_k = -b_{k0} + \sum_{l=1}^{L-1} b_{kl} 2^{-l} \tag{6} \]

where $b_{k0}, \ldots, b_{k(L-1)}$ are the bits of $x_k$. Substituting Eq. (6) into Eq. (5) gives

\[ y = \sum_{l=1}^{L-1} \left[ \sum_{k=1}^{K} h_k b_{kl} \right] 2^{-l} + \sum_{k=1}^{K} h_k (-b_{k0}) \tag{7} \]

Eq. (7) is the final DA equation.

3.1. Existing design

Because a straightforward DA implementation of the inner product needs a much larger LUT, a higher order adaptive filter must be partitioned into several smaller ones. As a result, a 4-tap DA-based architecture with four and sixteen bit LMS adaptive filters is used, as shown in Fig. 2. It contains a DA block, a weight increment block, and an adder shift unit.

3.2. 4-Point inner product block

Fig. 3 shows the 4-point inner product block. It contains a DA table, a 16:1 multiplexer, and a conditional carry-save accumulator. The DA table, which is made up of an array of 15 registers, stores the partial inner products y. The register contents are selected using the 16:1 multiplexer (MUX). The weight vector $A = h_{3l} h_{2l} h_{1l} h_{0l}$ controls the MUX, and the MUX output is sent to the conditional carry-save accumulator (CSA). After every L bit cycles, the CSA has shifted and accumulated all partial inner products obtained from the MUX and produces a sum word and a carry word, each (L + 2) bits long. To generate the filter output, the sum word from the CSA is shifted and then added to the carry word, with an input carry of "1". The error signal e is obtained by subtracting the filter output from the target signal w(n). All bits of the error signal except the most significant bit are evaluated, and multiplication by the error is carried out as a right shift. The number of positions to shift is determined by the magnitude of the error, which depends on the number of leading zeros present. The control word t for the barrel shifter is derived from this error. The convergence factor is commonly taken as O(1/N); in the current DA design it is assumed to be 1/N. Alternatively, $2^{-i}/N$ can be used, where i is a small integer. In that case, to reduce hardware complexity, the number of shifts t and the input to the barrel shifters are both increased by i places.

Fig. 4 illustrates the partial product DA table for N = 4. It has seven parallel adders, which compute seven new values in advance in each clock cycle. This helps reduce the number of clock cycles needed to compute the sums of input samples. It has only 15 registers to hold the pre-computed sums of partial products of the input words. It takes only four clock cycles to compute all 15 products for an input data length of 8 bits.

3.3. DA table

Fig. 4 shows a DA table with 16 delay elements that generates the partial results for the adaptive filter.
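As a software illustration of the DA decomposition in Eq. (7), the sketch below precomputes a table of coefficient sums and then evaluates the inner product one bit slice at a time, using only table reads, shifts, and additions. It is not the hardware described in this paper: the function names, the integer scaling of each input by 2^(L-1), and the example values are assumptions made for illustration.

```python
def to_bits(n, L):
    """Two's complement bits of integer n, sign bit b_0 first, L bits in total."""
    u = n & ((1 << L) - 1)
    return [(u >> (L - 1 - l)) & 1 for l in range(L)]


def build_lut(h):
    """lut[a] = sum of h[k] over every bit position k that is set in address a."""
    K = len(h)
    return [sum(h[k] for k in range(K) if (a >> k) & 1) for a in range(1 << K)]


def da_inner_product(h, x, L):
    """Evaluate sum_k h[k] * x[k] with table reads, shifts and adds only.

    Each x[k] is an integer in [-2**(L-1), 2**(L-1)), i.e. the fractional
    value of Eq. (6) scaled by 2**(L-1) (an assumption of this sketch).
    """
    lut = build_lut(h)
    bits = [to_bits(n, L) for n in x]
    acc = 0
    for l in range(L):
        # Address formed from bit slice l of all K inputs.
        addr = sum(bits[k][l] << k for k in range(len(x)))
        term = lut[addr] << (L - 1 - l)      # weight 2**(-l), scaled by 2**(L-1)
        acc += -term if l == 0 else term     # the sign-bit slice is subtracted
    return acc


h = [3, -1, 4, 2]
x = [5, -8, 7, -3]
print(da_inner_product(h, x, L=4), sum(hk * xk for hk, xk in zip(h, x)))
```

Both printed values should agree, since the bit-slice sum is only a regrouping of the direct inner product. For K taps the table has 2^K entries, which is the reason the text partitions higher order filters into 4-tap blocks.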


Fig. 8. 16-tap proposed filter.

Fig. 9. Power report.

3.4. Carry save accumulator

The inner products can be computed in L clock cycles of shift accumulation, following LUT-read operations that correspond to the L bit slices $d_{kl}$ for $0 \le l \le L-1$. The shift accumulation is done with a CSA because it lies on the critical path.

4. Proposed compressor adder-based pipelined DA-based adaptive FIR filter

The inner products of a basic DA-based adaptive FIR filter are calculated using a DA table built from registers and adders. Fig. 3 shows the inner product computed with the DA table. The register has an input of length L and an output of length Y(k). To get the value x(k), x(k) is sent


Fig. 10. Area report.

The input signal X(k) is passed through three D flip-flops (D-FFs), which yield the delayed signals X(k − 1), X(k − 2), and X(k − 3). These signals are passed to the weight increment block (WIB) shown in Fig. 5. The WIB consists of 4 barrel shifters (BS), 4 adder/subtractors, 4 registers, and a word-parallel bit-serial converter. Each BS shifts its input according to the control signal t, which shifts the data by one or more places depending on the condition.

The BS output is added to or subtracted from the filter coefficient value. The sign bit of the error controls the adder/subtractor block: if the sign bit is '0', the BS output is added to the current filter coefficient, and if the sign bit is '1', the BS output is subtracted from it. The resulting register values are sent to the word-parallel bit-serial converter to obtain the final output. This output acts as the selection line for the DA table in the 4-point inner product block.
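The add-or-subtract behaviour of the weight increment block can be sketched in software as follows. This is a minimal model under stated assumptions, not the authors' Verilog: the shift count t is taken to be the number of leading zeros of the error magnitude in a (width − 1)-bit field, the error is assumed to fit in that field, a zero error leaves the weights unchanged, and the names `leading_zeros` and `weight_increment` are hypothetical.

```python
def leading_zeros(magnitude, width):
    """Leading zeros of a non-negative integer in a `width`-bit field."""
    return width - magnitude.bit_length()


def weight_increment(weights, samples, error, width=8):
    """Shift-and-add weight update: w <- w +/- (x >> t).

    The sign bit of the error selects add (0) or subtract (1); t grows as
    the error magnitude shrinks, so small errors cause small corrections.
    Assumes abs(error) < 2**(width - 1).
    """
    if error == 0:
        return list(weights), None           # nothing to correct
    sign_bit = 1 if error < 0 else 0
    t = leading_zeros(abs(error), width - 1)
    updated = []
    for w, x in zip(weights, samples):
        shifted = x >> t                     # arithmetic right shift, as in a barrel shifter
        updated.append(w - shifted if sign_bit else w + shifted)
    return updated, t


print(weight_increment([10, 20], [64, -32], error=32))
print(weight_increment([10, 20], [64, -32], error=-32))
```

Replacing the multiplication by the error with a shift is what keeps the update multiplier-free; the price is that the effective step size is quantized to powers of two.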
This process is repeated until the error is nullified, that is, until the adaptive filter has removed the noise. The outputs are added with compressor adders, which reduces the number of gates needed for each adder and therefore the area of the full adders. Similarly, higher order filters are subdivided into smaller order filters so that large memory usage is avoided.
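The compressor adder mentioned here is described in Section 4.1 as a 3:2 compressor built from two XOR gates and a 2:1 multiplexer. The sketch below models that gate structure bit by bit and checks it against ordinary addition. The multiplexer polarity (select = 1 passes c, select = 0 passes a) is an assumption, since the text does not state it; it is the polarity that makes the circuit behave as a full adder.

```python
from itertools import product


def mux2(sel, in0, in1):
    """2:1 multiplexer: returns in1 when sel is 1, otherwise in0."""
    return in1 if sel else in0


def compressor_3_2(a, b, c):
    """3:2 compressor built from two XOR gates and one 2:1 multiplexer."""
    x1 = a ^ b                 # first XOR gate
    s = x1 ^ c                 # second XOR gate gives the sum bit
    carry = mux2(x1, a, c)     # first XOR output selects between a and c
    return s, carry


# Exhaustive check: the two output bits always encode a + b + c.
for a, b, c in product((0, 1), repeat=3):
    s, carry = compressor_3_2(a, b, c)
    print(a, b, c, "->", carry, s)
```

When a and b differ, the carry equals c; when they are equal, the carry equals a. That is why a single multiplexer can replace the AND/OR carry logic of a conventional full adder.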

Fig. 11. Layout.

to yet another register to get x(k-1). To construct x(k-1)+x(k+1), the inputs x(k) and x(k+1) are added and passed through the register x(k). As a result, the registers and adders produce outputs such as x(k-2)+x(k), x(k), x(k-2)+x(k-1), ..., which are delivered to a 16:1 multiplexer with the filter coefficients as selection lines. To save area, we created a pipelined DA table, as shown in Fig. 6.

Rather than supplying the x(k+1) input samples to the adders and registers every time, the proposed pipelined DA table reuses earlier samples read back from the registers as adder inputs to provide the same DA table function. The existing design has 15 registers and 7 adders in total, which requires more hardware; the proposed pipelined DA table reduces the number of registers at the cost of four more adders. Registers take up more area than adders. In contrast to the 15 registers illustrated in Fig. 4, the proposed DA table contains just four registers.

4.1. Proposed compressor adder

The size of DA-based adaptive filters can be reduced by adopting the proposed design. To reduce the adder hardware, we use a 3:2 compressor adder built from two XOR gates and a 2:1 multiplexer, with inputs a, b, c and outputs sum and carry. The signals a and b are applied to the first XOR gate. The output of the first XOR gate is one input of the second XOR gate and c is the other input; the output of the second XOR gate is the final sum. To obtain the carry, a and c are applied to the 2:1 multiplexer, and the output of the first XOR gate acts as its selection signal. This reduces the number of gates in the adder, as shown in Fig. 7. As a result, the adder hardware, and with it the area, decreases.

4.2. Higher order filters

Let us consider 𝑁 = PQ, where Q = 2n, 'n' is a positive number, and 'P' is the smaller adaptive filter block. The final proposed 16-tap adaptive FIR filter is shown in Fig. 8. It consists of 4 WIB blocks,


Fig. 12. Proposed adaptive DFE.

Fig. 13. Simulation waveform for the ADFE Ref. [13].

4 pipelined DA blocks, and 3 proposed compressor adder blocks. The 4 WIB blocks provide 4 sets of selection lines to the 4 pipelined DA blocks. The outputs of the DA blocks are fed to the 3 proposed compressor adder blocks to obtain the final output. This final output is compared with the desired signal and the result is fed back to the WIB blocks. This process is repeated until the error is nullified.

5. Simulation results

The proposed design was coded in Verilog HDL and synthesized as an ASIC with the Synopsys Design Compiler tool, and the results are listed in Table 1. Both the proposed and the existing designs were implemented in SAED 90 nm technology. Table 1 gives the results for filter orders of 16, 32, and 64 taps. In Table 1, the design of Ref. [11] has high power consumption and a more complex structure than the proposed design. The architecture of Guo [12] involves heavy mathematical computation and is difficult to understand, and its hardware complexity and area are high; the area it occupies is much higher than that of the proposed design. Ref. [25] also occupies more area than the proposed design. Ref. [25] uses an approximate DA adaptive


Table 1
ASIC implementation results of the proposed and existing designs.

Design            Order (K)   MSP (ns)   Speed (MHz)   Area (μm²)    Power (mW)   ADP           PDP (mW×ns)
Ref. [11]         16          24.05      50.600        143377.333    1.52491      3441048       36.6738
Ref. [11]         32          27.24      46            267146.333    2.90620      7277057       79.1648
Ref. [11]         64          29.78      40            537053.633    5.3110       15993438      158.16158
Ref. [12]         16          25.80      77            107520.8      0.7555       2774016       12
Ref. [12]         32          31.80      71            209082        1.4598       6648807       46.42
Ref. [12]         64          35.80      64            416418        2.7315       14907770.49   97.7877
Ref. [25]         16          7.09       141.04        88719         2.917        629617.71     20.68153
Ref. [25]         32          8.31       120.33        177460        5.834        1474692.60    48.48054
Ref. [25]         64          9.6        104.167       364820        11.67        3502272.00    112.032
Proposed design   16          11.8       84.745        41946         0.56         494962.80     15.694
Proposed design   32          12.05      82.987        85262         1.0112       1027407.00    20.42475
Proposed design   64          13.11      76.278        170678        2.0127       2237588.58    42.6704

MSP = minimum sampling period (ns); Speed in MHz; Area in μm²; Power in mW; ADP = area delay product; PDP = power delay product (mW×ns).
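The derived columns of Table 1 can be cross-checked with a few lines of arithmetic. The sketch below assumes ADP = area × MSP and maximum sampling frequency = 1/MSP, which is consistent with the proposed-design rows; the function names are hypothetical and the numbers are copied from the table.

```python
# (filter order K, MSP in ns, area in um^2, ADP as reported in Table 1)
PROPOSED_ROWS = [
    (16, 11.8, 41946, 494962.80),
    (32, 12.05, 85262, 1027407.00),
    (64, 13.11, 170678, 2237588.58),
]


def area_delay_product(area_um2, msp_ns):
    """ADP taken as area multiplied by the minimum sampling period."""
    return area_um2 * msp_ns


def max_sampling_frequency_mhz(msp_ns):
    """A period in ns converts to a frequency in MHz as 1000 / period."""
    return 1000.0 / msp_ns


for order, msp, area, reported_adp in PROPOSED_ROWS:
    adp = area_delay_product(area, msp)
    speed = max_sampling_frequency_mhz(msp)
    print(order, round(adp, 2), reported_adp, round(speed, 3))
```

For the three proposed-design rows the recomputed ADP should match the reported value to within rounding, and the recomputed frequency should match the Speed column.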

Table 2
Hardware complexity of the proposed and existing architectures.

Design               Throughput                           Adders         Shifters   Registers          LUT
Allred et al. [11]   1/[c1(T_R + T_A + 3T_M)]             1+3Q           Q          Q+K+2              2𝑃 +1 𝑄
Guo et al. [12]      1/c2(T_R + T_A)                      N+1+2Q         K          3K+1+2Q            2𝑃 𝑄
Meher [22]           1/L(4T_M + T_FA + T_D)               (3.2P-1+1)Q    Q(2P-1)    (1+2P-1)3Q+K+1     *
Park [20]            1/L(4T_M + T_FA + T_D + T_XOR)       (2+2P-1)Q+K    K          (3+2P)Q+3+2k       *
Proposed             1/L(4T_M + T_D + T_FA)               3Q             K          (3+2P)Q+3+2K       *

T_M = delay of a 2:1 MUX; T_R = delay for LUT access and address generation; T_FA = delay of a full adder; T_XOR = delay of an XOR gate; T_D = delay of a D-FF; T_A = delay of the bit adder; K = filter order; R = K/S = number of adaptive filter blocks of length S.

filter, which is suitable for specific applications only. Around 50% of the area is saved compared with Ref. [11], and 75% compared with Ref. [12]. When comparing the area delay product (ADP), power delay product (PDP), and minimum cycle period (MCP) of the two DA architectures, the proposed design consumes less power, as shown in Fig. 9. Fig. 10 shows the area report of the present and existing architectures. The hardware occupied by the proposed design is much smaller than that of the other existing architectures. Pipelining decreases the area occupied by the DA table, and the compressor adders reduce the area further. The layout diagram is shown in Fig. 11. Table 2 shows the hardware complexity of the proposed and existing architectures. The proposed design is faster than the architectures of Allred et al. and Guo et al. It requires 3Q adders, fewer than the other architectures, and K shift registers.

6. Application of proposed filter for DFE

In digital communications, the transmitted signal is prone to distortion due to additive channel noise and multi-path propagation, which makes the signal less reliable. The distortion can occur at any instant. Adaptive decision feedback equalizers are used to cancel these distortions in the transmitted signal.

6.1. Mathematical formulation of DFE

Consider the adaptive DFE shown in Fig. 12, with input signal x(k), where $k \in Z$, $N_f$ FF filter coefficients, and output decision r(k) with $N_b$ FB filter coefficients. $S_q(k)$ is the output decision of the DFE:

\[ S_q(k) = Q[S(k)] \tag{8} \]

where Q[·] is the quantization operation, and

\[ S(k) = \hat{x}(k) - \hat{r}(k) \tag{9} \]

\[ r(k) = S_q(k-1) \tag{10} \]

\[ \hat{x}(k) = \sum_{i=0}^{N_f-1} h_f(i)\, x(k-i) \tag{11} \]

\[ \hat{r}(k) = \sum_{j=0}^{N_b-1} h_b(j)\, r(k-j) \tag{12} \]

where

\[ h_f^{T} = [h_0, h_1, h_2, \ldots, h_{N_f-1}] \tag{13} \]

\[ h_b^{T} = [h_0, h_1, h_2, \ldots, h_{N_b-1}] \tag{14} \]

$h_f$ and $h_b$ are the coefficients of the FF and FB filters, and $N_f$ and $N_b$ are the numbers of FF and FB filter coefficients respectively. For a word length W, the 2's complement forms of x(k) and r(k) are

\[ x(k-i) = \sum_{w=1}^{W-1} x_{i,W-1-w}\, 2^{-w} - x_{i,W-1} \tag{15} \]

\[ r(k-j) = \sum_{w=1}^{W-1} r_{j,W-1-w}\, 2^{-w} - r_{j,W-1} \tag{16} \]

Substituting Eq. (11) and Eq. (12) into Eq. (9) gives

\[ S(k) = \sum_{i=0}^{N_f-1} x(k-i)\, h_f(i) - \sum_{j=0}^{N_b-1} h_b(j)\, r(k-j) \tag{17} \]

\[ = \sum_{i=0}^{N_f-1} x(k-i)\, h_f(i) + \sum_{j=0}^{N_b-1} \left(-h_b(j)\right) r(k-j) \tag{18} \]

\[ = \sum_{i=0}^{N_f-1} x(k-i)\, h_f(i) + \sum_{j=0}^{N_b-1} \bar{h}_b(j)\, r(k-j) \tag{19} \]

\[ = \sum_{n=0}^{K-1} h_n\, z(k-n) \tag{20} \]

\[ S(k) = h^{T} Z \tag{21} \]

where $K = N_f + N_b$ and

\[ h^{T} = [h_{f,0}(k), \ldots, h_{f,N_f-1}(k), \bar{h}_{b,0}(k), \ldots, \bar{h}_{b,N_b-1}(k)] \]
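To make the decision loop of Eqs. (8)-(10) and the combined form of Eq. (21) concrete, the sketch below runs a small decision-directed DFE on BPSK symbols. It is a floating-point illustration, not the DA hardware: the two-tap channel, the step size, the filter lengths, the centre-tap initialisation, and the LMS step applied to the combined weight vector are all assumptions of this example, and `run_dfe` is a hypothetical name.

```python
import random


def sign(v):
    """BPSK slicer standing in for the quantizer Q[.] of Eq. (8)."""
    return 1.0 if v >= 0 else -1.0


def run_dfe(n_sym=5000, nf=2, nb=1, mu=0.02, channel=(1.0, 0.3), seed=1):
    rng = random.Random(seed)
    sent = [rng.choice((-1.0, 1.0)) for _ in range(n_sym)]
    # Received samples x(k): the symbols passed through a short ISI channel.
    recv = [sum(c * sent[k - d] for d, c in enumerate(channel) if k - d >= 0)
            for k in range(n_sym)]
    h = [1.0] + [0.0] * (nf + nb - 1)   # combined weights [h_f, negated h_b]
    x_line = [0.0] * nf                 # x(k), ..., x(k - nf + 1)
    r_line = [0.0] * nb                 # r(k), ..., r(k - nb + 1)
    sq_err, n_wrong = [], 0
    for k in range(n_sym):
        x_line = [recv[k]] + x_line[:-1]
        z = x_line + r_line
        s = sum(hn * zn for hn, zn in zip(h, z))          # Eq. (21): S(k) = h^T Z
        s_q = sign(s)                                     # Eq. (8)
        e = s_q - s                                       # decision-directed error
        h = [hn + mu * e * zn for hn, zn in zip(h, z)]    # LMS step on all weights
        r_line = [s_q] + r_line[:-1]                      # Eq. (10): r(k+1) = S_q(k)
        sq_err.append(e * e)
        n_wrong += s_q != sent[k]
    return n_wrong, sq_err


n_wrong, sq_err = run_dfe()
print("decision errors:", n_wrong)
print("mean squared error, first 100 symbols:", sum(sq_err[:100]) / 100)
print("mean squared error, last 100 symbols:", sum(sq_err[-100:]) / 100)
```

With this mild channel the eye is open from the start, so the decisions can be used directly as the training reference; a harsher channel would normally need a known training sequence first.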


Fig. 14. Original and ISI noises Ref. [10].

\[ Z^{T} = [x(k), \ldots, x(k-N_f+1), r(k), r(k-1), \ldots, r(k-N_b+1)] \]

\[ S(k) = \sum_{n=0}^{K-1} h_n\, z(k-n) \tag{22} \]

The 2's complement form of z(k − n) is

\[ z(k-n) = -b_{n0} + \sum_{l=1}^{L-1} b_{nl}\, 2^{-l} \tag{23} \]

Substituting z(k − n) into S(k) gives

\[ S(k) = -\left[\sum_{n=0}^{K-1} h_n b_{n0}\right] + \sum_{l=1}^{L-1} \left[\sum_{n=0}^{K-1} b_{nl} h_n\right] 2^{-l} \tag{24} \]

Eq. (24) has the same form as Eq. (7), so the FF and FB filters can be combined and implemented using DA. The LMS algorithm is chosen for updating the filter coefficients:

\[ h_{fm}(k+1) = \mu_f\, e(k)\, x(k-m) + h_{fm}(k) \tag{25} \]

\[ h_{bp}(k+1) = \mu_b\, e(k)\, r(k-p) + h_{bp}(k) \tag{26} \]

where $\mu_f$ and $\mu_b$ are the step sizes of the FF and FB filters respectively, and e(k) is the error signal, given as

\[ e(k) = S_q(k) - S(k) \tag{27} \]

The negated form of $h_{bp}(k+1)$ is

\[ \bar{h}_{bp}(k+1) = \bar{\mu}_b\, e(k)\, r(k-p) + \bar{h}_{bp}(k) \tag{28} \]

Multiplying Eq. (25) and Eq. (28) by $b_{ml}$ and $b_{pl}$ respectively and summing gives

\[ \sum_{m=0}^{N_f-1} b_{ml}\, h_{fm}(k+1) = \sum_{m=0}^{N_f-1} b_{ml}\, h_{fm}(k) + \mu_f\, e(k) \sum_{m=0}^{N_f-1} b_{ml}\, x(k-m) \tag{29} \]

\[ \sum_{p=0}^{N_b-1} b_{pl}\, \bar{h}_{bp}(k+1) = \sum_{p=0}^{N_b-1} b_{pl}\, \bar{h}_{bp}(k) + \bar{\mu}_b\, e(k) \sum_{p=0}^{N_b-1} b_{pl}\, r(k-p) \tag{30} \]

where $l = 0, 1, 2, \ldots, L-1$.

Eq. (29) and Eq. (30) are the filter coefficient updating equations. Substituting them into S(k) finally gives

\[ S(k) = -\left[\sum_{n=0}^{N_f-1} h_{fn} b_{n0}\right] + \sum_{l=1}^{L-1} \left[\sum_{n=0}^{N_f-1} b_{nl} h_{fn}\right] 2^{-l} - \left[\sum_{n=N_f}^{N_b+N_f-1} \bar{h}_{bn} b_{n0}\right] + \sum_{l=1}^{L-1} \left[\sum_{n=N_f}^{N_b+N_f-1} b_{nl} \bar{h}_{bn}\right] 2^{-l} \tag{31} \]

The first two terms in Eq. (31) are the partial products (PP) of the FF filter and the last two are the PP of the FB filter. An adaptive DFE design with a modified DA is presented. The adaptive DFE is made up of the FF filter unit, the error signal, the FB filter unit, the control circuit, and the slicer device. The modified DA architecture is used to design both the FF and FB filter units. The decision device is supplied with the difference between the outputs of the FF and FB filter blocks, which is S(k). The decision device checks whether S(k) is within the range of the signal and quantizes it according to the modulation method. The procedure is repeated until the DFE returns a zero error. The proposed adaptive DFE is tested using BPSK modulation, as shown in Fig. 13.

Consider a collection of channel impulse response signals that have been BPSK modulated with a message signal. The signals are passed to the adaptive DFE to remove noise and ISI errors from the produced signal. The FF filter block removes the precursor and anti-causal parts of the ISI, whereas the FB filter unit removes other noise and error signals. The adaptive DFE runs until its decision device propagates a zero value. Fig. 14 depicts the original noisy signal as well as the filtered ISI-free signal.

7. Conclusion

In comparison to the existing filter design, the proposed pipelined DA-based adaptive filter has a less complex structure. Pipelining reduces the number of registers used. The proposed design takes considerably less storage and uses less power. The area is reduced further by using the proposed compressor adders instead of conventional adders. The proposed adaptive filters consume less power than existing architectures. The proposed design is well suited to signal processing applications, especially decision feedback equalizers for removing ISI noise.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

[1] Stenger Alexander, Lutz Trautmann, Rudolf Rabenstein, Nonlinear acoustic echo cancellation with 2nd order adaptive Volterra filters, in: Acoustics, Speech, and Signal Processing, 1999. Proceedings. 1999 IEEE International Conference on. vol. 2, IEEE, 1999.
[2] Thakor Nitish V., Y.-S. Zhu, Applications of adaptive filtering to ECG analysis: noise cancellation and arrhythmia detection, IEEE Trans. Biomed. Eng. 38.8 (1991) 785–794.
[3] Ghamkhari Seyedeh Fatemeh, Mohammad Bagher Ghaznavi-Ghoushchi, A new low-power architecture design for distributed arithmetic unit in FIR filters implementation, Circuits Systems Signal Process. 33.4 (2014) 1245–1259.
[4] NagaJyothi Grande, Sriadibhatla SriDevi, Distributed arithmetic architectures for fir filters-a comparative review, in: 2017 International Conference on Wireless Communications, Signal Processing and Networking, WiSPNET, IEEE, 2017.
[5] Mohanty Basant K., Pramod Kumar Meher, Sujit K. Patel, LUT optimization for distributed arithmetic-based block least mean square adaptive filter, IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 24.5 (2016) 1926–1935.
[6] Croisier Alain, et al., Digital filter for PCM encoded signals. U.S. Patent No. 3,777,130. 4 Dec. 1973.


[7] Peled Abraham, Bede Liu, A new approach to the realization of nonrecursive [24] Prakash M. Surya, Rafi Ahamed Shaik, Sagar Koorapati, An efficient distributed
digital filters, IEEE Trans. Audio Electroacoust. 21.6 (1973) 477–484. arithmetic-based realization of the decision feedback equalizer, Circuits Systems
[8] Zohar halhav, A realization of the RAM digital filter, IEEE Trans. Comput. 25.10 Signal Process. 35.2 (2016) 603–618.
(1976) 1048–1052. [25] Jiang Honglan, et al., A high-performance and energy-efficient FIR adaptive filter
[9] White Stanley A., Applications of distributed arithmetic to digital signal using approximate distributed arithmetic circuits, IEEE Trans. Circuits Syst. I.
processing: A tutorial review, IEEE Assp Mag. 6.3 (1989) 4–19. Regul. Pap. 66.1 (2018) 313–326.
[10] NagaJyothi Grande, Sriadibhatla Sridevi, High speed and low area decision feed- [26] Berberidis Kostas, Thanasis A. Rontogiannis, Sergios Theodoridis, Efficient block
back equalizer with novel memory less distributed arithmetic filter, Multimedia implementation of the decision feedback equalizer, IEEE Signal Process. Lett. 5.6
Tools Appl. (2019) 1–15. (1998) 129–131.
[11] Allred Daniel J., et al., LMS adaptive filters using distributed arithmetic for high [27] Parhi Keshab K., Design of multigigabit multiplexer-loop-based decision feedback
throughput, IEEE Trans. Circuits Syst. I. Regul. Pap. 52.7 (2005) 1327–1337. equalizers, IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 13.4 (2005)
[12] Guo Rui, Linda S. DeBrunner, Two high-performance adaptive filter implemen- 489–493.
tation schemes using distributed arithmetic, IEEE Trans. Circuits Syst. II: Express [28] Lin Chi-Shiung, et al., Concurrent digital adaptive decision feedback equalizer
Brief. 58.9 (2011) 600–604. for 10GBase-LX4 ethernet system, in: 2007 IEEE Custom Integrated Circuits
[13] Jyothi Grande Naga, Sridevi Sriadibhatla, Asic implementation of low power, Conference, IEEE, 2007.
area efficient adaptive fir filter using pipelined da, in: Microelectronics, [29] Almogahed Abdullah, et al., Performance improvement of mode division multi-
Electromagnetics and Telecommunications, Springer, Singapore, 2019, pp. plexing free space optical communication system through various atmospheric
385–394. conditions with a decision feedback equalizer, Cogent Eng. 9.1 (2022) 2034268.
[14] Jyothi Grande Naga, Sriadibhatla Sridevi, Low power,low area adaptive finite [30] Han Ruigang, et al., Joint time-frequency domain equalization of MSK signal
impulse response filter based on memory less distributed arithmetic, J. Comput. over underwater acoustic channel, Appl. Acoust. 189 (2022) 108597.
Theor. Nanosci. 15.6-7 (2018) 2003–2008.
K. Vijetha received her B.Tech and M.Tech degrees in Electronics and Communication Engineering from Jawaharlal Nehru Technological University, Andhra Pradesh, India, in 2007 and 2010, respectively. She has been working as an assistant professor in the Electronics and Communication Engineering Department of Matrusri Engineering College, Hyderabad, Telangana, since September 2012. She is currently working towards her Ph.D. degree in VLSI System Design at Osmania University, Hyderabad, India. Her research interests include low-power IC design and VLSI DSP.

Prof. B. Rajendra Naik obtained his Bachelor's degree in Electronics and Communications Engineering from Nagarjuna University, and his Master's degree in Digital Systems and Doctorate degree from Osmania University. He has published over 60 research papers in national and international conferences and journals. He has 18 years of teaching and 12 years of research experience. He joined Osmania University in 2001 and is a Professor of Electronics and Communication Engineering at the University College of Engineering, Osmania University. His research interests include VLSI signal processing, signal integrity performance improvement and image processing.
