Conference Paper · October 2020


DOI: 10.1109/CONECCT50063.2020.9198611

Design of Higher Order Multiplier with Approximate Compressor

M. Maria Dominic Savio, Assistant Professor,
Department of Electronics and Communication Engineering,
SRM Institute of Science and Technology - 603203,
Tamilnadu, India.
mariadom@srmist.edu.in

T. Deepa, Associate Professor,
Department of Electronics and Communication Engineering,
SRM Institute of Science and Technology - 603203,
Tamilnadu, India.
deepat@srmist.edu.in

Abstract - In recent years, imprecise multipliers have been widely studied for image processing applications; these multipliers are built from compressors. When the multiplication width is large, higher-order compressors are used to reduce the number of reduction stages. The challenge is that approximating a higher-order compressor through a truth table or K-map is impossible. In this paper, an 8:2 compressor is designed and a novel comparison technique is developed for its approximation. The proposed 8:2 compressor is used in a 16x16 multiplier and compared with existing multipliers. The new compressor is efficient in area, power, and delay. The error distance (ED) and the normalized error distance (NED) are also compared with related works. The proposed multiplier is then used for image multiplication, and the PSNR is compared.

Keywords - Approximate compressor; Normalized Error Distance (NED); Multiplier; image processing.

978-1-7281-6828-9/20/$31.00 ©2020 IEEE

I. INTRODUCTION
To improve the energy efficiency of digital processing systems (DPS), imprecise computing has evolved. Imprecise computing is generally achieved by approximating some output function by an input, which reduces the number of circuit components. A DPS performs many operations such as convolution, correlation, and filtering of signals. These operations are carried out by multipliers, subtractors, shifters, adders, dividers, and comparators. If a DSP processor is used for image processing, approximate arithmetic is used to reduce computational complexity with a tolerable error that does not affect the performance of the application. Using logic-level simplification with the Karnaugh map (K-map), four approximate subtractors are developed in [1] and used to build an approximate divider for background subtraction in image processing with low power and low area. Several approximate comparators are designed in [2]-[4] to achieve low cost in terms of power, area, and speed; they are also used to remove salt and pepper noise such that the quality degradation does not affect the performance. Compressor designs have emerged as an alternative to the full-adder in the reduction tree stage of Wallace and Dadda multipliers. Normally a compressor is designed with full-adders; in [5], novel 4:2 and 5:3 compressors are designed with an XOR-MUX architecture to ensure low power and area.

Several imprecise compressors are proposed in [6]-[9] and used in various multiplier architectures for image processing applications, where the execution is inexact but the resulting errors are tolerable. Multiplication is done in three steps: 1) partial product generation, 2) partial product reduction, and 3) addition of the reduced terms by adders. Of these, the partial product reduction is the most complicated step; it consumes more power and creates delay. The Wallace and Dadda tree architectures are very efficient methods for partial product reduction. In the reduction stage, employing compressors is more effective than using full-adders. The 4:2 compressors are suitable for all kinds of multipliers [6]. However, they are limited to 8x8 multiplication; for 16x16, 32x32, and 64x64, the number of reduction stages increases again. So in [9], 15:4 compressors are used in 16x16 multiplication.

The approximate compressor plays a key role in low-power circuits. For lower-order compressors, an approximation can be achieved through the K-map or the truth table. In the K-map, essential prime implicants are eliminated so as to reduce the device hardware. In the truth table, the inputs are correlated against the outputs, and the most highly correlated input is bypassed to the output without any hardware. In both cases, a tradeoff is maintained between the device hardware and the image quality. The quality of the image is measured through the peak signal-to-noise ratio (PSNR). A PSNR of 30dB is enough for most applications [9]. The acceptable range of PSNR is above 20dB in [10], which develops a multiplier for an image sharpening application. The acceptable value therefore varies with the application, and researchers aim to make it as high as possible, within the range of 25-35dB. The demand for higher-order compressors rises as the multiplier width increases, but approximating a higher-order compressor is difficult. In [9], the approximation is done in a 5:3 compressor, which is used as a sub-component to build the 15:4 compressor. The error distance is calculated only for the 5:3 compressor in [9], and the pass rate is calculated for the 15:4 compressor.

This work addresses the issue of approximation in higher-order compressors with a novel comparison method between inputs and outputs, and proposes an energy-efficient 8:2 compressor. The rest of the paper is organized as follows. The design of the 8:2 compressor is elaborated in Section II. The design of different approximate compressors using a novel comparison technique is described in Section III. Section IV describes the design of the 8x8 and 16x16 multipliers. The performance analysis, including image multiplication with the proposed multiplier, is described in Section V. Finally, the conclusion is presented.
II. DESIGN OF 8:2 COMPRESSOR
Several 15:4, 9:4, and 8:2 compressor designs are proposed in [9], [11], and [12], respectively. All of these designs are built from full-adders and lower-order compressors, so the feasibility of approximating them is very low. The proposed 8:2 compressor takes a straightforward approach, with a parallel stream from input to output through an XOR-MUX architecture, as shown in Fig.1.

[Fig.1 diagram: inputs A0–A7 and Ci0–Ci4; outputs C0–C4, CARRY, and SUM. Legend: X = XOR, M = 2:1 MUX.]
Fig.1. 8:2 compressor designed by XOR – MUX

The compressors used in the tree reduction stage of a multiplier are usually made up of full-adders. The full-adder, also called a 3:2 compressor or counter, is usually used to construct any higher-order compressor. A full-adder built from two XORs and one MUX is proposed in [5] so as to reduce the area and power without any change in the truth table. With the same idea, many compressors are constructed in [5], [7]; the 4:2 compressors are shown in Fig.2.

Fig.2. (a) 4:2 compressor using full-adder (b) 4:2 compressor based on XOR – MUX [11]

With the same approach, an 8:2 compressor is constructed from three 4:2 compressors in [12] and [13]. The compressor proposed in this paper is built with XOR-MUX so as to reduce the power and delay, while the area stays the same. The sum equation is the same for all compressors: the XOR of all inputs. Every cout is computed by feeding the XOR of the first two inputs to the MUX select line, with the first and the third inputs as the MUX data inputs. The sum is given in eqn (1), and the different possible carry equations are given in eqn (2). The XOR-based carry equation is the one taken into account, because the same XOR is used to produce the sum and to cascade up to the end stage of carry. The sum and single-stage carry equations are given by

Sum = a0⊕a1⊕a2⊕a3⊕a4⊕a5⊕a6⊕a7⊕ci0⊕ci1⊕ci2⊕ci3⊕ci4 …. (1)

Carry = ((a⊕b).c) + (¬(a⊕b).a) …. (2.1)
      = (a.b) + (b.c) + (c.a) …. (2.2)
      = ((a⊕b).c) + (b.a) …. (2.3)
      = ((a+b).c) + (b.a) …. (2.4)

Among the above equations, (2.4) is the simplest to implement. However, considering both the sum and the different stages of cout, eqns (1) and (2.1) are adopted in this design to construct the 8:2 compressor without using any lower-order compressor. Because the compressor does not contain lower-order compressors, approximation can be applied at any part of the circuit.

III. DESIGN OF APPROXIMATION TECHNIQUE FOR 8:2 COMPRESSORS
Approximate computing is a major means of reducing power, area, and delay. A novel approximation technique is presented in this paper. The proposed 8:2 compressor has 13 inputs and therefore 2^13 = 8192 input combinations. The circuit has 7 outputs: cout0 – cout4, sum, and carry. In the previous work [9], the approximation is done in a lower-order compressor with tolerable error, which is then used to construct higher-order compressors, so the error in the higher-order compressor is not calculated accurately. This work overcomes that problem with an architecture that compares all the inputs to all the outputs, so that the error created by approximating any part of the circuit can be calculated accurately. The flow chart in Fig.3 demonstrates the correlation of every input to every output.
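The comparison idea can be tried in software before committing to hardware. The following is a minimal sketch, assuming the 8:2 compressor is a cascade of six XOR-MUX full-adder stages over the 13 inputs (an assumption inferred from the Fig.1 layout and the MUX wiring described in Section II, not a netlist taken from the paper). The function names `xor_mux_full_adder`, `compressor_8_2`, and `correlation_counts` are hypothetical.

```python
from itertools import product

INPUTS = ["a0", "a1", "a2", "a3", "a4", "a5", "a6", "a7",
          "ci0", "ci1", "ci2", "ci3", "ci4"]
OUTPUTS = ["cout0", "cout1", "cout2", "cout3", "cout4", "carry", "sum"]


def xor_mux_full_adder(a, b, c):
    """Full-adder built from two XORs and one 2:1 MUX; returns (sum, carry)."""
    sel = a ^ b               # first XOR, also drives the MUX select line
    s = sel ^ c               # second XOR produces the sum bit
    carry = c if sel else a   # eqn (2.1): MUX picks the third or the first input
    return s, carry


def compressor_8_2(bits):
    """Assumed structure: six cascaded XOR-MUX stages over the 13 input bits."""
    outputs = {}
    running = bits[0]
    for stage, name in enumerate(OUTPUTS[:6]):
        # Each stage consumes the running XOR plus the next two inputs.
        running, outputs[name] = xor_mux_full_adder(
            running, bits[2 * stage + 1], bits[2 * stage + 2])
    outputs["sum"] = running  # eqn (1): XOR of all 13 inputs
    return outputs


def correlation_counts():
    """For each of the 91 input/output pairs, count how often the two bits agree."""
    counts = {(i, o): 0 for i in INPUTS for o in OUTPUTS}
    for bits in product((0, 1), repeat=len(INPUTS)):
        outputs = compressor_8_2(bits)
        for index, i in enumerate(INPUTS):
            for o in OUTPUTS:
                if bits[index] == outputs[o]:
                    counts[(i, o)] += 1
    return counts
```

Calling `correlation_counts()` returns one counter per comparator. Under the assumed structure, the pair (a0, cout0) agrees on 6144 of the 8192 input combinations (75%), while (ci4, sum) agrees on 4096 (50%).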



[Fig.3 flow chart. Initialize: compressor inputs Ki, i = 1,2,3,...,13; outputs Hx, x = 1,2,...,7; counter enables z = i*x, z = 1 to 91. Assign: comparator input 1 from compressor input, i = 1,2,3,...,13; comparator input 2 from compressor output, x = 1,2,...,7; comparators C11, C12, ..., C21, C22, ..., C136, C137; counter enables from comparator outputs, z1 = C11, ..., z91 = C137. Loop over i and x: if the compared input equals the output, Oj = Oj + 1; otherwise Oj = Oj. Collect all counter outputs Oj, j = 1,2,...,91. End. Y = yes, N = no.]
Fig.3. Flow chart for approximating 8:2 compressor

The proposed approximation finder circuit consists of 91 counters and comparators. Each comparator is fed with one compressor input and one compressor output. The compressor inputs range from a0 to a7 and ci0 to ci4, 13 in total; the compressor outputs range from cout0 to cout4, plus sum and carry. Comparator-1 is fed with a0 and cout0, comparator-2 with a0 and cout1, and so on, so that the 91st comparator is fed with ci4 and carry. The comparator outputs are fed to the counters. The equality level of every input-output combination is given by the 91 counter outputs over all 8192 input samples.
With these accurate data, different tolerable levels of approximation can be performed without affecting the image quality.

IV. DESIGN OF MULTIPLIER
Compressors are used in several multiplier architectures for exact multiplication. An approximate compressor is used in the discrete cosine transform (DCT) operation in [8] for image processing applications. This paper proposes several multiplier designs:
Design-1 16x16 multiplier with exact 4:2 compressor,
Design-2 16x16 multiplier with exact 8:2 compressor,
Design-3 16x16 multiplier with approximate 8:2 compressor,
Design-4 8x8 multiplier with exact 4:2 compressor [6],
Design-5 8x8 multiplier with 8:2 approximate compressor.

A. Design-1 16x16 multiplier with exact 4:2 compressor
The 16x16 multiplier is designed with the existing exact 4:2 compressors in [6]-[8], and its performance metrics are compared with the proposed multiplier. The design of the multiplier with the Dadda structure and its reduction stages is shown in Fig.4.

Fig.4. 16x16 multiplier with 4:2 compressor

B. Design-2 16x16 multiplier with exact 8:2 compressor
The proposed exact XOR-MUX 8:2 compressor is used to build the 16x16 multiplier, as shown in Fig.5.

Fig.5. 16x16 multiplier with 8:2 compressor

The multiplier with the 8:2 compressor has only three reduction stages, fewer than the four reduction stages of the multiplier with the 4:2 compressor; it is therefore efficient to use a higher-order compressor when the multiplier width is increased. In Fig.5, a color code identifies the different components: pink – 8:2 compressors, red – 4:2 compressors, thick and light blue – full-adders, orange – half-adder.

C. Design-3 16x16 multiplier with approximate 8:2 compressor
With the approximation finder method, many inputs match outputs for 75% of the input combinations, as discussed in Section V. In the proposed approximate 8:2 compressor, cin4 is bypassed to carry so as to reduce area, power, and delay without affecting the image quality.
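All of the designs above reduce the same partial-product array, so it can help to see that array directly. The sketch below is an illustration only: it generates the partial products of a 16x16 multiplication and reports how many bits land in each column, which is the height the compressors have to bring down to two rows. It does not model the Dadda staging of Fig.4 or Fig.5, and the names `partial_product_columns` and `weighted_sum` are hypothetical.

```python
def partial_product_columns(a, b, width=16):
    """Return the partial-product array as columns.

    Column k holds every bit a_i AND b_j with i + j == k, so it carries
    weight 2**k in the final product.
    """
    columns = [[] for _ in range(2 * width - 1)]
    for i in range(width):
        for j in range(width):
            columns[i + j].append(((a >> i) & 1) & ((b >> j) & 1))
    return columns


def weighted_sum(columns):
    """Add all partial-product bits with their column weights."""
    return sum(sum(column) << k for k, column in enumerate(columns))


# Example operands chosen arbitrarily; any two 16-bit values work.
columns = partial_product_columns(40503, 51966)
heights = [len(column) for column in columns]
```

For a 16x16 multiplier the heights rise from 1 in the least significant column to 16 in the middle column and fall back to 1, and `weighted_sum(columns)` equals the ordinary product of the two operands.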



D. Design-4 8x8 multiplier with exact 4:2 compressor [6]
To compare the efficiency of the proposed compressor, the design in reference [6] is implemented with exact 4:2 compressors.

E. Design-5 8x8 multiplier with 8:2 approximate compressors
The 8x8 multiplier in [6] is designed with 4:2 compressors, full-adders, and half-adders. The proposed multiplier is designed with an 8:2 compressor, which can be placed only in the 7th to 9th columns of the first stage; the remaining stages stay the same, as shown in Fig.6. In [6], two imprecise compressors are designed and evaluated on performance metrics such as PSNR, ED, NED, area, power, and delay. In this paper, the 8x8 multiplier is designed using an exact 8:2 compressor, and the above performance metrics are compared with the exact 4:2 compressor of [6].

Fig.6. Replacement of 8:2 compressor in [6]

V. RESULTS AND DISCUSSION
A. POWER COMPARISON OF 8:2 COMPRESSOR
Compressors are generally made up of full-adders. Later, the full-adder is modified with two XORs and one MUX in [8] so as to reduce the area and power without any change in the truth table. The proposed 8:2 compressor is designed with a cascade of full-adders in a CMOS 90nm library, and its power is compared with that of the 8:2 compressor with XOR-MUX, as shown in Fig.7. Table I shows the power comparison.

Fig.7. (a) 8:2 compressor using full-adder (b) 8:2 compressor based on XOR-MUX

Table I. Power comparison of 8:2 compressor
S.no   Design                 Power (µW)
1.     BASED ON FULL-ADDER    0.605
2.     BASED ON XOR-MUX       0.493

B. Approximation
The Verilog simulation result of the counters shows the correlation of inputs versus outputs, as shown in Fig.8.

Fig.8. Approximation for 8:2 compressor

Fig.8 shows that the a0cout0 combination is equal 6144 times out of 8192 cycles, that is, 75% of all cycles. TABLE II shows the highly correlated combinations.

TABLE II. Input/Output correlations
Input/Output   Correlated cycles   Input/Output   Correlated cycles
combination    out of 8192         combination    out of 8192
a0cout0        6144                a7cout3        6144
a1cout0        6144                cin0cout3      6144
a2cout0        6144                cin1cout4      6144
a3cout1        6144                cin2cout4      6144
a4cout1        6144                cin3carry      6144
a5cout2        6144                cin4carry      6144
a6cout2        6144                cin4sum        4096

From TABLE II, many approximations can be performed. However, approximating at the least significant bit (LSB) reduces the error distance (ED) and the normalized error distance (NED), so cin4 is approximated as carry in our compressor to reduce the hardware. The result of the approximate compressor is compared with that of the exact compressor to verify the error. The design equation for NED is described in [6]. TABLE III shows the NED values of the proposed and existing designs.

TABLE III. Accurateness of imprecise 8x8 multiplier
Design          NED
Imprecise [6]   0.05061
Proposed        0.05070

From the results, it is clear that the proposed multiplier produces an acceptable error.
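The effect of bypassing cin4 to carry can also be examined at the compressor level. The sketch below is a rough illustration under stated assumptions: the compressor is modelled as six cascaded XOR-MUX stages, every cout and the carry are given weight 2 relative to the sum, and only the last MUX is replaced by the bypass. It reports the error distance of the compressor output alone, not the multiplier-level NED of TABLE III, whose defining equation is in [6]. The names `compressor_value` and `error_stats` are hypothetical.

```python
from itertools import product


def compressor_value(bits, approximate=False):
    """Arithmetic value sum + 2*(cout0..cout4 + carry) for 13 input bits."""
    running, carries = bits[0], []
    for stage in range(6):
        b, c = bits[2 * stage + 1], bits[2 * stage + 2]
        sel = running ^ b
        carries.append(c if sel else running)  # XOR-MUX carry, eqn (2.1)
        running = sel ^ c                      # running XOR towards the sum
    if approximate:
        carries[5] = bits[12]  # assumption: cin4 wired straight to carry
    return running + 2 * sum(carries)


def error_stats():
    """Return (erroneous combinations, mean error distance, maximum error distance)."""
    errors = [abs(compressor_value(bits, True) - compressor_value(bits))
              for bits in product((0, 1), repeat=13)]
    wrong = sum(1 for e in errors if e != 0)
    return wrong, sum(errors) / len(errors), max(errors)
```

Under these assumptions `error_stats()` gives 2048 erroneous combinations out of 8192 (25%, the complement of the 75% match rate), each with an error distance of 2, so the mean error distance of the compressor is 0.5.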

C. Results of 16x16 bit multipliers
The existing and proposed multipliers are designed in Verilog HDL. The power, area, and delay are calculated in the Cadence RTL Compiler with a 90nm technology library.

TABLE IV. Analysis of 16x16 multipliers
Design                                                          Area (µm²)   Power (µW)   Time (ns)
16x16 multiplier with 15:4 exact compressor [9]                 4939         570          4.2
16x16 multiplier with 4:2 compressor [6-8]                      4955         585          4.7
16x16 multiplier with 8:2 exact compressor with XOR-MUX         4930         565          4.3
16x16 multiplier with approximate 8:2 compressor with XOR-MUX   4688         534          4.0

From TABLE IV, it is shown that the power and area of the proposed multiplier are decreased by 8% while the delay is increased to 9%. The proposed inexact multiplier is better than all the existing models in all performance metrics.
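The relative changes can be recomputed directly from the TABLE IV entries. The text does not state which baseline the quoted percentages refer to, so the sketch below simply evaluates the change of the approximate 8:2 design against each of the other three rows. The dictionary keys and the function names `percent_reduction` and `reductions` are hypothetical labels, and the numbers are copied from TABLE IV as printed.

```python
# (area in um^2, power in uW, time in ns), copied from TABLE IV
BASELINES = {
    "15:4 exact [9]": (4939, 570, 4.2),
    "4:2 [6-8]": (4955, 585, 4.7),
    "8:2 exact XOR-MUX": (4930, 565, 4.3),
}
PROPOSED = (4688, 534, 4.0)  # approximate 8:2 compressor with XOR-MUX


def percent_reduction(baseline, proposed):
    """Positive result: the proposed value is lower than the baseline."""
    return 100.0 * (baseline - proposed) / baseline


def reductions():
    """Percent reduction in (area, power, time) against each baseline row."""
    return {
        name: tuple(round(percent_reduction(b, p), 2)
                    for b, p in zip(row, PROPOSED))
        for name, row in BASELINES.items()
    }
```

A positive entry means the approximate design is smaller, lower in power, or faster than that baseline; for example, against the 4:2 design the power entry is 100 * (585 - 534) / 585.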
D. Image processing Application.
[Fig.9 panels. (a) Exact; [9]: PSNR = 28.6 dB; Proposed: PSNR = 29.2 dB. (b) [9]: PSNR = 40.6dB; Proposed: PSNR = 43.2dB. (c) [9]: PSNR = 39.2dB; Proposed: PSNR = 42.2dB.]
Fig.9. a) Multiplication of two different image b) Squaring of image with all range of pixel c) Squaring of standard test image
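The PSNR values quoted for Fig.9 compare an approximate output image against the exact one. A minimal sketch of that metric follows, assuming the standard definition PSNR = 10 log10(peak^2 / MSE) and a peak value of 255 for 8-bit pixels; the paper does not state its peak value or scaling, so `peak` is left as a parameter, and the function name `psnr` is hypothetical.

```python
import math


def psnr(reference, test, peak=255):
    """PSNR in dB between two equal-length sequences of pixel values."""
    if len(reference) != len(test):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy example: one pixel out of four is off by 10 grey levels.
example = psnr([100, 100, 100, 100], [100, 100, 100, 110])
```

Images are passed as flat sequences here to keep the sketch dependency-free; a two-dimensional image can be flattened row by row before the call.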



The various image multiplication outputs of the existing and proposed multipliers are shown in Fig.9. The output images in (a) are the multiplication of two images, and the PSNR values are verified. The output images in (b) are the multiplication of an image with itself; the image is chosen so that it contains all pixel values from low to high, to ensure that the design fits all images. The images in (c) are the standard test image used in most image processing research.

VI. CONCLUSION
A new method of designing approximate 8:2 compressors was proposed in this paper. These approximate compressors are used to design a 16x16 multiplier. The performance metrics of the approximate design show better results with a tolerable error. The proposed multiplier produces almost the same range of PSNR values as the previous design. The area and power of the new design are effective, but the latency is not improved, so researchers can choose the multiplier depending on their applications. In the future, this kind of approximate arithmetic can be applied to different areas of processors used in image processing so as to reduce area, power, and delay.

REFERENCES
[1] Gorantla, A. and Deepa, P., 2019. Design of Approximate Subtractors and Dividers for Error Tolerant Image Processing Applications. Journal of Electronic Testing, pp.1-7.
[2] Kim, Y., Zhang, Y. and Li, P., 2014. Energy efficient approximate arithmetic for error resilient neuromorphic computing. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 23(11), pp.2733-2737.
[3] Zhou, Y., Lin, J., Wang, J. and Wang, Z., 2018, October. Approximate Comparator: Design and Analysis. In 2018 IEEE International Workshop on Signal Processing Systems (SiPS) (pp. 1-5). IEEE.
[4] Monajati, M., Fakhraie, S.M. and Kabir, E., 2015. Approximate arithmetic for low-power image median filtering. Circuits, Systems, and Signal Processing, 34(10), pp.3191-3219.
[5] Chang, C.H., Gu, J. and Zhang, M., 2004. Ultra low-voltage low-power CMOS 4-2 and 5-2 compressors for fast arithmetic circuits. IEEE Transactions on Circuits and Systems I: Regular Papers, 51(10), pp.1985-1997.
[6] Taheri, M., Arasteh, A., Mohammadyan, S., Panahi, A. and Navi, K., 2020. A novel majority based imprecise 4:2 compressor with respect to the current and future VLSI industry. Microprocessors and Microsystems, 73, p.102962.
[7] Moaiyeri, M.H., Sabetzadeh, F. and Angizi, S., 2018. An efficient majority-based compressor for approximate computing in the nano era. Microsystem Technologies, 24(3), pp.1589-1601.
[8] Gorantla, A., 2017. Design of approximate compressors for multiplication. ACM Journal on Emerging Technologies in Computing Systems (JETC), 13(3), pp.1-17.
[9] Marimuthu, R., Rezinold, Y.E. and Mallick, P.S., 2016. Design and analysis of multiplier using approximate 15-4 compressor. IEEE Access, 5, pp.1027-1036.
[10] Guo, Y., Sun, H., Guo, L. and Kimura, S., 2018, October. Low-cost approximate multiplier design using probability-driven inexact compressors. In 2018 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS) (pp. 291-294). IEEE.
[11] Marimuthu, R., Bansal, D., Balamurugan, S. and Mallick, P.S., 2013. Design of 8-4 and 9-4 Compressors for High Speed Multiplication. American Journal of Applied Sciences, 10(8), p.893.
[12] Silveira, B., Paim, G., Abreu, B., Grellert, M., Diniz, C.M., da Costa, E.A.C. and Bampi, S., 2017. Power-efficient sum of absolute differences hardware architecture using adder compressors for integer motion estimation design. IEEE Transactions on Circuits and Systems I: Regular Papers, 64(12), pp.3126-3137.
[13] Schiavon, T., Paim, G., Fonseca, M., Costa, E. and Almeida, S., 2016. Exploiting adder compressors for power-efficient 2-D approximate DCT realization. In 2016 IEEE 7th Latin American Symposium on Circuits & Systems (LASCAS) (pp. 383-386). IEEE.
