3 authors:
Adel Naeem
Universiti Sains Malaysia
All content following this page was uploaded by Sabri M. Hanshi on 05 December 2015.
ABSTRACT
Acoustic Echo Cancellation (AEC) has become a necessity in today’s conferencing systems in order to
enhance the audio quality of hands-free communication. In recent years, many researchers and
manufacturers have developed various AEC algorithms for telecommunication solutions in order to
improve the quality of service. Many factors influence the design of an AEC system, such as computational
complexity and memory consumption. The aim of this work is to review the most recent acoustic echo
cancellation techniques and their applicability to current hands-free applications. This paper therefore
presents the challenges facing AEC systems and a comparison between these techniques.
Keywords: Adaptive Filter, Acoustic Echo Cancellation, Noise Reduction, Voice Over Internet Protocol
Journal of Theoretical and Applied Information Technology
10th July 2015. Vol.77. No.1
© 2005 - 2015 JATIT & LLS. All rights reserved.
In addition to AEC techniques that solve sampling rate mismatch between far-end and near-end signals. The rest of the paper is divided into four sections. Section 2 discusses the different adaptive algorithms proposed for AEC. In Sections 3 and 4, the strengths and weaknesses of each AEC system are analyzed. Section 5 concludes the review and presents ideas for future work.

2. ADAPTIVE FILTERS
As shown in Figure 1, the adaptive filter generates a replica of the echo, y(n), and this estimated echo is subtracted from the desired input signal d(n), yielding the estimated error signal,

e(n) = d(n) – y(n) (1)

The estimated error signal is fed back to the adaptive filter so that it can self-adjust its transfer function to achieve optimum performance [3].

The Least Mean Square (LMS) algorithm, a stochastic gradient-based algorithm, is one of the most widely used algorithms in adaptive filtering. It is well known for its simplicity in computation and implementation [4]. However, the LMS algorithm is very sensitive to the spectrum and power of the input signal, which makes it hard to adjust the step size and guarantee the stability of the algorithm [5, 6]. A normalized convergence parameter was therefore developed to resolve this problem by normalizing the step size with the power of the input signal, making the convergence rate independent of signal power [7]. The advantage of the new algorithm, Normalized LMS (NLMS), is noticeable when the power of the input signal is changing, making it suitable for predicting echo. However, it requires additional multiplications for the normalization terms. Both LMS and NLMS have a slow convergence rate when the input signal is highly correlated [8, 9].

Gradient-based LMS (Widrow-Hoff) and recursive least squares (RLS) algorithms are complex, especially for full-band implementation. The dynamic character of speech, including intervals of complete silence, has proven to be a problem in adaptive filtering [10]. In addition, the far-from-white spectral character of speech slows down the adaptation, causing long convergence times and making the system sensitive to changes in the acoustic room response. Finally, the near-end speech and background noise, if present, also put demands on the system design.

On the other hand, the Frequency Domain Adaptive Filter (FDAF), which was proposed in 1992 by Shynk [11], is designed to achieve a fast convergence rate and low computational cost; the desired signal and the input signal are transformed into the discrete frequency domain using the discrete Fourier transform (DFT). Instead of the linear convolution and correlation performed in LMS and NLMS adaptive filters, circular operations are performed in the frequency domain on a block-by-block basis rather than sample by sample [6, 9]. According to [9], FDAF has attractive computational cost and convergence rate when the block size equals the filter length. The major drawback of FDAF is its long delay, caused by the restore operations required to perform the circular operation [12]. Splitting the impulse response into equal parts, so that time-domain and frequency-domain convolution are mixed together, leads to a new version of FDAF called the Partitioned Block FDAF (PBFDAF). In PBFDAF, the block length can be adjusted to achieve an inexpensive acoustic echo canceller with an acceptable level of delay [9, 12].

The subband adaptive filter (SAF) [8, 13] is designed to exploit subband properties to perform more efficient signal processing. The input signals are decomposed into multiple parallel subband channels and then synthesized to construct the fullband signal at the output. Thus, the input signal and output signal are decomposed into N spectral bands using analysis filters, and each subband filter has an independent adaptation feedback and computes its error internally. The fullband error signal is constructed using a synthesis filter bank. LMS and NLMS can be adapted to the subband adaptive filter to minimize the Least Mean Error (LME); several types of SAF are proposed and explained in [8]. The delayless closed-loop structure was found more suitable for real-time applications such as AEC systems. However, decomposing the input signal and synthesizing the fullband error signal introduces delay, which is undesirable in a real-time AEC system. A comparison study between SAF and FDAF in [12] concludes that, in real-time acoustic echo cancellation, SAF introduces unwanted delay and suffers from residual errors, while FDAF does not suffer from such problems to the same extent.

The subband realization reduces the complexity by dividing the signal into subbands and applying adaptive filters to a decimated signal in each subband. In addition, the spectral variability within a subband is reduced compared to the fullband signal. To maintain transparency of the near
end speech signal, one requires the cascade of the analysis and synthesis filter banks to provide perfect reconstruction.

3. SAMPLING RATE MISMATCH
Besides choosing the right adaptive filter, there are other factors that may impact the performance of the AEC system. For example, different sampling frequencies of the D/A and A/D converters can degrade the voice quality. The deterioration of performance is due to the nonlinear time-varying disturbances of the effective echo path caused by the offset, as well as buffer overflow or underflow [14, 15]. Two kinds of sampling rate mismatch should be taken into account to improve the AEC system: first, the sampling rate of audio played back on the PC; second, the sampling rates of the A/D and D/A converters, which are not exactly the same and therefore degrade the performance of the echo cancellation system.

Stokes and Malvar [16] addressed the effects of a difference in sampling rate between the microphone signal and the playback audio signal, such as CD-quality audio at 44.1 kHz on the PC, whose sampling rate is usually higher than that of the signal captured from the microphone. In order to cancel the played-back audio signal from the signal captured by the microphone, its sampling rate should be converted before it is fed to the AEC system. Frequency-domain sampling rate conversion (SRC) was proposed in [16] to correct the sampling rate of the PC playback signal. In addition, several SRC designs for audio applications have been proposed [17, 18], using either FIR filters or Farrow filters and their modifications. In many cases, such as audio videoconferencing, an FIR filter can perform efficiently with lower penalties in terms of time and memory.

In fact, the worst case occurs when a personal computer with commercial audio hardware is used for teleconferencing, which could result in a small sampling rate offset between the near-end (microphone) signal and the far-end (speaker) signal. According to Robledo-Arnuncio et al. [19], sampling rates could vary among the components because:
• Clock signal generators have a certain tolerance in their nominal frequency.
• A temperature change can affect the operating frequency of devices.
• In different devices, the clock signals used for the audio hardware are often obtained by applying different division factors to a higher-frequency clock. Therefore, the user may not find the same expected nominal frequencies for different devices.

Arbitrary Sampling Rate Conversion (ASRC) has been found more suitable for correcting small changes in sampling rate than SRC, which is used for rational conversion. Variable Fractional Delay ASRC (VFD ASRC) is used to correct the sampling rate (increase or decrease) of digital signals [20]. A method for estimating the sampling rate offset by extending the LMS algorithm is proposed in [21]; the offset is then corrected through two mechanisms: frame-step control and phase rotation. Pawig [14], in turn, suggested using a least mean squares (LMS) adaptive algorithm to estimate the frequency offset (FOE) and matching the signals using arbitrary sampling rate conversion (ASRC). Robledo et al. [22] state that sampling rate correction can be achieved efficiently by employing a simple interpolation procedure in the time domain instead of the conventional approach of up-sampling followed by down-sampling.

Blind sampling rate offset estimation [23] is designed for compensation in beamforming applications. The proposed method utilizes speech-absent time segments, where the interference statistics are assumed to be slowly time-varying and the sampling rate offsets are assumed fixed. An estimation procedure for the sampling rate offsets is proposed based on the coherence between the received signals. Miyabe [24] proposed compensating the sampling rate offset in the short-time Fourier transform (STFT) domain by applying a linear phase shift. The effects of mismatch on acoustic echo are discussed in the next section.

4. COMPARISON AND CHALLENGES IN ACOUSTIC ECHO CANCELLATION SOLUTION
Basically, acoustic echo cancellation (AEC) is defined as a scheme for removing the echoed signal in full-duplex hands-free communication systems. Most AEC systems use an adaptive filter. Adaptive filter algorithms widely applied in acoustic echo cancellers include the Recursive Least Squares (RLS) [25] filter and the Least Mean Square (LMS) filter. The state-space Kalman filter is a recursive least-square-error method for estimating a signal distorted in transmission through a channel and observed in noise. Unlike the Kalman filter, which is used to model the dynamics of the signal process, RLS is a recursive implementation of the Wiener filter, which is used for stationary processes [26]. A relatively simpler algorithm, LMS, uses the gradient search
method to search for the least-square-error filter coefficients.

Figure 1 illustrates the operation of an AEC system, where the far-end signal x is played out of the speaker. The acoustic echo (d) is captured by the microphone, along with the near-end signal (s) and the noise signal (n); the microphone signal is indicated in Figure 3 by y [27].

The adaptive filter uses the far-end signal x to estimate the acoustic echo signal (d) that should be removed from the microphone signal. The echo signal estimated by the filter is subtracted from the microphone signal, and the result (e) no longer contains the speaker signal (acoustic echo).

However, two main challenges arise in designing an AEC system for PCs. The first challenge, discussed in the previous section, is the sampling rate mismatch of the signals at the adaptive filter inputs [14, 21, 28]. The second challenge is the re-sampling processing delay between the far-end and near-end signals, which may degrade the performance of the AEC system [22, 29].

Figure 2. Basic Operation of Acoustic Echo Cancellation (block diagram; signals: x, d’, d, e, y, n, s; blocks: D/A, A/D, adaptive filter)

In this section, the related work is discussed in terms of sampling rate mismatch and misalignment of signals at the filter inputs. Pawig [14] studied the effects of different sampling rates of the D/A converter and the A/D converter of low-quality PC audio hardware, which may cause increasing or decreasing delay; this results in lost or repeated samples, which in turn affects the adaptive filter algorithm of the AEC system and deteriorates echo estimation.

Figure 2 illustrates the effect of sampling rate offset on the echo path. In the experiment, Pawig used two offsets (∆f = 0 Hz and ∆f = 6 Hz) and compared the two cases by fixing the other parameters, setting the echo path length to M = 300 and using the NLMS algorithm with a step size of α(k) = 0.5 and a filter length of N = 300. The 6 Hz offset shifted the peak (Figure 2 d, e, f) to the right, thus affecting estimation of the echo path. On the other hand, when the offset was 0 Hz, the peak was stable (Figure 2 a, b, c) and the adaptive filter was able to remove the acoustic echo signal.

Figure 3. Effect of sampling rate offset in echo path [14]

Pawig proposed a framework to tackle this problem, as illustrated in Figure 3. The re-sampling offset is estimated by comparing the near-end signal, y_c(kT_y), and the filtered version of the far-end signal, d̂_c(t_k). The author proposed ASRC to correct the D/A frequency offset. However, the sampling rate correction is applied to the far-end signal, which adds more complexity to the system. Thus, for every VoIP connection, Pawig's framework has a new sampling rate offset ∆f to correct. Moreover, the sampling rate estimation incurs a long delay before the ASRC is updated with the far-end sampling rate.

A Frequency Domain Acoustic Echo Canceller (FDAEC) that addresses the problem of sampling rate offset is proposed by Abe [21], utilizing the concept proposed in [24] of using the STFT to estimate the sampling rate offset. Sampling rate correction is achieved through two schemes, frame-step control and phase rotation. The estimation and correction are carried out in a single feedback loop without an external re-sampling filter, as shown in Figure 4. The framework designed in [21] increases the complexity of the system, which makes it hard to adopt in real-time applications that require low complexity and low delay.
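As a rough illustration of why a sampling rate offset degrades an NLMS-based echo canceller, the following sketch runs NLMS, with the error of equation (1), on a synthetic echo with and without a small offset. Everything in it is an assumption made for illustration and is not the setup of [14]: a white-noise far-end signal, a random 16-tap echo path, a 32-tap filter, a nominal rate of 8 kHz, and an offset modelled by linear interpolation in the time domain. The function names are hypothetical.

```python
import numpy as np


def nlms_echo_canceller(x, d, num_taps, mu=0.5, eps=1e-8):
    """Run NLMS and return the error signal e(n) = d(n) - y(n)."""
    w = np.zeros(num_taps)        # adaptive filter coefficients
    x_buf = np.zeros(num_taps)    # recent far-end samples, newest first
    e = np.zeros(len(x))
    for n in range(len(x)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = x[n]
        y = w @ x_buf             # echo replica y(n)
        e[n] = d[n] - y           # equation (1)
        # Step size normalized by the input power in the filter window.
        w = w + mu * e[n] * x_buf / (x_buf @ x_buf + eps)
    return e


def apply_rate_offset(signal, ratio):
    """Model a sampling rate offset by reading the signal at times n * ratio
    (linear interpolation; an illustrative model, not a real converter)."""
    n = np.arange(len(signal))
    return np.interp(n * ratio, n, signal)


def erle_db(d, e, tail=500):
    """Echo return loss enhancement over the last `tail` samples, in dB."""
    p_d = np.mean(d[-tail:] ** 2)
    p_e = np.mean(e[-tail:] ** 2)
    return 10.0 * np.log10(p_d / max(p_e, 1e-300))


rng = np.random.default_rng(0)
fs = 8000.0        # assumed nominal sampling rate in Hz
delta_f = 6.0      # assumed sampling rate offset in Hz

x = rng.standard_normal(4000)                                # far-end signal
h = rng.standard_normal(16) * np.exp(-0.3 * np.arange(16))   # toy echo path
d_sync = np.convolve(x, h)[: len(x)]                         # echo, no offset
d_offset = apply_rate_offset(d_sync, 1.0 - delta_f / fs)     # echo with offset

erle_sync = erle_db(d_sync, nlms_echo_canceller(x, d_sync, 32))
erle_offset = erle_db(d_offset, nlms_echo_canceller(x, d_offset, 32))
print(f"ERLE without offset: {erle_sync:.1f} dB")
print(f"ERLE with offset:    {erle_offset:.1f} dB")
```

With these illustrative settings the synchronous case converges almost to numerical precision, whereas the offset case is left with a residual echo because the effective echo path keeps drifting while the filter adapts. The filter is made longer than the echo path so that the drifting path stays inside the filter window during the run.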
Figure 4. Solving The Sampling Rate Problem With AEC

[Block diagram; labels: read pointer, x(n), Ratchet FAP, y(n), interpolation, decimation control, peak position adjustment, d(n), e(n)]

A delayless adaptive filter, the delayless partitioned block frequency-domain adaptive filter (PBFDAF), is proposed in [30] to tackle the issues of delay and complexity in AEC. Delayless PBFDAF eliminates the input-output delay and has a uniform distribution of the computations. The evaluation of the proposed model was carried out only to present the complexity, delay and tracking ability of the proposed methods. More experiments should be carried out to enhance the ability of the proposed methods to cancel acoustic echo.

Ding [31] proposed a drift-compensated adaptive filtering (DCAF) scheme (Figure 5). The proposed scheme is divided into three parts. The first part consists of timing drift estimation and compensation. The timing drift is dynamically estimated by evaluating time averages and is compensated for by re-sampling the signal d(n) at the same sampling rate as the signal x(n). The re-sampling is conducted by up-sampling the signal d(n) by a factor I and then decimating it by a time-varying factor D(n) ≈ I, to obtain a sampling frequency approximately equal to the x(n) sampling rate. The second part is the Ratchet FAP (Ratchet Fast Affine Projection). Ding
The read pointer for x(n); coefficients of the adaptive filter, which are shifted one sample to the left or to the right (depending on the need) with a zero appended to the opposite end; the autocorrelation matrix estimate of the Ratchet FAP adaptive filter. Sums are also shifted and appended accordingly.

However, the decimation and interpolation used for re-sampling the near-end signal make the DCAF scheme problematic: it does not work with arbitrary sampling rate signals.

Table 1 illustrates the differences among the methods that have been proposed to solve the sampling rate mismatch problem in AEC. Each method uses different techniques to match the input signals to the AEC system, and each method has advantages and disadvantages.

5. OPEN RESEARCH ISSUES
Nowadays, AEC systems are gaining more attention due to the increase in telecommunication applications that enable hands-free speakers. On the other hand, AEC still suffers from different issues that downgrade its performance in such
[21] M. Abe and M. Nishiguchi, "Frequency domain acoustic echo canceller that handles asynchronous A/D and D/A clocks," in Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, 2014, pp. 5924-5928.
[22] E. Robledo-Arnuncio, T. S. Wada, and B.-H. Juang, "On Dealing with Sampling Rate Mismatches in Blind Source Separation and Acoustic Echo Cancellation," in Applications of Signal Processing to Audio and Acoustics, 2007 IEEE Workshop on, 2007, pp. 34-37.
[23] S. Markovich-Golan, S. Gannot, and I. Cohen, "Blind Sampling Rate Offset Estimation and Compensation in Wireless Acoustic Sensor Networks with Application to Beamforming," in Acoustic Signal Enhancement; Proceedings of IWAENC 2012; International Workshop on, 2012, pp. 1-4.
[24] S. Miyabe, N. Ono, and S. Makino, "Blind compensation of inter-channel sampling frequency mismatch with maximum likelihood estimation in STFT domain," in Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on, 2013, pp. 674-678.
[25] A. Munjal, V. Aggarwal, and G. Singh, "RLS algorithm for acoustic echo cancellation," in Nat. Conf. Challenges and Inform. Technology COIT, 2008.
[26] N. Kehtarnavaz, Real-time digital signal processing based on the TMS320C6000: Newnes, 2005.
[27] G. Schmidt, "Applications of acoustic echo control: An overview," in Proc. Eur. Signal Process. Conf. (EUSIPCO’04), Vienna, Austria, 2004, pp. 9-16.
[28] Q. Li, C. He, and W.-G. Chen, "Challenges and solutions for designing software AEC on personal computers," in Proceedings of the 11th International Workshop for Acoustic Echo and Noise Control (IWAENC'08), 2008.
[29] J. Wung, T. S. Wada, M. Souden, and B.-H. Juang, "Inter-Channel Decorrelation by Sub-Band Resampling for Multi-Channel Acoustic Echo Cancellation," Signal Processing, IEEE Transactions on, vol. 62, pp. 2127-2142, 2014.
[30] F. Yang, M. Wu, and J. Yang, "A Computationally Efficient Delayless Frequency-Domain Adaptive Filter Algorithm," Circuits and Systems II: Express Briefs, IEEE Transactions on, vol. 60, pp. 222-226, 2013.
[31] H. Ding and D. I. Havelock, "Drift-compensated adaptive filtering for improving speech intelligibility in cases with asynchronous inputs," EURASIP Journal on Advances in Signal Processing, vol. 2010, p. 95, 2011.