aud.filterbank
This operator is a specialisation of the general signal processing operator sig.filterbank focused on auditory models. The filterbank decomposition models an actual process of human perception, corresponding to the distribution of frequencies into critical bands in the cochlea.
Same as in sig.filterbank.
Two basic types of filterbanks are proposed:
aud.filterbank(…,'Gammatone') carries out a Gammatone filterbank decomposition (Patterson et al., 1992), which is known to simulate the response of the basilar membrane well. It is based on an Equivalent Rectangular Bandwidth (ERB) filterbank, meaning that the width of each band is determined by a particular psychoacoustical law. For Gammatone filterbanks, sig.filterbank calls the Auditory Toolbox routines MakeERBFilters and ERBfilterbank. This is the default choice when calling aud.filterbank.
Figure: ten ERB filters between 100 and 8000 Hz (Slaney, 1998).
- aud.filterbank(…,'Lowest',f) indicates the lowest frequency f, in Hz. Default value: 50 Hz.
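To make the ERB spacing more concrete, here is a minimal Python sketch. It is not the toolbox code: it assumes the Glasberg and Moore constants and the centre-frequency formula used in Slaney's Auditory Toolbox, and a 4th-order gammatone impulse response with bandwidth 1.019 × ERB. All function names are hypothetical, and the sketch lists channels from high to low frequency, which may differ from the channel ordering of aud.filterbank.

```python
import math

# Assumed constants (Glasberg and Moore), as used in Slaney's Auditory Toolbox.
EAR_Q = 9.26449
MIN_BW = 24.7


def erb_center_frequencies(fs, n_channels=10, lowest=50.0):
    """Centre frequencies (Hz), uniformly spaced on the ERB scale.

    The list runs from just below fs/2 down to exactly `lowest`.
    """
    c = EAR_Q * MIN_BW
    high = fs / 2.0
    step = (math.log(lowest + c) - math.log(high + c)) / n_channels
    return [-c + math.exp(k * step) * (high + c)
            for k in range(1, n_channels + 1)]


def erb_bandwidth(fc):
    """Equivalent rectangular bandwidth (Hz) at centre frequency fc."""
    return fc / EAR_Q + MIN_BW


def gammatone_ir(fc, fs, n_samples=400, order=4):
    """Sampled gammatone impulse response: t^(order-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t)."""
    b = 1.019 * erb_bandwidth(fc)
    out = []
    for k in range(n_samples):
        t = k / fs
        out.append(t ** (order - 1)
                   * math.exp(-2.0 * math.pi * b * t)
                   * math.cos(2.0 * math.pi * fc * t))
    return out


# Ten channels, lowest frequency 50 Hz, for a signal sampled at 16 kHz.
centers = erb_center_frequencies(16000, n_channels=10, lowest=50.0)
widths = [erb_bandwidth(fc) for fc in centers]
```

The bandwidth grows with the centre frequency, so high-frequency channels are much wider than low-frequency ones, as in the cochlea.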
aud.filterbank(…,'2Channels') performs a computational simplification of the filterbank using just two channels, one for low frequencies (below 1000 Hz) and one for high frequencies (above 1000 Hz) (Tolonen and Karjalainen, 2000). On the high-frequency channel, the envelope is extracted using half-wave rectification followed by the same low-pass filter as the one used for the low-frequency channel. This filterbank is mainly used for multi-pitch extraction (cf. aud.pitch).
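The signal flow of the two-channel model can be sketched in a few lines of Python. This is an illustration only, not the toolbox implementation: it assumes crude one-pole filters as stand-ins for the actual filters, obtains the high-frequency channel by subtracting the low-pass output from the input, and uses hypothetical function names throughout.

```python
import math


def one_pole_lowpass(x, cutoff, fs):
    """First-order recursive low-pass filter (a simple stand-in filter)."""
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff / fs)
    y, state = [], 0.0
    for s in x:
        state += a * (s - state)
        y.append(state)
    return y


def two_channel(x, fs, split=1000.0):
    """Return (low channel, envelope of the high channel)."""
    low = one_pole_lowpass(x, split, fs)
    # Complementary high-pass: whatever the low-pass removed.
    high = [s - l for s, l in zip(x, low)]
    # Envelope extraction: half-wave rectification, then the same low-pass.
    rectified = [max(h, 0.0) for h in high]
    envelope = one_pole_lowpass(rectified, split, fs)
    return low, envelope


def rms(x):
    return math.sqrt(sum(s * s for s in x) / len(x))


def tone(freq, fs, n_samples=4410):
    return [math.sin(2.0 * math.pi * freq * n / fs) for n in range(n_samples)]


fs = 44100
low_a, env_a = two_channel(tone(200.0, fs), fs)    # tone below the split
low_b, env_b = two_channel(tone(5000.0, fs), fs)   # tone above the split
```

With these stand-in filters, a 200 Hz tone ends up mostly in the low channel, while a 5000 Hz tone mostly contributes to the envelope of the high channel.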
- aud.filterbank(…,'NbChannels',N) specifies the number of channels in the bank. By default: N = 10. This option has no effect with '2Channels'.
- aud.filterbank(…,'Channel',c), or aud.filterbank(…,'Channels',c), outputs only the channels whose ranks are indicated in the array c (default: c = (1:N)).
sig.filterbank(…,p) specifies predefined filterbanks, all implemented using elliptic filters, by default of order 4:
- p = 'Mel': Mel scale (cf. aud.spectrum(…,'Mel')).
- p = 'Bark': Bark scale (cf. aud.spectrum(…,'Bark')).
- p = 'Scheirer': proposed in (Scheirer, 1998), corresponds to sig.filterbank(…,'CutOff',[-Inf 200 400 800 1600 3200 Inf]).
- p = 'Klapuri': proposed in (Klapuri, 1999), corresponds to sig.filterbank(…,'CutOff',44*[2.^ ([ 0:2, ( 9+(0:17) )/3 ]) ]).
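The two 'CutOff' vectors above are easier to read once expanded into numbers. The following Python sketch evaluates them; it assumes that each pair of consecutive cut-off frequencies delimits one band, and the helper name bands is hypothetical.

```python
import math

# 'Scheirer': cut-off frequencies in Hz, open-ended at both extremes.
scheirer = [-math.inf, 200, 400, 800, 1600, 3200, math.inf]

# 'Klapuri': 44 * 2.^([0:2, (9+(0:17))/3]) in MATLAB notation, i.e. three
# octave steps from 44 Hz, then third-octave steps from 352 Hz upwards.
klapuri_exponents = [0, 1, 2] + [(9 + k) / 3 for k in range(18)]
klapuri = [44 * 2 ** e for e in klapuri_exponents]


def bands(cutoffs):
    """Pair consecutive cut-off frequencies into (low, high) band edges."""
    return list(zip(cutoffs[:-1], cutoffs[1:]))


scheirer_bands = bands(scheirer)   # 6 bands
klapuri_bands = bands(klapuri)     # 20 bands, from 44 Hz to about 17.9 kHz
```

Under that assumption, 'Scheirer' yields six bands, the first and last being open-ended, while 'Klapuri' yields twenty bands that become narrower on a logarithmic scale above 352 Hz.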
aud.filterbank('ragtime.wav')
If the number of channels exceeds 20, the audio waveform decomposition is represented as a single bitmap image, in which each row of pixels represents one channel, in successive order:
aud.filterbank('ragtime.wav','NbChannels',40)