IEEE Transactions on Audio, Speech & Language Processing, Volume 14
Volume 14, Number 1, January 2006
- Lie Lu, Dan Liu, HongJiang Zhang:
Automatic mood detection and tracking of music audio signals. 5-18 - Ning Ma, Martin Bouchard, Rafik A. Goubran:
Speech enhancement using a masking threshold constrained Kalman filter and its heuristic implementations. 19-32 - James D. Gordy, Rafik A. Goubran:
On the perceptual performance limitations of echo cancellers in wideband telephony. 33-42 - Marcus Holmberg, David Gelbart, Werner Hemmert:
Automatic speech recognition with an adaptation model motivated by auditory processing. 43-49 - Thomas Blumensath, Mike E. Davies:
Sparse and shift-invariant representations of music. 50-57 - Sue Harding, Jon P. Barker, Guy J. Brown:
Mask estimation for missing data speech recognition based on statistics of binaural interaction. 58-67 - Slim Essid, Gaël Richard, Bertrand David:
Instrument recognition in polyphonic music based on automatic taxonomies. 68-80 - Fabian Mörchen, Alfred Ultsch, Michael Thies, Ingo Lohken:
Modeling timbre distance with temporal statistics from polyphonic music. 81-90 - Emmanuel Vincent:
Musical source separation using time-frequency source priors. 91-98 - Mads Græsbøll Christensen, Søren Holdt Jensen:
On perceptual distortion minimization and nonlinear least-squares frequency estimation. 99-109 - Alberto González, Maria de Diego, Miguel Ferrer, Gema Pinero:
Multichannel active noise equalization of interior noise. 110-122 - Yoichi Hinamoto, Hideaki Sakai:
Analysis of the filtered-X LMS algorithm and a related new algorithm for active control of multitonal noise. 123-130 - Norman H. Adams, Mark A. Bartsch, Gregory H. Wakefield:
Note segmentation and quantization for music information retrieval. 131-141 - Norman D. Cook, Takashi X. Fujisawa, Kazuaki Takami:
Evaluation of the affective valence of speech using pitch substructure. 142-151 - Anand D. Subramaniam, William R. Gardner, Bhaskar D. Rao:
Iterative joint source-channel decoding of speech spectrum parameters over an additive white Gaussian noise channel. 152-162 - Sriram Srinivasan, Jonas Samuelsson, W. Bastiaan Kleijn:
Codebook driven short-term predictor parameter estimation for speech enhancement. 163-176 - Yoshifumi Nagata, Toyota Fujioka, Masato Abe:
Speech enhancement based on auto gain control. 177-190 - Laurent Benaroya, Frédéric Bimbot, Rémi Gribonval:
Audio source separation with a single sensor. 191-199 - Kostas Kokkinakis, Asoke K. Nandi:
Multichannel blind deconvolution for source separation in convolutive mixtures of speech. 200-212 - Narendra K. Gupta, Gökhan Tür, Dilek Hakkani-Tür, Srinivas Bangalore, Giuseppe Riccardi, Mazin Gilbert:
The AT&T spoken language understanding system. 213-222 - Ben Milner, Alastair Bruce James:
Robust speech recognition over mobile and IP networks in burst-like packet loss. 223-231 - Ken Chen, Mark Hasegawa-Johnson, Aaron Cohen, Sarah Borys, Sung-Suk Kim, Jennifer Cole, Jeung-Yoon Choi:
Prosody dependent speech recognition on radio news corpus of American English. 232-245 - Néstor Becerra Yoma, Carlos Molina, Jorge F. Silva, Carlos Busso:
Modeling, estimating, and compensating low-bit rate coding distortion in speech recognition. 246-255 - Li Deng, Dong Yu, Alex Acero:
A bidirectional target-filtering model of speech coarticulation and reduction: two-stage implementation for phonetic recognition. 256-265 - Chung-Hsien Wu, Yu-Hsien Chiu, Chi-Jiun Shia, Chun-Yu Lin:
Automatic segmentation and identification of mixed-language speech using delta-BIC and LSA-based GMMs. 266-276 - Tomi Kinnunen, Evgeny Karpov, Pasi Fränti:
Real-time speaker identification and verification. 277-288 - Yang Shao, DeLiang Wang:
Model-based sequential organization in cochannel speech. 289-298 - Christof Faller:
Parametric multichannel audio coding: synthesis of coherence cues. 299-310 - Renat Vafin, W. Bastiaan Kleijn:
Rate-distortion optimized quantization in multistage audio coding. 311-320 - Antti J. Eronen, Vesa T. Peltonen, Juha T. Tuomi, Anssi Klapuri, Seppo Fagerlund, Timo Sorsa, Gaëtan Lorho, Jyri Huopaniemi:
Audio-based context recognition. 321-329 - Wei-Ho Tsai, Hsin-Min Wang:
Automatic singer recognition of popular music recordings via estimation and modeling of solo vocal signals. 330-341 - Anssi Klapuri, Antti J. Eronen, Jaakko Astola:
Analysis of the meter of acoustic musical signals. 342-355 - Vaibhava Goel, Shankar Kumar, William Byrne:
Corrections to "Segmental minimum Bayes-risk decoding for automatic speech recognition". 356-357
Volume 14, Number 2, March 2006
- Satoshi Nakamura, Konstantin Markov, Hiromi Nakaiwa, Gen-ichiro Kikui, Hisashi Kawai, Takatoshi Jitsuhiro, Jinsong Zhang, Hirofumi Yamamoto, Eiichiro Sumita, Seiichi Yamamoto:
The ATR multilingual speech-to-speech translation system. 365-376 - Liang Gu, Yuqing Gao, Fu-Hua Liu, Michael Picheny:
Concept-based speech-to-speech translation using maximum entropy models for statistical natural concept generation. 377-392 - Yasuhiro Akiba, Kenji Imamura, Eiichiro Sumita, Hiromi Nakaiwa, Shun'ichi Yamamoto, Hiroshi G. Okuno:
Using multiple edit distances to automatically grade outputs from machine translation systems. 393-402 - Tanja Schultz, Alan W. Black, Stephan Vogel, Monika Woszczyna:
Flexible speech translation systems. 403-411 - Alan Davis, Sven Nordholm, Roberto Togneri:
Statistical voice activity detection using low-variance spectrum estimation and an adaptive threshold. 412-424 - Li Deng, Alex Acero, Issam Bazzi:
Tracking vocal tract resonances using a quantized nonlinear function embedded in a temporal constraint. 425-434 - Kamran Mustafa, Ian C. Bruce:
Robust formant tracking for continuous speech with speaker variability. 435-444 - Huiqun Deng, Rabab K. Ward, Michael P. Beddoes, Murray Hodgson:
A new method for obtaining accurate estimates of vocal-tract filters and glottal waves from vowel sounds. 445-455 - Mike Brookes, Patrick A. Naylor, Jón Guðnason:
A quantitative assessment of group delay methods for identifying glottal closures in voiced speech. 456-466 - Ran D. Zilca, Brian Kingsbury, Jirí Navrátil, Ganesh N. Ramaswamy:
Pseudo pitch synchronous analysis of speech with applications to speaker recognition. 467-478 - Saeed Gazor, Reza Rashidi Far:
Adaptive maximum windowed likelihood multicomponent AM-FM signal decomposition. 479-491 - Qiang Fu, Peter Murphy:
Robust glottal source estimation based on joint source-filter model optimization. 492-501 - Etan Fisher, Joseph Tabrikian, Shlomo Dubnov:
Generalized likelihood ratio test for voiced-unvoiced decision in noisy speech using the harmonic model. 502-510 - Doroteo T. Toledano, Jesús Gómez Villardebó, Luis A. Hernández Gómez:
Initialization, training, and context-dependency in HMM-based formant tracking. 511-523 - Anand D. Subramaniam, William R. Gardner, Bhaskar D. Rao:
Low-complexity source coding using Gaussian mixture models, lattice vector quantization, and recursive coding with application to speech spectrum quantization. 524-532 - Thomas F. Quatieri, Kevin Brady, D. Messing, Joseph P. Campbell, William M. Campbell, Michael S. Brandstein, Clifford J. Weinstein, John D. Tardelli, Paul D. Gatewood:
Exploiting nonacoustic sensors for speech encoding. 533-544 - Hui Dong, Jerry D. Gibson:
Structures for SNR scalable speech coding. 545-557 - Udaya Bhaskar, Kumar Swaminathan:
Low bit-rate voice compression based on frequency domain interpolative techniques. 558-576 - Harald Gustafsson, Ulf A. Lindgren, Ingvar Claesson:
Low-complexity feature-mapped speech bandwidth extension. 577-588 - Olivier Pietquin, Thierry Dutoit:
A probabilistic framework for dialog simulation and optimal strategy learning. 589-599 - Bojana Gajic, Kuldip K. Paliwal:
Robust speech recognition in noisy environments based on subband spectral centroid histograms. 600-608 - Hossein Najaf-Zadeh, Peter Kabal:
Perceptual coding of narrow-band audio signals at low rates. 609-622 - Ashish Aggarwal, Shankar L. Regunathan, Kenneth Rose:
A trellis-based optimal parameter value selection for audio coding. 623-633 - Pongtep Angkititrakul, John H. L. Hansen:
Advances in phone-based modeling for automatic accent classification. 634-646 - Chung-Hsien Wu, Chia-Hsin Hsieh:
Multiple change-point audio segmentation and classification using an MDL-based Gaussian model. 647-657 - Ngwa A. Shusina, Boaz Rafaely:
Unbiased adaptive feedback cancellation in hearing aids by closed-loop identification. 658-665 - Hiroshi Saruwatari, Toshiya Kawamura, Tsuyoki Nishikawa, Akinobu Lee, Kiyohiro Shikano:
Blind source separation based on a fast-convergence algorithm combining ICA and beamforming. 666-678 - Ali Taylan Cemgil, Hilbert J. Kappen, David Barber:
A generative model for music transcription. 679-694 - Mitsuko Aramaki, Richard Kronland-Martinet:
Analysis-synthesis of impact sounds by real-time dynamic filtering. 695-705 - Kelvin Chee-Mun Lee, Woon-Seng Gan:
Bandwidth-efficient recursive pth-order equalization for correcting baseband distortion in parametric loudspeakers. 706-710 - L. E. Rees, Stephen J. Elliott:
Adaptive algorithms for active sound-profiling. 711-719 - Muhammad Tahir Akhtar, Masahide Abe, Masayuki Kawamata:
A new variable step size LMS algorithm-based method for improved online secondary path modeling in active noise control systems. 720-726 - Thomas Hain, Philip C. Woodland, Gunnar Evermann, Mark J. F. Gales, Xunying Liu, Gareth L. Moore, Daniel Povey, Lan Wang:
Corrections to "Automatic Transcription of Conversational Telephone Speech". 727-727
Volume 14, Number 3, May 2006
- S. Ramamohan, Samarendra Dandapat:
Sinusoidal model-based analysis and classification of stressed speech. 737-746 - Joon-Hyuk Chang, Nam Soo Kim:
A new structural approach in system identification with generalized analysis-by-synthesis for robust speech coding. 747-751 - Christoffer Asgaard Rødbro, Jesper Jensen, Richard Heusdens:
Rate-distortion optimal time-segmentation and redundancy selection for VoIP. 752-763 - Volodya Grancharov, Jonas Samuelsson, W. Bastiaan Kleijn:
On causal algorithms for speech enhancement. 764-773 - Mingyang Wu, DeLiang Wang:
A two-stage algorithm for one-microphone reverberant speech enhancement. 774-784 - Andy W. H. Khong, Patrick A. Naylor:
Stereophonic acoustic echo cancellation employing selective-tap adaptive algorithms. 785-796 - Jen-Tzung Chien, Chih-Hsien Huang:
Aggregate a posteriori linear regression adaptation. 797-807 - Jeih-Weih Hung, Lin-Shan Lee:
Optimization of temporal filters for constructing robust features in speech recognition. 808-832 - Ji Ming:
Noise compensation for speech recognition with arbitrary additive noise. 833-844 - Florian Hilger, Hermann Ney:
Quantile based histogram equalization for noise robust large vocabulary speech recognition. 845-854 - Shinji Watanabe, Atsushi Sako, Atsushi Nakamura:
Automatic determination of acoustic model topology using variational Bayesian estimation and clustering for large vocabulary continuous speech recognition. 855-872 - Hong-Kwang Jeff Kuo, Yuqing Gao:
Maximum entropy direct models for speech recognition. 873-881 - Khe Chai Sim, Mark J. F. Gales:
Minimum phone error training of precision matrix models. 882-889 - Jorge F. Silva, Shrikanth S. Narayanan:
Average divergence distance as a statistical discrimination measure for hidden Markov models. 890-906 - Rongqing Huang, John H. L. Hansen:
Advances in unsupervised audio classification and segmentation for the broadcast news and NGSW corpora. 907-919 - Nima Mesgarani, Malcolm Slaney, Shihab A. Shamma:
Discrimination of speech from nonspeech based on multiscale spectro-temporal modulations. 920-930 - R. Sant'Ana, Rosangela Coelho, Abraham Alcaim:
Text-independent speaker recognition based on the Hurst parameter and the multidimensional fractional Brownian motion model. 931-940 - Enrique Vidal, Francisco Casacuberta, Luis Rodríguez, Jorge Civera, Carlos D. Martínez-Hinarejos:
Computer-assisted translation using speech recognition. 941-951 - Athanasios Mouchtaris, Jan Van der Spiegel, Paul Mueller:
Nonparallel training for voice conversion based on a parameter adaptation approach. 952-963 - Jack Mullen, David M. Howard, Damian T. Murphy:
Waveguide physical modeling of vocal tract acoustics: flexible formant bandwidth control from increased model dimensionality. 964-971 - K. Sreenivasa Rao, B. Yegnanarayana:
Prosody modification using instants of significant excitation. 972-980 - Ki-Seung Lee:
MLP-based phone boundary refining for a TTS database. 981-989 - Jerome R. Bellegarda:
A global, boundary-centric framework for unit selection text-to-speech synthesis. 990-997 - Cheng-Han Yang, Hsueh-Ming Hang:
Cascaded trellis-based rate-distortion control algorithm for MPEG-4 advanced audio coding. 998-1007 - Ben Supper, Tim Brookes, Francis Rumsey:
An auditory onset detection algorithm for improved automatic source localization. 1008-1017 - Woon-Seng Gan, Jun Yang, Khim Sia Tan, Meng Hwa Er:
A digital beamsteerer for difference frequency in a parametric array. 1018-1025 - Rui Cai, Lie Lu, Alan Hanjalic, HongJiang Zhang, Lian-Hong Cai:
A flexible framework for key audio effects detection and auditory context inference. 1026-1039 - Dimitrios K. Fragoulis, Constantin Papaodysseus, Mihalis Exarhos, George Roussopoulos, Thanasis Panagopoulos, Dimitrios Kamarotos:
Automated classification of piano-guitar notes. 1040-1050 - Harald Viste, Gianpaolo Evangelista:
A method for separation of overlapping partials based on similarity of temporal envelopes in multichannel mixtures. 1051-1061 - Serkan Kiranyaz, Ahmad Farooq Qureshi, Moncef Gabbouj:
A generic audio classification and segmentation approach for multimedia indexing and retrieval. 1062-1081 - Timothy J. Hazen:
Visual model structures and synchrony constraints for audio-visual speech recognition. 1082-1089
Volume 14, Number 4, July 2006
- John F. Pitrelli, Raimo Bakis, Ellen Eide, Raul Fernandez, Wael Hamza, Michael A. Picheny:
The IBM expressive text-to-speech synthesis system for American English. 1099-1108 - Chung-Hsien Wu, Chi-Chun Hsia, Te-Hsien Liu, Jhing-Fa Wang:
Voice conversion using duration-embedded bi-HMMs for expressive speech synthesis. 1109-1116 - Marc Schröder:
Expressing degree of activation in synthetic speech. 1128-1136 - Mariët Theune, K. Meijs, Dirk Heylen, Roeland Ordelman:
Generating expressive speech for storytelling applications. 1137-1144 - Jianhua Tao, Yongguo Kang, Aijun Li:
Prosody conversion from neutral speech to emotional speech. 1145-1154 - Wentao Gu, Keikichi Hirose, Hiroya Fujisaki:
Modeling the effects of emphasis and question on fundamental frequency contours of Cantonese utterances. 1155-1170 - N. Campbell:
Conversational speech synthesis and the need for some laughter. 1171-1178 - Taishih Chi, Shihab A. Shamma:
Spectrum restoration from multiscale auditory phase singularities by generalized projections. 1179-1192 - Akira Watanabe, Tadashi Sakata:
Reliable methods for estimating relative vocal tract lengths from formant trajectories of common words. 1193-1204 - W. C. Chu:
Embedded quantization of line spectral frequencies using a multistage tree-structured vector quantizer. 1205-1217 - Jingdong Chen, Jacob Benesty, Yiteng Arden Huang, Simon Doclo:
New insights into the noise reduction Wiener filter. 1218-1234 - Yunxin Zhao, Rong Hu, Xiaolong Li:
Speedup convergence and reduce noise for enhanced speech separation and recognition. 1235-1244 - Jen-Tzung Chien, Bo-Cheng Chen:
A new independent component analysis for speech recognition and separation. 1245-1254 - Satya Dharanipragada, Karthik Visweswariah:
Gaussian mixture models with covariances or precisions in shared multiple subspaces. 1255-1266 - Brian Kan-Wing Mak, Roger Wend-Huu Hsiao, Simon Ka-Lung Ho, James T. Kwok:
Embedded kernel eigenvoice speaker adaptation and its implication to reference speaker weighting. 1267-1280 - Diamantino Caseiro, Isabel Trancoso:
A specialized on-the-fly algorithm for lexicon and language model composition. 1281-1291 - Toshihiko Abe, Masaaki Honda:
Sinusoidal model based on instantaneous frequency attractors. 1292-1300 - Hui Ye, Steve J. Young:
Quality-enhanced voice morphing using maximum likelihood transformations. 1301-1312 - Ashish Aggarwal, Shankar L. Regunathan, Kenneth Rose:
Efficient bit-rate scalability for weighted squared error optimization in audio coding. 1313-1327 - Olivier Derrien, Pierre Duhamel, Maurice Charbit, Gaël Richard:
A new quantization optimization algorithm for the MPEG advanced audio coder using a statistical subband model of the quantization noise. 1328-1339 - Mads Græsbøll Christensen, Steven van de Par:
Efficient parametric coding of transients. 1340-1351 - Rongshan Yu, Susanto Rahardja, Xiao Lin, Chi Chung Ko:
A fine granular scalable to lossless audio coder. 1352-1363 - T. Umayahara, Haruhide Hokari, Shoji Shimada:
Stereo width control using interpolation and extrapolation of time-frequency representation. 1364-1377 - Fotios Talantzis, Darren B. Ward, Patrick A. Naylor:
Performance analysis of dynamic acoustic source separation in reverberant rooms. 1378-1390 - Paulo A. A. Esquef, Luiz W. P. Biscainho:
An efficient model-based multirate method for reconstruction of audio signals across long gaps. 1391-1400 - Slim Essid, Gaël Richard, Bertrand David:
Musical instrument recognition by pairwise classification strategies. 1401-1412 - Ixone Arroabarren, Xavier Rodet, Alfonso Carlosena:
On the measurement of the instantaneous frequency and amplitude of partials in vocal vibrato. 1413-1421 - Ixone Arroabarren, Alfonso Carlosena:
Inverse filtering in singing voice: a critical analysis. 1422-1431 - Saman S. Abeysekera, Kabi Prakash Padhi:
An investigation of window effects on the frequency estimation using the phase vocoder. 1432-1439 - Axel Röbel:
Adaptive additive modeling with continuous parameter trajectories. 1440-1453 - Crispin H. V. Cooper, Damian T. Murphy, David M. Howard, Alexander Tyrrell:
Singing synthesis with an evolved physical model. 1454-1461 - Emmanuel Vincent, Rémi Gribonval, Cédric Févotte:
Performance measurement in blind audio source separation. 1462-1469 - Panayiotis G. Georgiou, Chris Kyriakakis:
Robust maximum likelihood source localization: the case for sub-Gaussian versus Gaussian. 1470-1480
Volume 14, Number 5, September 2006
- Li Deng, Dong Yu, Alex Acero:
Structured speech modeling. 1492-1504 - Claude Barras, Xuan Zhu, Sylvain Meignier, Jean-Luc Gauvain:
Multistage speaker diarization of broadcast news. 1505-1512 - Mark J. F. Gales, Do Yeong Kim, Philip C. Woodland, Ho Yin Chan, David Mrva, Rohit Sinha, S. E. Tranter:
Progress in the CU-HTK broadcast news transcription system. 1513-1525 - Yang Liu, Elizabeth Shriberg, Andreas Stolcke, Dustin Hillard, Mari Ostendorf, Mary P. Harper:
Enriching speech recognition with automatic detection of sentence boundaries and disfluencies. 1526-1540 - Spyridon Matsoukas, Jean-Luc Gauvain, Gilles Adda, Thomas Colthurst, Chia-Lin Kao, Owen Kimball, Lori Lamel, Fabrice Lefèvre, Jeff Z. Ma, John Makhoul, Long Nguyen, Rohit Prasad, Richard M. Schwartz, Holger Schwenk, Bing Xiang:
Advances in transcription of broadcast news and conversational telephone speech within the combined EARS BBN/LIMSI system. 1541-1556 - S. E. Tranter, Douglas A. Reynolds:
An overview of automatic speaker diarization systems. 1557-1565 - Matthew Lease, Mark Johnson, Eugene Charniak:
Recognizing disfluencies in conversational speech. 1566-1573 - Jui-Feng Yeh, Chung-Hsien Wu:
Edit disfluency detection and correction using a cleanup language model and an alignment model. 1574-1583 - Hui Jiang, Xinwei Li, Chaojun Liu:
Large margin hidden Markov models for speech recognition. 1584-1595 - Stanley F. Chen, Brian Kingsbury, Lidia Mangu, Daniel Povey, George Saon, Hagen Soltau, Geoffrey Zweig:
Advances in speech transcription at IBM under the DARPA EARS program. 1596-1608 - Christoffer Asgaard Rødbro, Manohar N. Murthi, Søren Vang Andersen, Søren Holdt Jensen:
Hidden Markov model-based packet loss concealment for voice over IP. 1609-1623 - Farshad Lahouti, Ahmad R. Fazel, A. H. Safavi-Naeini, Amir K. Khandani:
Single and double frame coding of speech LPC parameters using a lattice-based quantization scheme. 1624-1632 - Herbert Buchner, Jacob Benesty, Tomas Gänsler, Walter Kellermann:
Robust extended multidelay filter and double-talk detector for acoustic echo cancellation. 1633-1644 - Marcin Kuropatwinski, W. Bastiaan Kleijn:
Estimation of the short-term predictor parameters of speech under noisy conditions. 1645-1655 - Rile Hu, Chengqing Zong, Bo Xu:
An approach to automatic acquisition of translation templates based on phrase structure extraction and alignment. 1656-1663 - Alon Lavie, Fabio Pianesi, Lori S. Levin:
The NESPOLE! System for multilingual speech communication over the Internet. 1664-1673 - Gen-ichiro Kikui, Seiichi Yamamoto, Toshiyuki Takezawa, Eiichiro Sumita:
Comparative study on corpora for speech translation. 1674-1682 - Xiao Li, Jonathan Malkin, Jeff A. Bilmes:
A high-speed, low-resource ASR back-end based on custom arithmetic. 1683-1693 - Kai Yu, Mark J. F. Gales:
Discriminative cluster adaptive training. 1694-1703 - Chak-Fai Li, Man-Hung Siu, Jeff Siu-Kei Au-Yeung:
Recursive likelihood evaluation and fast search algorithm for polynomial segment model with application to speech recognition. 1704-1718 - Jen-Tzung Chien:
Association pattern language modeling. 1719-1728 - Andreas Stolcke, Barry Y. Chen, Horacio Franco, Venkata Ramana Rao Gadde, Martin Graciarena, Mei-Yuh Hwang, Katrin Kirchhoff, Arindam Mandal, Nelson Morgan, Xin Lei, Tim Ng, Mari Ostendorf, M. Kemal Sönmez, Anand Venkataraman, Dimitra Vergyri, Wen Wang, Jing Zheng, Qifeng Zhu:
Recent innovations in speech-to-text transcription at SRI-ICSI-UW. 1729-1744 - Nicolae Duta, Richard M. Schwartz, John Makhoul:
Analysis of the errors produced by the 2004 BBN speech recognition system in the DARPA EARS evaluations. 1745-1753 - Siddharth Mathur, Brad H. Story, Jeffrey J. Rodríguez:
Vocal-tract modeling: fractional elongation of segment lengths in a waveguide model with half-sample delays. 1754-1762 - Jithendra Vepa, Simon King:
Subjective evaluation of join cost and smoothing methods for unit selection speech synthesis. 1763-1771 - Cléo Baras, Nicolas Moreau, Przemyslaw Dymarski:
Controlling the inaudibility and maximizing the robustness in an audio annotation watermarking system. 1772-1782 - Masataka Goto:
A chorus section detection method for musical audio signals and its application to a music listening station. 1783-1794 - Aggelos Pikrakis, Sergios Theodoridis, Dimitris Kamarotos:
Classification of musical patterns using variable duration hidden Markov models. 1795-1807 - Laurent Daudet:
Sparse and structured decompositions of signals with the molecular matching pursuit. 1808-1816 - Vincent Verfaille, Udo Zölzer, Daniel Arfib:
Adaptive digital audio effects (a-DAFx): a new class of sound transformations. 1817-1831 - Fabien Gouyon, Anssi Klapuri, Simon Dixon, M. Alonso, George Tzanetakis, C. Uhle, Pedro Cano:
An experimental comparison of audio tempo induction algorithms. 1832-1844 - Mark R. Every, John E. Szymanski:
Separation of synchronous pitched notes by spectral filtering of harmonics. 1845-1856 - Sen M. Kuo, Ajay B. Puvvala:
Effects of frequency separation in periodic active noise control systems. 1857-1866 - Guangji Shi, Maryam Modir Shanechi, Parham Aarabi:
On the importance of phase in human speech recognition. 1867-1874 - Debi Prasad Das, Swagat Ranjan Mohapatra, Aurobinda Routray, Tapan Kumar Basu:
Filtered-s LMS algorithm for multichannel active control of nonlinear noise processes. 1875-1880
Volume 14, Number 6, November 2006
- Antony W. Rix, John G. Beerends, Doh-Suk Kim, Peter Kroon, Oded Ghitza:
Objective Assessment of Speech and Audio Quality - Technology and Applications. 1890-1901 - Rainer Huber, Birger Kollmeier:
PEMO-Q - A New Method for Objective Audio Quality Assessment Using a Model of Auditory Perception. 1902-1911 - Abhijit Karmakar, Arun Kumar, R. K. Patney:
A Multiresolution Model of Auditory Excitation Pattern and Its Application to Objective Evaluation of Perceived Speech Quality. 1912-1923 - Ludovic Malfait, Jens Berger, Martin Kastner:
P.563 - The ITU-T Standard for Single-Ended Speech Quality Assessment. 1924-1934 - Tiago H. Falk, Wai-Yip Chan:
Single-Ended Speech Quality Measurement Using Machine Learning Methods. 1935-1947 - Volodya Grancharov, David Yuheng Zhao, Jonas Lindblom, W. Bastiaan Kleijn:
Low-Complexity, Nonintrusive Speech Quality Assessment. 1948-1956 - Alexander Raake:
Short- and Long-Term Packet Loss Behavior: Towards Speech Quality Prediction for Arbitrary Loss Distributions. 1957-1968 - Sebastian Möller, Alexander Raake, Nobuhiko Kitawaki, Akira Takahashi, Marcel Wältermann:
Impairment Factor Framework for Wide-Band Speech Codecs. 1969-1976 - S. R. Broom:
VoIP Quality Assessment: Taking Account of the Edge-Device. 1977-1983 - Akira Takahashi, Atsuko Kurashima, Hideaki Yoshino:
Objective Assessment Methodology for Estimating Conversational Quality in VoIP. 1984-1993 - Sunish George, Slawomir K. Zielinski, Francis Rumsey:
Feature Extraction for the Prediction of Multichannel Spatial Audio Fidelity. 1994-2005 - Takeshi Yamada, Masakazu Kumakura, Nobuhiko Kitawaki:
Performance Estimation of Speech Recognition System Under Noise Conditions Using Objective Quality Measures and Artificial Voice. 2006-2013 - Peng Li, Yong Guan, Bo Xu, Wenju Liu:
Monaural Speech Separation Based on Computational Auditory Scene Analysis and Objective Quality Assessment of Speech. 2014-2023 - Georgios Evangelopoulos, Petros Maragos:
Multiband Modulation Energy Tracking for Noisy Speech Detection. 2024-2038 - Mohamed A. Deriche, Daryl Ning:
A Novel Audio Coding Scheme Using Warped Linear Prediction Model and the Discrete Wavelet Transform. 2039-2048 - John H. L. Hansen, V. Radhakrishnan, Kathryn Hoberg Arehart:
Speech Enhancement Based on Generalized Minimum Mean Square Error Estimators and Masking Properties of the Auditory System. 2049-2063 - Richard C. Hendriks, Richard Heusdens, Jesper Jensen:
Adaptive Time Segmentation for Improved Speech Enhancement. 2064-2074 - Tiemin Mei, Jiangtao Xi, Fuliang Yin, Alfred Mertins, Joe F. Chicharo:
Blind Source Separation Based on Time-Domain Optimization of a Frequency-Domain Independence Criterion. 2075-2085 - Yoshifumi Nagata, K. Mitsubori, T. Kagi, Toyota Fujioka, Masato Abe:
Fast Implementation of KLT-Based Speech Enhancement Using Vector Quantization. 2086-2097 - Cyril Plapous, Claude Marro, Pascal Scalart:
Improved Signal-to-Noise Ratio Estimation for Speech Enhancement. 2098-2108 - Michael L. Seltzer, Richard M. Stern:
Subband Likelihood-Maximizing Beamforming for Speech Recognition in Reverberant Environments. 2109-2121 - Man-Hung Siu, Arthur Chan:
A Robust Viterbi Algorithm Against Impulsive Noise With Application to Speech Recognition. 2122-2133 - Ye Tian, Jian-Lai Zhou, Hui Lin, Hui Jiang:
Tree-Based Covariance Modeling of Hidden Markov Models. 2134-2146 - Jian Wu, Qiang Huo:
An Environment-Compensated Minimum Classification Error Training Approach Based on Stochastic Vector Mapping. 2147-2155 - Kevin W. Wilson, Trevor Darrell:
Learning a Precedence Effect-Like Weighting Function for the Generalized Cross-Correlation Framework. 2156-2164 - Hiroshi Sawada, Shoko Araki, Ryo Mukai, Shoji Makino:
Blind Extraction of Dominant Target Sources Using ICA and Time-Frequency Masking. 2165-2173 - Cédric Févotte, Simon J. Godsill:
A Bayesian Approach for Blind Separation of Sparse Sources. 2174-2188 - Yegui Xiao, Liying Ma, Khashayar Khorasani, Akira Ikuta:
A New Robust Narrowband Active Noise Control System in the Presence of Frequency Mismatch. 2189-2200 - Yoshikazu Yokotani, Ralf Geiger, G. D. T. Schuller, Soontorn Oraintara, K. R. Rao:
Lossless Audio Coding Using the IntMDCT and Rounding Error Shaping. 2201-2211 - Toshio Irino, Roy D. Patterson, Hideki Kawahara:
Speech Segregation Using an Auditory Vocoder With Event-Synchronous Enhancements. 2212-2221 - Toshio Irino, Roy D. Patterson:
A Dynamic Compressive Gammachirp Auditory Filterbank. 2222-2232 - Wei-Chen Chang, Alvin Wen-Yu Su:
A Multichannel Recurrent Network Analysis/Synthesis Model for Coupled-String Instruments. 2233-2241 - Juan Pablo Bello, Laurent Daudet, Mark B. Sandler:
Automatic Piano Transcription Using Frequency and Time-Domain Information. 2242-2251 - Panu Somervuo, Aki Härmä, Seppo Fagerlund:
Parametric Representations of Bird Sounds for Automatic Species Recognition. 2252-2263 - Ying Li, Chitra Dorai:
Instructional Video Content Analysis Using Audio Information. 2264-2274