IEEE Transactions on Audio, Speech & Language Processing, Volume 20
Volume 20, Number 1, 2012
- Helen Meng:
Farewell Editorial. 1 - Li Deng:
Inaugural Editorial: Riding the Tidal Wave of Human-Centric Information Processing - Innovate, Outreach, Collaborate, Connect, Expand, and Win. 2-3 - Dong Yu, Geoffrey E. Hinton, Nelson Morgan, Jen-Tzung Chien, Shigeki Sagayama:
Introduction to the Special Section on Deep Learning for Speech and Language Processing. 4-6 - Nelson Morgan:
Deep and Wide: Multiple Layers in Automatic Speech Recognition. 7-13 - Abdel-rahman Mohamed, George E. Dahl, Geoffrey E. Hinton:
Acoustic Modeling Using Deep Belief Networks. 14-22 - Garimella S. V. S. Sivaram, Hynek Hermansky:
Sparse Multilayer Perceptron for Phoneme Recognition. 23-29 - George E. Dahl, Dong Yu, Li Deng, Alex Acero:
Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition. 30-42 - George Saon, Jen-Tzung Chien:
Bayesian Sensing Hidden Markov Models. 43-54 - Jen-Tzung Chien, Chuang-Hua Chueh:
Topic-Based Hierarchical Segmentation. 55-66 - I. Yücel Özbek, Mark Hasegawa-Johnson, Mübeccel Demirekler:
On Improving Dynamic State Space Approaches to Articulatory Inversion With MAP-Based Parameter Estimation. 67-81 - Mark R. P. Thomas, Jón Guðnason, Patrick A. Naylor:
Estimation of Glottal Closing and Opening Instants in Voiced Speech Using the YAGA Algorithm. 82-91 - Jesper Jensen, Richard C. Hendriks:
Spectral Magnitude Minimum Mean-Square Error Estimation Using Binary and Continuous Gain Functions. 92-102 - Hen-Geul Yeh, Carlos Rangel Ruiz:
Fixed-Point Implementation of Cascaded Forward-Backward Adaptive Predictors. 103-107 - Tobias May, Steven van de Par, Armin Kohlrausch:
Noise-Robust Speaker Recognition Combining Missing Data Techniques and Universal Background Modeling. 108-121 - Alberto Carini, Stefania Cecchi, Francesco Piazza, Ivan Omiciuolo, Giovanni L. Sicuranza:
Multiple Position Room Response Equalization in Frequency Domain. 122-135 - Iman S. Mossavat, Petko Nikolov Petkov, W. Bastiaan Kleijn, Oliver Amft:
A Hierarchical Bayesian Approach to Modeling Heterogeneity in Speech Quality Assessment. 136-146 - Thomas Ulrich Christiansen, Steven Greenberg:
Perceptual Confusions Among Consonants, Revisited - Cross-Spectral Integration of Phonetic-Feature Information and Consonant Recognition. 147-161 - Enzo De Sena, Hüseyin Hacihabiboglu, Zoran Cvetkovic:
On the Design and Implementation of Higher Order Differential Microphones. 162-174 - Ted S. Wada, Biing-Hwang Juang:
Enhancement of Residual Echo for Robust Acoustic Echo Cancellation. 175-189 - Adam M. Stark, Mark D. Plumbley:
Performance Following: Real-Time Prediction of Musical Sequences Without a Score. 190-199 - Matthias Mauch, Hiromasa Fujihara, Masataka Goto:
Integrating Additional Chord Information Into HMM-Based Lyrics-to-Audio Alignment. 200-210 - Berlin Chen, Shih-Hsiang Lin:
A Risk-Aware Modeling Framework for Speech Summarization. 211-222 - Richard C. Hendriks, Timo Gerkmann:
Noise Correlation Matrix Estimation for Multi-Microphone Speech Enhancement. 223-233 - Giovanni L. Sicuranza, Alberto Carini:
On the BIBO Stability Condition of Adaptive Recursive FLANN Filters With Application to Nonlinear Active Noise Control. 234-245 - Francesco Nesta, Maurizio Omologo:
Generalized State Coherence Transform for Multidimensional TDOA Estimation of Multiple Sources. 246-260 - Yasmín Montenegro M., José Carlos M. Bermudez:
Transient Mean-Square Analysis of Prediction Error Method-Based Adaptive Feedback Cancellation in Hearing Aids. 261-275 - Lei Xie, Lilei Zheng, Zihan Liu, Yanning Zhang:
Laplacian Eigenmaps for Automatic Story Segmentation of Broadcast News. 276-289 - Norberto Degara, Enrique Argones-Rúa, Antonio S. Pena, Soledad Torres-Guijarro, Matthew E. P. Davies, Mark D. Plumbley:
Reliability-Informed Beat Tracking of Musical Signals. 290-301 - Jen-Tzung Chien, Hsin-Lung Hsieh:
Convex Divergence ICA for Blind Source Separation. 302-313 - Hangil Moon:
A Low-Complexity Design for an MP3 Multi-Channel Audio Decoding System. 314-321 - Celia Shahnaz, Wei-Ping Zhu, M. Omair Ahmad:
Pitch Estimation Based on a Harmonic Sinusoidal Autocorrelation Model and a Time-Domain Matching Scheme. 322-335 - Claudio Garretón, Néstor Becerra Yoma:
Telephone Channel Compensation in Speaker Verification Using a Polynomial Approximation in the Log-Filter-Bank Energy Domain. 336-341 - Vishweshwara Rao, Pradeep Gaddipati, Preeti Rao:
Signal-Driven Window-Length Adaptation for Sinusoid Detection in Polyphonic Music. 342-348
Volume 20, Number 2, February 2012
- Xavier Anguera Miró, Simon Bozonnet, Nicholas W. D. Evans, Corinne Fredouille, Gerald Friedland, Oriol Vinyals:
Speaker Diarization: A Review of Recent Research. 356-370 - Gerald Friedland, Adam Janin, David Imseng, Xavier Anguera Miró, Luke R. Gottlieb, Marijn Huijbregts, Mary Tai Knox, Oriol Vinyals:
The ICSI RT-09 Speaker Diarization System. 371-381 - Nicholas W. D. Evans, Simon Bozonnet, Dong Wang, Corinne Fredouille, Raphaël Troncy:
A Comparative Study of Bottom-Up and Top-Down Approaches to Speaker Diarization. 382-392 - Marijn Huijbregts, David A. van Leeuwen, Chuck Wooters:
Speaker Diarization Error Analysis Using Oracle Components. 393-403 - Marijn Huijbregts, David A. van Leeuwen:
Large-Scale Speaker Diarization for Long Recordings and Small Collections. 404-413 - Oshry Ben-Harush, Itshak Lapidot, Hugo Guterman:
Initialization of Iterative-Based Speaker Diarization Systems for Telephone Conversations. 414-425 - José Manuel Pardo, Roberto Barra-Chicote, Rubén San Segundo, Ricardo de Córdoba, Beatriz Martínez-González:
Speaker Diarization Features: The UPM Contribution to the RT09 Evaluation. 426-435 - Martin Zelenák, Carlos Segura, Jordi Luque, Javier Hernando:
Simultaneous Speech Detection With Spatial Features for Speaker Diarization. 436-446 - Katsuhiko Ishiguro, Takeshi Yamada, Shoko Araki, Tomohiro Nakatani, Hiroshi Sawada:
Probabilistic Speaker Diarization With Bag-of-Words Representations of Speaker Angle Information. 447-460 - Tin Lay Nwe, Hanwu Sun, Bin Ma, Haizhou Li:
Speaker Clustering and Cluster Purification Methods for RT07 and RT09 Evaluation Meeting Data. 461-473 - Fernando Batista, Helena Moniz, Isabel Trancoso, Nuno J. Mamede:
Bilingual Experiments on Automatic Recovery of Capitalization and Punctuation of Automatic Speech Transcripts. 474-485 - Thomas Hain, Lukás Burget, John Dines, Philip N. Garner, Frantisek Grézl, Asmaa El Hannani, Marijn Huijbregts, Martin Karafiát, Mike Lincoln, Vincent Wan:
Transcribing Meetings With the AMIDA Systems. 486-498 - Takaaki Hori, Shoko Araki, Takuya Yoshioka, Masakiyo Fujimoto, Shinji Watanabe, Takanobu Oba, Atsunori Ogawa, Kazuhiro Otsuka, Dan Mikami, Keisuke Kinoshita, Tomohiro Nakatani, Atsushi Nakamura, Junji Yamato:
Low-Latency Real-Time Meeting Recognition and Understanding Using Distant Microphones and Omni-Directional Camera. 499-513 - Joan Serrà, Holger Kantz, Xavier Serra, Ralph G. Andrzejak:
Predictability of Music Descriptor Time Series and its Application to Cover Song Detection. 514-525 - Marco Dinarelli, Alessandro Moschitti, Giuseppe Riccardi:
Discriminative Reranking for Spoken Language Understanding. 526-539 - Ebru Arisoy, Murat Saraclar, Brian Roark, Izhak Shafran:
Discriminative Language Modeling With Linguistic and Statistically Derived Features. 540-550 - Björn Hoffmeister, Georg Heigold, David Rybach, Ralf Schlüter, Hermann Ney:
WFST Enabled Solutions to ASR Problems: Beyond HMM Decoding. 551-564 - Alberto Sanchís, Alfons Juan, Enrique Vidal:
A Word-Based Naïve Bayes Classifier for Confidence Estimation in Speech Recognition. 565-574 - Wen Zhang, Mengqiu Zhang, Rodney A. Kennedy, Thushara D. Abhayapala:
On High-Resolution Head-Related Transfer Function Measurements: An Efficient Sampling Scheme. 575-584 - Sungrack Yun, Chang D. Yoo:
Loss-Scaled Large-Margin Gaussian Mixture Models for Speech Emotion Classification. 585-598 - Nima Yousefian, Philipos C. Loizou:
A Dual-Microphone Speech Enhancement Algorithm Based on the Coherence Function. 599-609 - Laura E. Boucheron, Phillip L. De Leon, Steven Sandoval:
Low Bit-Rate Speech Coding Through Quantization of Mel-Frequency Cepstral Coefficients. 610-619 - Nam Soo Kim, Tae Gyoon Kang, Shin Jae Kang, Chang Woo Han, Doo Hwa Hong:
Speech Feature Mapping Based on Switching Linear Dynamic System. 620-631 - Yi-Cheng Pan, Hung-yi Lee, Lin-Shan Lee:
Interactive Spoken Document Retrieval With Suggested Key Terms Ranked by a Markov Decision Process. 632-645 - Jake Gunther:
Learning Echo Paths During Continuous Double-Talk Using Semi-Blind Source Separation. 646-660 - Meng Yu, Wenye Ma, Jack Xin, Stanley J. Osher:
Multi-Channel l1 Regularized Convex Speech Enhancement Model and Fast Computation by the Split Bregman Method. 661-675 - Hüseyin Hacihabiboglu, Zoran Cvetkovic:
Multichannel Dereverberation Theorems and Robustness Issues. 676-689 - Laura Romoli, Stefania Cecchi, Paolo Peretti, Francesco Piazza:
A Mixed Decorrelation Approach for Stereo Acoustic Echo Cancellation Based on the Estimation of the Fundamental Frequency. 690-698 - Jacob Benesty, Mehrez Souden, Yiteng Huang:
A Perspective on Differential Microphone Arrays in the Context of Noise Reduction. 699-704 - Frédéric Mustière, Martin Bouchard, Miodrag Bolic:
All-Pole Modeling of Discrete Spectral Powers: A Unified Approach. 705-708 - Takayuki Arai, Nao Hodoshima, Keiichi Yasu:
Errata to "Using Steady-State Suppression to Improve Speech Intelligibility in Reverberant Environments for Elderly Listeners". 709
Volume 20, Number 3, March 2012
- Kazuyoshi Yoshii, Masataka Goto:
A Nonparametric Bayesian Multipitch Analyzer Based on Infinite Latent Harmonic Allocation. 717-730 - Siddika Parlak, Murat Saraclar:
Performance Analysis and Improvement of Turkish Broadcast News Retrieval. 731-741 - Haohai Sun, Shefeng Yan, U. Peter Svensson:
Optimal Higher Order Ambisonics Encoding With Predefined Constraints. 742-754 - Mitchell McLaren, David A. van Leeuwen:
Source-Normalized LDA for Robust Speaker Recognition Using i-Vectors From Multiple Speech Sources. 755-766 - Elias K. Kokkinis, Joshua D. Reiss, John Mourjopoulos:
A Wiener Filter Approach to Microphone Leakage Reduction in Close-Microphone Applications. 767-779 - Qiang Fu, Yong Zhao, Biing-Hwang Juang:
Automatic Speech Recognition Based on Non-Uniform Error Criteria. 780-793 - Heiga Zen, Mark J. F. Gales, Yoshihiko Nankaku, Keiichi Tokuda:
Product of Experts for Statistical Parametric Speech Synthesis. 794-805 - Elina Helander, Hanna Silén, Tuomas Virtanen, Moncef Gabbouj:
Voice Conversion Using Dynamic Kernel Partial Least Squares Regression. 806-817 - Ning Ma, Jon Barker, Heidi Christensen, Phil D. Green:
Combining Speech Fragment Decoding and Adaptive Noise Floor Modeling. 818-827 - Liang-Che Sun, Lin-Shan Lee:
Modulation Spectrum Equalization for Improved Robust Speech Recognition. 828-843 - Matija Marolt:
Automatic Transcription of Bell Chiming Recordings. 844-853 - Emanuël Anco Peter Habets, Jacob Benesty, Patrick A. Naylor:
A Speech Distortion and Interference Rejection Constraint Beamformer. 854-867 - Yousheng Chen, Qin Gong:
A Normalized Beamforming Algorithm for Broadband Speech Using a Continuous Interleaved Sampling Strategy. 868-874 - Sabato Marco Siniscalchi, Dau-Cheng Lyu, Torbjørn Svendsen, Chin-Hui Lee:
Experiments on Cross-Language Attribute Detection and Phone Recognition With Minimal Target-Specific Training Data. 875-887 - Xiang Lin, Andy W. H. Khong, Patrick A. Naylor:
A Forced Spectral Diversity Algorithm for Speech Dereverberation in the Presence of Near-Common Zeros. 888-899 - Yu-Hsiang Bosco Chiu, Bhiksha Raj, Richard M. Stern:
Learning-Based Auditory Encoding for Robust Speech Recognition. 900-914 - Amir Adler, Valentin Emiya, Maria G. Jafari, Michael Elad, Rémi Gribonval, Mark D. Plumbley:
Audio Inpainting. 922-932 - Ana M. Barbancho, Anssi Klapuri, Lorenzo J. Tardón, Isabel Barbancho:
Automatic Transcription of Guitar Chords and Fingering From Audio. 915-921 - Wei Chu, Abeer Alwan:
SAFE: A Statistical Approach to F0 Estimation Under Clean and Noisy Conditions. 933-944 - Ashish Panda, Thambipillai Srikanthan:
Psychoacoustic Model Compensation for Robust Speaker Verification in Environmental Noise. 945-953 - Emanuël A. P. Habets, Jacob Benesty:
A Perspective on Frequency-Domain Beamformers in Room Acoustics. 947-960 - Thomas Drugman, Thierry Dutoit:
The Deterministic Plus Stochastic Model of the Residual Signal and Its Applications. 968-981 - Shing-Chow Chan, Y. Chu:
Performance Analysis and Design of FxLMS Algorithm in Broadband ANC System With Online Secondary-Path Modeling. 982-993 - Thomas Drugman, Mark R. P. Thomas, Jón Guðnason, Patrick A. Naylor, Thierry Dutoit:
Detection of Glottal Closure Instants From Speech Signals: A Quantitative Review. 994-1006 - Alfonso Pérez Carrillo, Jordi Bonada, Esteban Maestre, Enric Guaus, Merlijn Blaauw:
Performance Control Driven Violin Timbre Model Based on Neural Networks. 1007-1021 - Ravi K. Chivukula, Yuriy A. Reznik, Venkat Devarajan, Mythreya Jayendra-Lakshman:
Fast Algorithms for Low-Delay SBR Filterbanks in MPEG-4 AAC-ELD. 1022-1031 - Xianyu Zhao, Yuan Dong:
Variational Bayesian Joint Factor Analysis Models for Speaker Verification. 1032-1042 - Ashutosh Pandey, V. John Mathews:
Adaptive Gain Processing With Offending Frequency Suppression for Digital Hearing Aids. 1043-1055 - Tamar Shoham, David Malah, Slava Shechtman:
Quality Preserving Compression of a Concatenative Text-To-Speech Acoustic Database. 1056-1068 - Vladimir Despotovic, Norbert Goertz, Zoran Peric:
Nonlinear Long-Term Prediction of Speech Based on Truncated Volterra Series. 1069-1073 - Siow Yong Low, Svetha Venkatesh, Sven Nordholm:
A Spectral Slit Approach to Doubletalk Detection. 1074-1080
Volume 20, Number 4, May 2012
- Seiichi Nakagawa, Longbiao Wang, Shinji Ohtsuka:
Speaker Identification and Verification by Combining MFCC and Phase Information. 1085-1095 - Riccardo Miotto, Gert R. G. Lanckriet:
A Generative Context Model for Semantic Music Annotation and Retrieval. 1096-1108 - Chu-Cheng Lin, Richard Tzong-Han Tsai:
A Generative Data Augmentation Model for Enhancing Chinese Dialect Pronunciation Prediction. 1109-1117 - Alexey Ozerov, Emmanuel Vincent, Frédéric Bimbot:
A General Flexible Framework for the Handling of Prior Information in Audio Source Separation. 1118-1133 - Jia-Min Ren, Jyh-Shing Roger Jang:
Discovering Time-Constrained Sequential Patterns for Music Genre Classification. 1134-1144 - Virginia Estellers, Mihai Gurban, Jean-Philippe Thiran:
On Dynamic Stream Weighting for Audio-Visual Speech Recognition. 1145-1157 - Navin Chatlani, John J. Soraghan:
EMD-Based Filtering (EMDF) of Low-Frequency Noise for Speech Enhancement. 1158-1166 - Haiyan Shu, Haibin Huang, Susanto Rahardja:
Analysis of Bit-Plane Probability for Generalized Gaussian Distribution and its Application in Audio Coding. 1167-1176 - Tobias Rosenkranz, Henning Puder:
Improving Robustness of Codebook-Based Noise Estimation Approaches With Delta Codebooks. 1177-1188 - Ines Hafizovic, Carl-Inge Colombo Nilsen, Sverre Holm:
Transformation Between Uniform Linear and Spherical Microphone Arrays With Symmetric Responses. 1189-1195 - Xiaohong Yang, Yufang Yang:
Prosodic Realization of Rhetorical Structure in Chinese Discourse. 1196-1206 - David T. Yeh:
Automated Physical Modeling of Nonlinear Audio Circuits for Real-Time Audio Effects - Part II: BJT and Vacuum Tube Examples. 1207-1216 - Manish Narwaria, Weisi Lin, Ian Vince McLoughlin, Sabu Emmanuel, Liang-Tien Chia:
Nonintrusive Quality Assessment of Noise Suppressed Speech With Mel-Filtered Energies and Support Vector Regression. 1217-1232 - Wei-Ho Tsai, Hsin-Chieh Lee:
Automatic Evaluation of Karaoke Singing Based on Pitch, Volume, and Rhythm Features. 1233-1243 - Takanobu Oba, Takaaki Hori, Atsushi Nakamura, Akinori Ito:
Round-Robin Duel Discriminative Language Models. 1244-1255 - Yiteng Arden Huang, Jacob Benesty:
A Multi-Frame Approach to the Frequency-Domain Single-Channel Noise Reduction Problem. 1256-1269 - Miroslav Zivanovic, Johan Schoukens:
Single and Piecewise Polynomials for Modeling of Pitched Sounds. 1270-1281 - Yaakov Bucris, Israel Cohen, Miriam A. Doron:
Bayesian Focusing for Coherent Wideband Beamforming. 1282-1296 - Hélène Papadopoulos, Geoffroy Peeters:
Local Key Estimation From an Audio Signal Relying on Harmonic and Metrical Structures. 1297-1312 - Elizabeth Godoy, Olivier Rosec, Thierry Chonavel:
Voice Conversion Using Dynamic Frequency Warping With Amplitude Scaling, for Parallel or Nonparallel Corpora. 1313-1323 - Ruofei Chen, Cheung-Fat Chan, Hing-Cheung So:
Model-Based Speech Enhancement With Improved Spectral Envelope Estimation via Dynamics Tracking. 1324-1336 - Qun Feng Tan, Shrikanth S. Narayanan:
Novel Variations of Group Sparse Regularization Techniques With Applications to Noise Robust Automatic Speech Recognition. 1337-1346 - Rubén Solera-Ureña, Ana I. García-Moral, Carmen Peláez-Moreno, Manel Martínez-Ramón, Fernando Díaz-de-María:
Real-Time Robust Automatic Speech Recognition Using Compact Support Vector Machines. 1347-1361 - Amin Fazel, Shantanu Chakrabartty:
Sparse Auditory Reproducing Kernel (SPARK) Features for Noise-Robust Speech Recognition. 1362-1371 - Jorge I. Marin-Hurtado, Devangi N. Parikh, David V. Anderson:
Perceptually Inspired Noise-Reduction Method for Binaural Hearing Aids. 1372-1382 - Timo Gerkmann, Richard C. Hendriks:
Unbiased MMSE-Based Noise Power Estimation With Low Complexity and Low Tracking Delay. 1383-1393 - Haiquan Zhao, Xiangping Zeng, Xiaoqiang Zhang, Zhengyou He, Tianrui Li, Weidong Jin:
Adaptive Extended Pipelined Second-Order Volterra Filter for Nonlinear Active Noise Controller. 1394-1399 - Damián Marelli, Mitsuko Aramaki, Richard Kronland-Martinet, Charles Verron:
An Efficient Time-Frequency Method for Synthesizing Noisy Sounds With Short Transients and Narrow Spectral Components. 1400-1408 - Maurice F. Fallon, Simon J. Godsill:
Acoustic Source Localization and Tracking of a Time-Varying Number of Speakers. 1409-1415
Volume 20, Number 5, July 2012
- Vesa Välimäki, Julian D. Parker, Lauri Savioja, Julius O. Smith III, Jonathan S. Abel:
Fifty Years of Artificial Reverberation. 1421-1448 - Flavio P. Ribeiro, Dinei A. F. Florêncio, Demba E. Ba, Cha Zhang:
Geometrically Constrained Room Modeling With Compact Microphone Arrays. 1449-1460 - Wenliang Chen, Jun'ichi Kazama, Min Zhang, Yoshimasa Tsuruoka, Yujie Zhang, Yiou Wang, Kentaro Torisawa, Haizhou Li:
Bitext Dependency Parsing With Auto-Generated Bilingual Treebank. 1461-1472 - K. Lakhdhar, Roch Lefebvre:
Context-Based Adaptive Arithmetic Encoding of EAVQ Indices. 1473-1481 - Chao-Ling Hsu, DeLiang Wang, Jyh-Shing Roger Jang, Ke Hu:
A Tandem Algorithm for Singing Pitch Extraction and Voice Separation From Music Accompaniment. 1482-1491 - Zhen-Hua Ling, Li-Rong Dai:
Minimum Kullback-Leibler Divergence Parameter Generation for HMM-Based Speech Synthesis. 1492-1502 - John Woodruff, DeLiang Wang:
Binaural Localization of Multiple Sources in Reverberant and Noisy Environments. 1503-1512 - Welly Naptali, Masatoshi Tsuchiya, Seiichi Nakagawa:
Topic-Dependent-Class-Based $n$-Gram Language Model. 1513-1525 - Jesper Rindom Jensen, Jacob Benesty, Mads Græsbøll Christensen, Søren Holdt Jensen:
Non-Causal Time-Domain Filters for Single-Channel Noise Reduction. 1526-1541 - Kamil Adiloglu, Robert Anniés, Elio Wahlen, Hendrik Purwins, Klaus Obermayer:
A Graphical Representation and Dissimilarity Measure for Basic Everyday Sound Events. 1542-1552 - Cees H. Taal, Richard C. Hendriks, Richard Heusdens:
A Low-Complexity Spectro-Temporal Distortion Measure for Audio Processing Applications. 1553-1564 - Huawei Chen, Wee Ser, Jianjiang Zhou:
Robust Nearfield Wideband Beamformer Design Using Worst Case Mean Performance Optimization With Passband Response Variance Constraint. 1565-1572 - D. Rama Sanand, Srinivasan Umesh:
VTLN Using Analytically Determined Linear-Transformation on Conventional MFCC. 1573-1584 - Sandro Cumani, Pietro Laface:
Analysis of Large-Scale SVM Training Algorithms for Language and Speaker Recognition. 1585-1596 - Xiaoyan Cai, Wenjie Li:
Mutually Reinforced Manifold-Ranking Based Relevance Propagation Model for Query-Focused Multi-Document Summarization. 1597-1607 - Xiaojia Zhao, Yang Shao, DeLiang Wang:
CASA-Based Robust Speaker Identification. 1608-1616 - Saeed Mosayyebpour, Hamid Sheikhzadeh, T. Aaron Gulliver, Morteza Esmaeili:
Single-Microphone LP Residual Skewness-Based Inverse Filtering of the Room Impulse Response. 1617-1632 - Upendra V. Chaudhari, Michael Picheny:
Matching Criteria for Vocabulary-Independent Search. 1633-1643 - Daniele Giacobello, Mads Græsbøll Christensen, Manohar N. Murthi, Søren Holdt Jensen, Marc Moonen:
Sparse Linear Prediction and Its Applications to Speech Processing. 1644-1657 - Stefan Bilbao:
Optimized FDTD Schemes for 3-D Acoustic Wave Propagation. 1658-1663
Volume 20, Number 6, August 2012
- Sin-Horng Chen, Jyh-Her Yang, Chen-Yu Chiang, Ming-Chieh Liu, Yih-Ru Wang:
A New Prosody-Assisted Mandarin ASR System. 1669-1684 - Romain Serizel, Marc Moonen, Jan Wouters, Søren Holdt Jensen:
A Zone-of-Quiet Based Approach to Integrated Active Noise Control and Noise Reduction for Speech Enhancement in Hearing Aids. 1685-1697 - Christian D. Sigg, Tomas Dikk, Joachim M. Buhmann:
Speech Enhancement Using Generative Dictionary Learning. 1698-1712 - Heiga Zen, Norbert Braunschweiler, Sabine Buchholz, Mark J. F. Gales, Kate M. Knill, Sacha Krstulovic, Javier Latorre:
Statistical Parametric Speech Synthesis Based on Speaker and Language Factorization. 1713-1724 - Christian Schüldt, Fredric Lindström, Ingvar Claesson:
A Delay-Based Double-Talk Detector. 1725-1733 - Alastair J. Manders, David M. Simpson, Steven L. Bell:
Objective Prediction of the Sound Quality of Music Processed by an Adaptive Feedback Canceller. 1734-1745 - Shoichi Koyama, Ken'ichi Furuya, Yusuke Hiwasaki, Yoichi Haneda:
Reproducing Virtual Sound Sources in Front of a Loudspeaker Array Using Inverse Wave Propagator. 1746-1758 - Justin Salamon, Emilia Gómez:
Melody Extraction From Polyphonic Music Signals Using Pitch Contour Characteristics. 1759-1770 - Yizhao Ni, Matt McVicar, Raúl Santos-Rodriguez, Tijl De Bie:
An End-to-End Machine Learning System for Harmonic Analysis of Music. 1771-1783 - Daisuke Saito, Shinji Watanabe, Atsushi Nakamura, Nobuaki Minematsu:
Statistical Voice Conversion Based on Noisy Channel Model. 1784-1794 - Daniel Angus, Andrew E. Smith, Janet Wiles:
Human Communication as Coupled Time Series: Quantifying Multi-Participant Recurrence. 1795-1807 - Claire Masterson, Gavin Kearney, Marcin Gorzel, Francis M. Boland:
HRIR Order Reduction Using Approximate Factorization. 1808-1817 - Jan Vanek, Jan Trmal, Josef V. Psutka, Josef Psutka:
Optimized Acoustic Likelihoods Computation for NVIDIA and ATI/AMD Graphics Processors. 1818-1828 - Jan Ole Jungmann, Radoslaw Mazur, Markus Kallinger, Tiemin Mei, Alfred Mertins:
Combined Acoustic MIMO Channel Crosstalk Cancellation and Room Impulse Response Reshaping. 1829-1842 - Tianyu T. Wang, Thomas F. Quatieri:
Two-Dimensional Speech-Signal Modeling. 1843-1856 - Isabel Barbancho, Lorenzo J. Tardón, Simone Sammartino, Ana M. Barbancho:
Inharmonicity-Based Method for the Automatic Generation of Guitar Tablature. 1857-1868 - Amit Das, John H. L. Hansen:
Constrained Iterative Speech Enhancement Using Phonetic Classes. 1869-1883 - Abbas Keshavarz, Saeed Mosayyebpour, Mehrzad Biguesh, T. Aaron Gulliver, Morteza Esmaeili:
Speech-Model Based Accurate Blind Reverberation Time Estimation Using an LPC Filter. 1884-1893 - Anil Kumar Vuppala, Jainath Yadav, Saswat Chakrabarti, K. Sreenivasa Rao:
Vowel Onset Point Detection for Low Bit Rate Coded Speech. 1894-1903
Volume 20, Number 7, 2012
- Theodoros Giannakopoulos, Sergios Petridis:
Fisher Linear Semi-Discriminant Analysis for Speaker Diarization. 1913-1922 - Xiaodong Cui, Jing Huang, Jen-Tzung Chien:
Multi-View and Multi-Objective Semi-Supervised Learning for HMM-Based Automatic Speech Recognition. 1923-1935 - Jacob L. Newman, Stephen J. Cox:
Language Identification Using Visual Features. 1936-1947 - Jesper Rindom Jensen, Jacob Benesty, Mads Græsbøll Christensen, Søren Holdt Jensen:
Enhancement of Single-Channel Periodic Signals in the Time-Domain. 1948-1963 - Marco Compagnoni, Paolo Bestagini, Fabio Antonacci, Augusto Sarti, Stefano Tubaro:
Localization of Acoustic Sources Through the Fitting of Propagation Cones Using Multiple Independent Arrays. 1964-1975 - Jung-Woo Choi, Yang-Hann Kim:
Integral Approach for Reproduction of Virtual Sound Source Surrounded by Loudspeaker Array. 1976-1989 - Tomi Kinnunen, Rahim Saeidi, Filip Sedlak, Kong-Aik Lee, Johan Sandberg, Maria Hansson-Sandsten, Haizhou Li:
Low-Variance Multitaper MFCC Features: A Case Study in Robust Speaker Verification. 1990-2001 - Wen-Lin Zhang, Weiqiang Zhang, Bi-Cheng Li, Dan Qu, Michael T. Johnson:
Bayesian Speaker Adaptation Based on a New Hierarchical Probabilistic Model. 2002-2015 - Tobias May, Steven van de Par, Armin Kohlrausch:
A Binaural Scene Analyzer for Joint Localization and Recognition of Speakers in the Presence of Interfering Noise Sources and Reverberation. 2016-2030 - Armando Muscariello, Guillaume Gravier, Frédéric Bimbot:
Unsupervised Motif Acquisition in Speech via Seeded Discovery and Template Matching Combination. 2031-2044 - César González Ferreras, David Escudero Mancebo, Carlos Vivaracho-Pascual, Valentín Cardeñoso-Payo:
Improving Automatic Classification of Prosodic Events by Pairwise Coupling. 2045-2058 - Maximo Cobos, José J. López:
Maximum a Posteriori Binary Mask Estimation for Underdetermined Source Separation Using Smoothed Posteriors. 2059-2064 - Sarmad Malik, Gerald Enzner:
State-Space Frequency-Domain Adaptive Filtering for Nonlinear Acoustic Echo Cancellation. 2065-2079 - Ryoichi Miyazaki, Hiroshi Saruwatari, Takayuki Inoue, Yu Takahashi, Kiyohiro Shikano, Kazunobu Kondo:
Musical-Noise-Free Speech Enhancement Based on Optimized Iterative Spectral Subtraction. 2080-2094 - Hung-yi Lee, Chia-Ping Chen, Lin-Shan Lee:
Integrating Recognition and Retrieval With Relevance Feedback for Spoken Term Detection. 2095-2110 - Mohamed I. Alkanhal, Mohamed Al-Badrashiny, Mansour M. Alghamdi, Abdulaziz O. Al-Qabbany:
Automatic Stochastic Arabic Spelling Correction With Emphasis on Space Insertions and Deletions. 2111-2122 - Stephen J. Elliott, Jordan Cheer, Jung-Woo Choi, Youngtae Kim:
Robustness and Regularization of Personal Audio Systems. 2123-2133 - Lakshmi Babu Saheer, John Dines, Philip N. Garner:
Vocal Tract Length Normalization for Statistical Parametric Speech Synthesis. 2134-2148 - Yongqiang Wang, Mark J. F. Gales:
Speaker and Noise Factorization for Robust Speech Recognition. 2149-2158
Volume 20, Number 8, October 2012
- Leonardo O. Nunes, Flávio R. Avila, Alan Freihof Tygel, Luiz W. P. Biscainho, Bowon Lee, Amir Said, Ronald W. Schafer:
A Parametric Objective Quality Assessment Tool for Speech Signals Degraded by Acoustic Echo. 2181-2190 - Yong Zhao, Biing-Hwang Juang:
Nonlinear Compensation Using the Gauss-Newton Method for Noise-Robust Speech Recognition. 2191-2206 - Brian McFee, Luke Barrington, Gert R. G. Lanckriet:
Learning Content Similarity for Music Recommendation. 2207-2218 - Hannu Pulakka, Ulpu Remes, Santeri Yrttiaho, Kalle J. Palomäki, Mikko Kurimo, Paavo Alku:
Bandwidth Extension of Telephone Speech to Low Frequencies Using Sinusoidal Synthesis and a Gaussian Mixture Model. 2219-2231 - Iynkaran Natgunanathan, Yong Xiang, Yue Rong, Wanlei Zhou, Song Guo:
Robust Patchwork-Based Embedding and Decoding Scheme for Digital Audio Watermarking. 2232-2239 - Yotaro Kubo, Shinji Watanabe, Takaaki Hori, Atsushi Nakamura:
Structural Classification Methods Based on Weighted Finite-State Transducers for Automatic Speech Recognition. 2240-2251 - Xiaodong Cui, Jian Xue, Xin Chen, Peder A. Olsen, Pierre L. Dognin, Upendra V. Chaudhari, John R. Hershey, Bowen Zhou:
Hidden Markov Acoustic Modeling With Bootstrap and Restructuring for Low-Resourced Languages. 2252-2264 - Amit Das, John H. L. Hansen:
Phoneme Selective Speech Enhancement Using Parametric Estimators and the Mixture Maximum Model: A Unifying Approach. 2265-2279 - Phillip L. De Leon, Michael Pucher, Junichi Yamagishi, Inma Hernáez, Ibon Saratxaga:
Evaluation of Speaker Verification Security and Detection of HMM-Based Synthetic Speech. 2280-2290 - Wei-Ho Tsai, Hsin-Chieh Lee:
Singer Identification Based on Spoken Data in Voice Characterization. 2291-2300 - Daniel Felps, Christian Geng, Ricardo Gutierrez-Osuna:
Foreign Accent Conversion Through Concatenative Synthesis in the Articulatory Domain. 2301-2312 - Gustavo Reis, Francisco Fernández de Vega, Aníbal J. S. Ferreira:
Automatic Transcription of Polyphonic Piano Music Using Genetic Algorithms, Adaptive Spectral Envelope Modeling, and Dynamic Noise Level Estimation. 2313-2328 - Soroosh Mariooryad, Carlos Busso:
Generating Human-Like Behaviors Using Joint, Speech-Driven Models for Conversational Agents. 2329-2340 - Hasim Sak, Murat Saraclar, Tunga Gungor:
Morpholexical and Discriminative Language Models for Turkish Automatic Speech Recognition. 2341-2351 - Yongwon Jeong:
Adaptation of Hidden Markov Models Using Model-as-Matrix Representation. 2352-2364 - Seyedmahdad Mirsamadi, Shabnam Ghaffarzadegan, Hamid Sheikhzadeh, Seyed Mohammad Ahadi, Amir Hossein Rezaie:
Efficient Frequency Domain Implementation of Noncausal Multichannel Blind Deconvolution for Convolutive Mixtures of Speech. 2365-2377 - Barry-John Theobald, Iain A. Matthews:
Relating Objective and Subjective Performance Measures for AAM-Based Visual Speech Synthesis. 2378-2387 - Terence Betlehem, Christopher S. Withers:
Sound Field Reproduction With Energy Constraint on Loudspeaker Weights. 2388-2392
Volume 20, Number 9, November 2012
- Nikolaos Mitianoudis:
A Generalized Directional Laplacian Distribution: Estimation, Mixture Models and Audio Source Separation. 2397-2408 - Janne Pylkkönen, Mikko Kurimo:
Analysis of Extended Baum-Welch and Constrained Optimization for Discriminative Training of HMMs. 2409-2419 - Alex Southern, Damian T. Murphy, Lauri Savioja:
Spatial Encoding of Finite Difference Time Domain Acoustic Models for Auralization. 2420-2432 - Marco Crocco, Andrea Trucco:
Stochastic and Analytic Optimization of Sparse Aperiodic Arrays and Broadband Beamformers With Robust Superdirective Patterns. 2433-2447 - Masashi Okada, Takao Onoye, Wataru Kobayashi:
A Ray Tracing Simulation of Sound Diffraction Based on the Analytic Secondary Source Model. 2448-2460 - Ryouichi Nishimura:
Audio Watermarking Using Spatial Masking and Ambisonics. 2461-2469 - Flávio R. Avila, Luiz W. P. Biscainho:
Bayesian Restoration of Audio Signals Degraded by Impulsive Noise Modeled as Individual Pulses. 2470-2481 - Woojay Jeon, Changxue Ma, Dusan Macho:
Statistical Utterance Comparison for Speaker Clustering Using Factor Analysis. 2482-2491 - Justin Jian Zhang, Pascale Fung:
Automatic Parliamentary Meeting Minute Generation Using Rhetorical Structure Modeling. 2492-2504 - Tomoki Toda, Mikihiro Nakagiri, Kiyohiro Shikano:
Statistical Voice Conversion Techniques for Body-Conducted Unvoiced Speech Enhancement. 2505-2517 - Arun Narayanan, DeLiang Wang:
A CASA-Based System for Long-Term SNR Estimation. 2518-2527 - Ronen Talmon, Israel Cohen, Sharon Gannot, Ronald R. Coifman:
Supervised Graph-Based Processing for Sequential Transient Interference Suppression. 2528-2538 - Andre Holzapfel, Matthew E. P. Davies, José Ricardo Zapata, João Lobato Oliveira, Fabien Gouyon:
Selective Sampling for Beat Tracking Evaluation. 2539-2548 - Meng Guo, Søren Holdt Jensen, Jesper Jensen:
Novel Acoustic Feedback Cancellation Approaches in Hearing Aid Applications Using Probe Noise and Probe Noise Enhancement. 2549-2563 - Jens Ahrens, Sascha Spors:
A Modal Analysis of Spatial Discretization of Spherical Loudspeaker Distributions Used for Sound Field Synthesis. 2564-2574 - Vladimir Tourbabin, Morag Agmon, Boaz Rafaely, Joseph Tabrikian:
Optimal Real-Weighted Beamforming With Application to Linear and Spherical Arrays. 2575-2585 - Pejman Mowlaee, Rahim Saeidi, Mads Græsbøll Christensen, Zheng-Hua Tan, Tomi Kinnunen, Pasi Fränti, Søren Holdt Jensen:
A Joint Approach for Single-Channel Speaker Identification and Speech Separation. 2586-2601 - Berlin Chen, Kuan-Yu Chen, Pei-Ning Chen, Yi-Wen Chen:
Spoken Document Retrieval With Unsupervised Query Modeling Techniques. 2602-2612 - Kruthiventi S. S. Srinivas, Kishore Prahallad:
An FIR Implementation of Zero Frequency Filtering of Speech Signals. 2613-2617
Volume 20, Number 10, December 2012
- Mari Ostendorf:
A Message from the Vice President of Publications on New Developments in Signal Processing Society Publications. 2625 - Sundar Harshavardhan, Chandra Sekhar Seelamantula, Thippur V. Sreenivas:
A Mixture Model Approach for Formant Tracking and the Robustness of Student's-t Distribution. 2626-2636 - Steven Hargreaves, Anssi Klapuri, Mark B. Sandler:
Structural Segmentation of Multitrack Audio. 2637-2647 - Matthew Gibson, Thomas Hain:
Correctness-Adjusted Unsupervised Discriminative Acoustic Model Adaptation. 2648-2656 - Bruno Defraene, Toon van Waterschoot, Hans Joachim Ferreau, Moritz Diehl, Marc Moonen:
Real-Time Perception-Based Clipping of Audio Signals Using Convex Optimization. 2657-2671 - Gopal Ananthakrishnan, Olov Engwall, Daniel Neiberg:
Exploring the Predictability of Non-Unique Acoustic-to-Articulatory Mappings. 2672-2682 - Fabio Antonacci, Jason Filos, Mark R. P. Thomas, Emanuël Anco Peter Habets, Augusto Sarti, Patrick A. Naylor, Stefano Tubaro:
Inference of Room Geometry From Acoustic Impulse Responses. 2683-2695 - João Lobato Oliveira, Matthew E. P. Davies, Fabien Gouyon, Luís Paulo Reis:
Beat Tracking for Multiple Applications: A Multi-Agent System Architecture With State Recovery. 2696-2706 - Takuya Yoshioka, Tomohiro Nakatani:
Generalization of Multi-Channel Linear Prediction Methods for Blind MIMO Impulse Response Shortening. 2707-2720