EUROSPEECH 1999: Budapest, Hungary
- Sixth European Conference on Speech Communication and Technology, EUROSPEECH 1999, Budapest, Hungary, September 5-9, 1999. ISCA 1999
Keynotes
- Frederick Jelinek, Ciprian Chelba:
Putting language into language modeling. - Mária Gósy:
The controversial connection between speech production and perception: theories vs. facts. - Mark T. Maybury:
Multimedia interaction for the new millennium. - Björn Lindblom:
How speech works - questions and preliminary answers.
Speech Recognition, Adaptation 1
- Wu Chou:
Maximum a posterior linear regression with elliptically symmetric matrix variate priors. 1-4 - Silke Goronzy, Ralf Kompe:
A MAP-like weighting scheme for MLLR speaker adaptation. 5-8 - Hans-Günter Hirsch:
HMM adaptation for telephone applications. 9-12 - Jing Huang, Mukund Padmanabhan:
A study of adaptation techniques on a voicemail transcription task. 13-16 - Beth Logan:
Maximum likelihood sequential adaptation. 17-20
Prosody - Prosodic Features in Dialogues
- Caren Brinckmann, Ralf Benzmüller:
The relationship between utterance type and F0 contour in German. 21-24 - Evelina Grigorova, Vladimir Filipov, Bistra Andreeva:
A contrastive investigation of discourse intonational characteristic features of Sofia Bulgarian and Hamburg German in MAP task dialogues. 25-28 - Merle Horne, Petra Hansson, Gösta Bruce, Johan Frid:
Prosodic correlates of information structure in Swedish human-human dialogues. 29-32 - Shigeyoshi Kitazawa, S. Kobayashi:
Paralinguistic features as suprasegmental acoustics observed in natural Japanese dialogue. 33-36 - Masafumi Tamoto, Masahito Kawamori, Takeshi Kawabata:
Integrating prosodic features in dialogue understanding. 37-40
Speech Recognition - Confidence Measures
- Stephen Cox, Srinandan Dasmahapatra:
A high-level approach to confidence estimation in speech recognition. 41-44 - Bin Jia, Xiaoyan Zhu, Yupin Luo, Dongcheng Hu:
Utterance verification using modified segmental probability model. 45-48 - Dietrich Klakow, Georg Rose, Xavier L. Aubert:
OOV-detection in large vocabulary system using automatically defined word-fragments as fillers. 49-52 - Qiguang Lin, David M. Lubensky, Salim Roukos:
Use of recursive mumble models for confidence measuring. 53-56 - Mazin G. Rahim:
Utterance verification for the numeric language in a natural spoken dialogue. 57-60
Speech Recognition - Acoustic Processing
- Rathinavelu Chengalvarayan:
Robust energy normalization using speech/nonspeech discriminator for German connected digit recognition. 61-64 - Johan de Veth, Bert Cranen, Febe de Wet, Lou Boves:
Acoustic pre-processing for optimal effectivity of missing feature theory. 65-68 - Panikos Heracleous, Takeshi Yamada, Satoshi Nakamura, Kiyohiro Shikano:
Simultaneous recognition of multiple sound sources based on 3-d n-best search using microphone array. 69-72 - Hynek Hermansky, Pratibha Jain:
Down-sampling speech representation in ASR. 73-76 - Dusan Macho, Climent Nadeu, Peter Jancovic, Gregor Rozinaj, Javier Hernando:
Comparison of time & frequency filtering and cepstral-time matrix approaches in ASR. 77-80 - Hugo Meinedo, João Paulo Neto, Luís B. Almeida:
Syllable onset detection applied to the Portuguese language. 81-84 - Kuldip K. Paliwal:
Decorrelated and liftered filter-bank energies for robust speech recognition. 85-88 - Pau Pachès-Leal, Richard C. Rose, Climent Nadeu:
Optimization algorithms for estimating modulation spectrum domain filters. 89-92 - Rubén San Segundo, Ricardo de Córdoba, Javier Ferreiros, Ascensión Gallardo-Antolín, José Colás, Julio Pastor, Y. López:
Efficient vector quantization using an n-path binary tree search algorithm. 93-96 - Narada D. Warakagoda, Magne Hallstein Johnsen:
Neural network based optimal feature extraction for ASR. 97-100 - Fumihiro Yato, Naomi Inoue, Kazuo Hashimoto:
A study of speech recognition for the elderly. 101-104 - Jie Zhu, Feili Chen:
The analysis and application of a new endpoint detection method based on distance of autocorrelated similarity. 105-108
Articulatory Measurements and Modelling
- Denis Beautemps, Pascal Borel, Sébastien Manolios:
Hyper-articulated speech: auditory and visual intelligibility. 109-112 - Olov Engwall:
Modeling of the vocal tract in three dimensions. 113-116 - Miriam Kienast, Astrid Paeschke, Walter F. Sendlmeier:
Articulatory reduction in emotional speech. 117-120 - Tokihiko Kaburagi, Masaaki Honda, Takeshi Okadome:
A trajectory formation model of articulatory movements using a multidimensional phonemic task. 121-124 - Sacha Krstulovic:
LPC-based inversion of the DRM articulatory model. 125-128 - Nobuhiro Miki, Thoru Yokoyama, Takeshi Ohtani, Shinobu Masaki, Ikuhiro Shimada, Ichiro Fujimoto, Yuji Nakamura:
A vocal tract model using multi-line equivalent circuits. 129-132 - Masahiro Matsuda, Hideki Kasuya:
Acoustic nature of the whisper. 133-136 - Takeshi Okadome, Tokihiko Kaburagi, Masaaki Honda:
Relations between utterance speed and articulatory movements. 137-140 - Slim Ouni, Yves Laprie:
Design of hypercube codebooks for the acoustic-to-articulatory inversion respecting the non-linearities of the articulatory-to-acoustic mapping. 141-144 - Marie Owens, Anja Kürger, Paul Gerard Donnelly, Francis Jack Smith, Ji Ming:
A missing-word test comparison of human and statistical language model performance. 145-148 - Korin Richmond:
Estimating velum height from acoustics during continuous speech. 149-152 - Carlos Silva, Samir Chennoukh, Isabel Trancoso:
On improving the decision algorithm for articulatory codebook search. 153-156 - Georg Thimm, Juergen Luettin:
Extraction of articulators in x-ray image sequences. 157-160 - António J. S. Teixeira, Francisco A. C. Vaz, José Carlos Príncipe:
Effects of source-tract interaction in perception of nasality. 161-164 - Béatrice Vaxelaire, Rudolph Sock, Véronique Hecker:
Perceiving anticipatory phonetic gestures in French. 165-168 - Anne Vilain, Christian Abry, Pierre Badin:
Motor equivalence evidenced by articulatory modelling. 169-172
First and Second Language Learning
- Febe de Wet, Catia Cucchiarini, Helmer Strik, Lou Boves:
Using likelihood ratios to perform utterance verification in automatic pronunciation assessment. 173-176 - Goh Kawai, Carlos Toshinori Ishi:
A system for learning the pronunciation of Japanese pitch accent. 177-182 - Jan Nouza:
Computer-aided spoken-language training with enhanced visual and auditory feedback. 183-186 - Zhanjiang Song, Fang Zheng, Mingxing Xu, Wenhu Wu:
An effective scoring method for speaking skill evaluation system. 187-190 - Conception Santiago-Oriola:
Vocal synthesis in a computerized dictation exercise. 191-194 - Isabel Trancoso, Céu Viana, Isabel Mascarenhas, Carlos Teixeira:
On deriving rules for nativised pronunciation in navigation queries. 195-198 - François Yvon:
Pronouncing unknown words using multi-dimensional analogies. 199-202 - Maria Laczko:
Characteristic features of planning of speech and production of secondary schoolchildren's spontaneous speech. 202
Speech Recognition - Adaptation 2
- William Byrne, Asela Gunawardana:
Discounted likelihood linear regression for rapid adaptation. 203-206 - Jen-Tzung Chien, Jean-Claude Junqua, Philippe Gelin:
Extraction of reliable transformation parameters for unsupervised speaker adaptation. 207-210 - Cristina Chesta, Olivier Siohan, Chin-Hui Lee:
Maximum a posteriori linear regression for hidden Markov model adaptation. 211-214 - Ramalingam Hariharan, Olli Viikki:
On combining vocal tract length normalisation and speaker adaptation for noise robust speech recognition. 215-218 - Jörg Rottland, Christoph Neukirchen, Daniel Willett, Gerhard Rigoll:
Speaker adaptation using regularization and network adaptation for hybrid MMI-NN/HMM speech recognition. 219-222
Prosody - Prosodic Phrasing and Interruptions
- Kai Alter, Annett Schirmer, Sonja A. Kotz, Angela D. Friederici:
Prosodic phrasing and accentuation in speech production of patients with right hemisphere lesions. 223-226 - Masataka Goto, Katunobu Itou, Satoru Hayamizu:
A real-time filled pause detection system for spontaneous speech recognition. 227-230 - Koji Iwano:
Prosodic word boundary detection using mora transition modeling of fundamental frequency contours -speaker independent experiments-. 231-234 - Volker Warnke, Florian Gallwitz, Anton Batliner, Jan Buckow, Richard Huber, Elmar Nöth, A. Höthker:
Integrating multiple knowledge sources for word hypotheses graph interpretation. 235-238 - Li-chiung Yang:
Prosodic correlates of interruptions in spoken dialogue. 239-242
Assessment
- Paul C. Constantinides, Alexander I. Rudnicky:
Dialog analysis in the Carnegie Mellon Communicator. 243-246 - Jon Fiscus, George R. Doddington, John S. Garofolo, Alvin F. Martin:
NIST's 1998 topic detection and tracking evaluation (TDT2). 247-250 - Gerit P. Sonntag, Thomas Portele, Felicitas Haas, Joachim Köhler:
Comparative evaluation of six German TTS systems. 251-254 - Herman J. M. Steeneken:
Standardisation of ergonomic assessment of speech communication. 255-258 - Noriko Suzuki, Yugo Takeuchi, Kazuo Ishii, Michio Okada:
Evaluation of affiliation in interaction with autonomous creatures. 259-262
Speech Recognition - Confidence Measures 2
- Josef G. Bauer, Jochen Junkawitsch:
Accurate recognition of city names with spelling as a fall back strategy. 263-266 - Katarina Bartkova, Denis Jouvet:
Selective prosodic post-processing for improving recognition of French telephone numbers. 267-270 - Eric I. Chang:
Improving rejection with semantic slot-based confidence scores. 271-274 - K. Davies, Robert E. Donovan, Mark Epstein, Martin Franz, Abraham Ittycheriah, Ea-Ee Jan, Jean-Michel LeRoux, David M. Lubensky, Chalapathy Neti, Mukund Padmanabhan, Kishore Papineni, Salim Roukos, Andrej Sakrajda, Jeffrey S. Sorensen, Borivoj Tydlitát, Todd Ward:
The IBM conversational telephony system for financial applications. 275-278 - Rachida El Méliani, Douglas D. O'Shaughnessy:
Error spotting using syllabic fillers in spontaneous conversational speech recognition. 279-282 - Denis Jouvet, Jean Monné:
Recognition of spelled names over the telephone and rejection of data out of the spelling lexicon. 283-286 - Myoung-Wan Koo, Sun-Jeong Lee:
An utterance verification system based on subword modeling for a vocabulary independent speech recognition system. 287-290 - Nicolas Moreau, Denis Jouvet:
Use of a confidence measure based on frame level likelihood ratios for the rejection of incorrect data. 291-294 - Javier Macías Guarasa, Javier Ferreiros, Ascensión Gallardo-Antolín, Rubén San Segundo, José Manuel Pardo, Luis Villarrubia Grande:
Variable preselection list length estimation using neural networks in a telephone speech hypothesis-verification system. 295-298 - Thilo Pfau, Robert Faltlhauser, Günther Ruske:
Speaker normalization and pronunciation variant modeling: helpful methods for improving recognition of fast speech. 299-302 - Richard C. Rose, Giuseppe Riccardi:
Automatic speech recognition using acoustic confidence conditioned language models. 303-306 - Volker Strom, Henrik Heine:
Utilizing prosody for unconstrained morpheme recognition. 307-310 - Andreas Stolcke, Elizabeth Shriberg, Dilek Hakkani-Tür, Gökhan Tür:
Modeling the prosody of hidden events for improved word recognition. 311-314 - Frank Wessel, Klaus Macherey, Hermann Ney:
A comparison of word graph and n-best list based confidence measures. 315-318
Speech Analysis and Tools
- Marcus M. Prätzas, Ulrich Balss, Herbert Reininger, Harald Wüst:
C++ software environment for speech signal processing. 319-322 - Kun Ma, Pelin Demirel, Carol Y. Espy-Wilson, Joel MacAuslan:
Improvement of electrolaryngeal speech by introducing normal excitation information. 323-326 - Abraham Ittycheriah, Richard J. Mammone:
Detecting user speech in barge-in over prompts using speaker identification methods. 327-330 - Boris Lobanov, T. Levkovskaya, Igor E. Kheidorov:
Speaker and channel-normalized set of formant parameters for telephone speech recognition. 331-334 - Alan Wee-Chung Liew, K. L. Sum, S. H. Leung, Wing Hong Lau:
Fuzzy segmentation of lip image using cluster analysis. 335-338 - Michael F. McTear:
Software to support research and development of spoken dialogue systems. 339-342 - Sachin S. Kajarekar, Narendranath Malayath, Hynek Hermansky:
Analysis of sources of variability in speech. 343-346 - Tetsuya Shimamura, Haruko Hayakawa:
Adaptive nonlinear prediction based on order statistics for speech signals. 347-350 - Márcio N. de Souza, E. J. Caprini, Glaucio J. Couri Machado, M. V. Ludolf, Luiz Pereira Calôba, José Manoel de Seixas, Fernando Gil Resende, Sergio L. Netto, Diamantino Rui da Silva Freitas, João Paulo Ramos Teixeira, Carlos Espain, Vitor Pera, F. Moreira:
Developing a voiced information retrieval system for the Portuguese language capable to handle both Brazilian and Portuguese spoken versions. 351-354 - John J. Soraghan, Amir Hussain, Ivy Shim:
Real-time speech modeling using computationally efficient locally recurrent neural networks (CERNs). 355-358 - M. Tokuhira, Yasuo Ariki:
Effectiveness of KL-transformation in spectral delta expansion. 359-362
Language Identification
- Kay Berkling, Douglas A. Reynolds, Marc A. Zissman:
Evaluation of confidence measures for language identification. 363-366 - Wuei-He Tsai, Wen-Whei Chang:
Chinese dialect identification using an acoustic-phonotactic model. 367-370 - Fred A. Cummins, Felix A. Gers, Jürgen Schmidhuber:
Language identification from prosody without explicit features. 371-374 - Stefan Harbeck, Uwe Ohler:
Multigrams for language identification. 375-378 - Jean-Marie Hombert, Ian Maddieson:
The use of 'rare' segments for language identification. 379-382 - Shuichi Itahashi, Toshikazu Kiuchi, Mikio Yamamoto:
Spoken language identification utilizing fundamental frequency and cepstra. 383-386 - Driss Matrouf, Martine Adda-Decker, Jean-Luc Gauvain, Lori Lamel:
Comparing different model configurations for language identification using a phonotactic approach. 387-390 - Kazuya Mori, N. Toba, T. Harada, Takayuki Arai, Masahiko Komatsu, Makiko Aoyagi, Yuji Murahara:
Human language identification with reduced spectral information. 391-394 - Melissa Barkat, John J. Ohala, François Pellegrino:
Prosody as a distinctive feature for the discrimination of Arabic dialects. 395-398 - François Pellegrino, Jérôme Farinas, Régine André-Obrecht:
Comparison of two phonetic approaches to language identification. 399-402
Speech Recognition - Speaking Rate
- Stephen W. Anderson, Natalie Liberman, Larry Gillick, Stephen Foster, Sahoko Hama:
The effects of speaker training on ASR accuracy. 403-406 - Robert Faltlhauser, Thilo Pfau, Günther Ruske:
Creating hidden Markov models for fast speech by optimized clustering. 407-410 - Matthew Richardson, Mei-Yuh Hwang, Alex Acero, Xuedong Huang:
Improvements on speech recognition for fast talkers. 411-414 - Lawrence K. Saul, Mazin G. Rahim:
Modeling the rate of speech by Markov processes on curves. 415-418 - Andreas Tuerk, Steve J. Young:
Modelling speaking rate using a between frame distance metric. 419-422
Speech Acoustics
- Gerrit Bloothooft, Peter Pabon:
Vocal registers revisited. 423-426 - Franck Bouteille, Pascal Scalart, Michel Corazza:
Pseudo affine projection algorithm new solution for adaptive identification. 427-430 - Luis M. T. Jesus, Christine H. Shadle:
Acoustic analysis of a speech corpus of European Portuguese fricative consonants. 431-434 - Natalia Petlyuchenko:
Acoustic characteristics of plosives in consonant-consonant sequences at word boundaries. 435-438 - R. J. J. H. van Son, Louis C. W. Pols:
Effects of stress and lexical structure on speech efficiency. 439-442
Speech Recognition - Search and Pronunciation Modelling
- Yoshiharu Abe, Hiroyasu Itsui, Yuzo Maruta, Kunio Nakajima:
A two-stage speech recognition method with an error correction model. 443-446 - C. Julian Chen:
Speech recognition with automatic punctuation. 447-450 - Ellen Eide:
Automatic modeling of pronunciation variations. 451-454 - Martin Franz, Miroslav Novak:
Reducing search complexity in low perplexity tasks. 455-458 - Paolo Coletti, Marcello Federico:
A two-stage speech recognition method for information retrieval applications. 459-462 - Eric Fosler-Lussier:
Multi-level decision trees for static and dynamic pronunciation models. 463-466 - Michael Finke, Jürgen Fritsch, Detlef Koll, Alex Waibel:
Modeling and efficient decoding of large vocabulary conversational speech. 467-470 - Jean-Luc Husson:
Evaluation of a segmentation system based on multi-level lattices. 471-474 - Philip Hanna, Darryl Stewart, Ji Ming:
The application of an improved DP match for automatic lexicon generation. 475-478 - Rukmini Iyer, Owen Kimball, Herbert Gish:
Modeling trajectories in the HMM framework. 479-482 - Oh-Wook Kwon, Kyuwoong Hwang, Jun Park:
Korean large vocabulary continuous speech recognition using pseudomorpheme units. 483-486 - Harouna Kabré, Alexander Waibel:
Navigating German cities by spontaneous French queries. 487-490 - Filipp Korkmazskiy, Chin-Hui Lee:
Generating alternative pronunciations from a dictionary. 491-494 - Lidia Mangu, Eric Brill, Andreas Stolcke:
Finding consensus among words: lattice-based word error minimization. 495-498 - Stefan Ortmanns, Wolfgang Reichl, Wu Chou:
An efficient decoding method for real time speech recognition. 499-502 - Mukund Padmanabhan, George Saon, Sankar Basu, Jing Huang, Geoffrey Zweig:
Recent improvements in voicemail transcription. 503-506 - Bhuvana Ramabhadran, Sabine Deligne, Abraham Ittycheriah:
Acoustics-based baseform generation with pronunciation and/or phonotactic models. 507-510 - Yasuo Shirosaki, Hideaki Kikuchi, Katsuhiko Shirai:
Improving recognition correct rate of important words in large vocabulary speech recognition. 511-514 - Murat Saraclar, Harriet J. Nock, Sanjeev Khudanpur:
Pronunciation modeling by sharing Gaussian densities across phonetic models. 515-518 - Xavier L. Aubert:
One pass cross word decoding for large vocabularies based on a lexical tree search organization. 1559-1562
Prosody - Stress, Accent and Prominence Phrasing
- Anton Batliner, M. Nutt, Volker Warnke, Elmar Nöth, Jan Buckow, Richard Huber, Heinrich Niemann:
Automatic annotation and classification of phrase accents in spontaneous speech. 519-522 - Alistair Conkie, Giuseppe Riccardi, Richard C. Rose:
Prosody recognition from speech utterances using acoustic and linguistic based models of prosodic events. 523-526 - Marcus L. Fach:
A comparison between syntactic and prosodic phrasing. 527-530 - Barbara Gili Fivela:
The prosody of left-dislocated topic constituents in Italian read speech. 531-534 - Jürgen Haas, Volker Warnke, Heinrich Niemann, Mauro Cettolo, Anna Corazza, Daniele Falavigna, Gianni Lazzari:
Semantic boundaries in multiple languages. 535-538 - Yeon-Jun Kim, Heo-Jin Byeon, Yung-Hwan Oh:
Prosodic phrasing in Korean, determine governor, and then split or not. 539-542 - Joachim Mersdorf, Kai U. Schmidt, Stefanie Köster:
Linear prediction coding of individual pitch accent shapes. 543-546 - Christine H. Nakatani:
Prominence variation beyond given/new. 547-550 - Barbertje M. Streefkerk, Louis C. W. Pols, Louis ten Bosch:
Acoustical features as predictors for prominence in read aloud Dutch sentences used in ANNs. 551-554 - Mariët Theune:
Parallelism, coherence, and contrastive accent. 555-558
Speech Disorders & Speech for Disabled
- Anne Bonneau, Parham Mokhtari:
A phonetically-guided diagnosis of auditory deficiency based on synthetic speech stimuli. 559-562 - Juan Ignacio Godino-Llorente, Santiago Aguilera-Navarro, Carlos Hernández-Espinosa, Mercedes Fernández-Redondo, Pedro Gómez Vilda:
On the selection of meaningful speech parameters used by a pathologic/non pathologic voice register classifier. 563-566 - Erik Harborg, Trym Holter, Magne Hallstein Johnsen, Torbjørn Svendsen:
On-line captioning of TV-programs for the hearing impaired. 567-570 - Cheol-Woo Jo, Dae-Hyun Kim:
Classification of pathological voice into normal/benign/malignant state. 571-574 - Ichiro Maruyama, Yoshiharu Abe, Eiji Sawamura, Tetsuo Mitsuhashi, Terumasa Ehara, Katsuhiko Shirai:
Cognitive experiments on timing lag for superimposing closed captions. 575-578 - Marcel Ogner, Zdravko Kacic:
Speaker normalization for audio-visual articulation training. 579-582 - Tatjana Prizl-Jakovac:
Vowel production in aphasia. 583-586
Speech Recognition - Multi-stream ASR
- Christophe Cerisara, Jean Paul Haton, Dominique Fohr:
Towards a global optimization scheme for multi-band speech recognition. 587-590 - Adam Janin, Dan Ellis, Nelson Morgan:
Multi-stream speech recognition: ready for prime time? 591-594 - Nikki Mirghafori, Nelson Morgan:
Sooner or later: exploring asynchrony in multi-band speech recognition. 595-598 - Andrew C. Morris, Astrid Hagen, Hervé Bourlard:
The full combination sub-bands approach to noise robust HMM/ANN based ASR. 599-602 - Shigeki Okawa, Takehiro Nakajima, Katsuhiko Shirai:
A recombination strategy for multi-band speech recognition based on mutual information criterion. 603-606
Speech Generation and Synthesis - Concatenation
- Mark C. Beutnagel, Mehryar Mohri, Michael Riley:
Rapid unit selection from a large speech corpus for concatenative speech synthesis. 607-610 - Jing-Dong Chen, Nick Campbell:
Objective distance measures for assessing concatenative speech synthesis. 611-614 - Eric Lewis, Mark Tatham:
Word and syllable concatenation in text-to-speech synthesis. 615-618 - Karlheinz Stöber, Thomas Portele, Petra Wagner, Wolfgang Hess:
Synthesis by word concatenation. 619-622 - Paul Taylor, Alan W. Black:
Speech synthesis by phonological structure matching. 623-626
Speech Communication Education
- Gerrit Bloothooft:
The implementation of a European masters in language and speech. 627-630 - Martin Cooke, Helen Parker, Guy J. Brown, Stuart N. Wrigley:
The interactive auditory demonstrations project. 631-634 - Michael F. McTear:
Curricula and courseware in spoken language engineering in Europe: a critical appraisal. 635-638 - Rüdiger Hoffmann, Bettina Ketzmerick, Ulrich Kordon, Steffen Kürbis:
An interactive tutorial on text-to-speech synthesis from diphones in time domain. 639-642 - Pernilla Qvarfordt, Arne Jönsson:
Evaluating the dialogue component in the GULAN educational system. 643-646
Speech Recognition - Broadcast News
- Peter Beyerlein, Xavier L. Aubert, Reinhold Haeb-Umbach, Matthew Harris, Dietrich Klakow, Andreas Wendemuth, Sirko Molau, Michael Pitz, Achim Sixtus:
The Philips/RWTH system for transcription of broadcast news. - Jason Davenport, Long Nguyen, Spyros Matsoukas, Richard M. Schwartz, John Makhoul:
Toward realtime transcription of broadcast news. - Jean-Luc Gauvain, Lori Lamel, Gilles Adda, Michèle Jardino:
Recent advances in transcribing television and radio broadcasts. - Photina Jaeyun Jang, Alexander G. Hauptmann:
Selection for acoustic coverage from unlimited speech extracted from closed-captioned TV. - Paul E. Kennedy, Alexander G. Hauptmann:
Laughter extracted from television closed captions as speech recognizer training data. - Long Nguyen, Spyros Matsoukas, Jason Davenport, Daben Liu, Jay Billa, Francis Kubala, John Makhoul:
Further advances in transcription of broadcast news. - Katsutoshi Ohtsuki, Sadaoki Furui, Naoyuki Sakurai, Atsushi Iwasaki, Zhipeng Zhang:
Recent advances in Japanese broadcast news transcription. - Michael Pitz, Sirko Molau:
Automatic verification of broadcast news transcriptions. - Alain Tritschler, Ramesh A. Gopinath:
Improved speaker segmentation and segments clustering using the Bayesian information criterion. - Xintian Wu, Yonghong Yan:
Development of the 1998 OGI-FONIX broadcast news transcription system. - Gethin Williams, Daniel P. W. Ellis:
Speech/music discrimination based on posterior probability features. - Steven Wegmann, Puming Zhan, Ira Carp, Michael Newman, Jon Yamron, Larry Gillick:
Dragon systems' 1998 broadcast news transcription system. - Hua Yu, Michael Finke, Alex Waibel:
Progress in automatic meeting transcription.
Prosody - Temporal and/or Intonational Features
- Christel Brindöpke, Gernot A. Fink, Franz Kummert:
A comparative study of HMM-based approaches for the automatic recognition of perceptually relevant aspects of spontaneous German speech melody. 699-710 - Grazyna Demenko, Wiktor Jassem:
Modelling intonational phrase structure with artificial neural networks. 711-714 - Danielle Duez:
Effects of articulation rate on duration in read French speech. 715-718 - Jorge A. Gurlekian, Marcela Leticia Riccillo, Alejandro Renato, José A. Alvarez:
A semi automatic method for the characterization of Spanish intonation contours. 719-722 - Hesham Tolba, Douglas D. O'Shaughnessy:
Towards recognizing "non-lexical" words in spontaneous conversational speech. 723-726 - Mitsuaki Isogai, Hideyuki Mizuno:
A new F0 contour control method based on vector representation of F0 contour. 727-730 - Jana Klecková:
Developing the database of the spontaneous speech prosody characteristics. 731-734 - Gregor Möhler, Jörg Mayer:
A method for the analysis of prosodic registers. 735-738 - Natalia Smirnova:
Whole tunes, nuclear and pre-nuclear patterns and prosodic features in the perception of interrogativity and non-finality in Dutch. 739-742 - Wern-Jun Wang, Yuan-Fu Liao, Sin-Horng Chen:
Prosodic modeling of Mandarin speech and its application to lexical decoding. 743-746 - Jin-Song Zhang, Hiromichi Kawanami:
Modeling carryover and anticipation effects for Chinese tone recognition. 747-750
Speaker Recognition - Acoustic Features and Robustness
- Laurent Besacier, Juergen Luettin, Gilbert Maître, Eric Meurville:
Experimental evaluation of text-independent speaker verification on laboratory and field test databases in the M2VTS project. 751-754 - Rajesh Balchandran, Vidhya Ramanujam, Richard J. Mammone:
Channel estimation and normalization by coherent spectral averaging for robust speaker verification. 755-758 - Ivan Magrin-Chagnolleau, Geoffrey Durou:
Time-frequency principal components of speech: application to speaker identification. 759-762 - Marcos Faúndez-Zanuy:
Speaker recognition by means of a combination of linear and nonlinear predictive models. 763-766 - Gil-Jin Jang, Seong-Jin Yun, Yung-Hwan Oh:
Feature vector transformation using independent component analysis and its application to speaker identification. 767-770 - Yizhar Lavner, Judith Rosenhouse, Isak Gath:
The prototype model in speaker identification. 771-774 - T. F. Lo, Man-Wai Mak, Kwok-Kwong Yiu:
A new cepstrum-based channel compensation method for speaker verification. 775-778 - Chiyomi Miyajima, Hideyuki Watanabe, Tadashi Kitamura, Shigeru Katagiri:
Speaker recognition based on discriminative feature extraction - optimization of mel-cepstral features using second-order all-pass warping function. 779-782 - Javier Ortega-Garcia, Santiago Cruz-Llanas, Joaquin Gonzalez-Rodriguez:
Facing severe channel variability in forensic speaker verification conditions. 783-786 - Thomas F. Quatieri, Elliot Singer, Robert B. Dunn, Douglas A. Reynolds, Joseph P. Campbell:
Speaker and language recognition using speech codec parameters. 787-790 - Vidhya Ramanujam, Rajesh Balchandran, Richard J. Mammone:
Robust speaker verification in noisy conditions by modification of spectral time trajectories. 791-794 - Rivarol Vergin, Douglas D. O'Shaughnessy, Pierre Dumouchel:
Toward parametric representation of speech for speaker recognition systems. 795-798 - Ran D. Zilca, Yuval Bistritz:
Text independent speaker identification using LSP codebook speaker models and linear discriminant functions. 799-802
Speech Recognition - Large Vocabulary Continuous Speech Recognition (LVCSR)
- Chiwei Che, Nick J.-C. Wang, Max Huang, Hank Huang, Frank Seide:
Development of the Philips 1999 Taiwan Mandarin benchmark system. 803-806 - Andrej Ljolje, Michael D. Riley, Donald Hindle:
The AT&T large vocabulary conversational speech recognition system. 807-810 - Mehryar Mohri, Michael Riley:
Integrated context-dependent networks in very large vocabulary speech recognition. 811-814 - Jürgen Reichert, Tanja Schultz, Alex Waibel:
Mandarin large vocabulary speech recognition using the GlobalPhone database. 815-818 - Fang Zheng, Zhanjiang Song, Mingxing Xu, Jian Wu, Yinfei Huang, Wenhu Wu, Cheng Bi:
Easytalk: a large-vocabulary speaker-independent Chinese dictation machine. 819-822
Speech Generation and Synthesis - Systems and Evaluation
- Susan Fitt, Stephen Isard:
Synthesis of regional English using a keyword lexicon. 823-826 - Noriyasu Maeda, Hideki Banno, Shoji Kajita, Kazuya Takeda, Fumitada Itakura:
Speaker conversion through non-linear frequency warping of STRAIGHT spectrum. 827-830 - Fergus R. McInnes, David Attwater, Michael D. Edgington, Mark S. Schmidt, Mervyn A. Jack:
User attitudes to concatenated natural speech and text-to-speech synthesis in an automated information service. 831-834 - Christof Traber, Karl Huber, Karim Nedir, Beat Pfister, Eric Keller, Brigitte Zellner:
From multilingual to polyglot speech synthesis. 835-838 - Kimihito Tanaka, Hideyuki Mizuno, Masanobu Abe, Shin'ya Nakajima:
A Japanese text-to-speech system based on multi-form units with consideration of frequency distribution in Japanese. 839-842
Speech Technology for Language Learning
- G. Deville, Olivier Deroo, Henri Leich, Stan C. A. M. Gielen, Johan Vanparys:
Automatic detection and correction of pronunciation errors for foreign language learners: the Demosthenes application. 843-846 - Maxine Eskénazi, Scott Hansma, John Corwin, Jordi Albornoz:
User adaptation in the fluency pronunciation trainer. 847-850 - Horacio Franco, Leonardo Neumeyer, María Ramos, Harry Bratt:
Automatic detection of phone-level mispronunciation for language learning. 851-854 - Daniel Herron, Wolfgang Menzel, Eric Atwell, Roberto Bisiani, Fabio Daneluzzi, Rachel Morton, Juergen A. Schmidt:
Automatic localization and diagnosis of pronunciation errors for second-language learners of English. 855-858 - Klára Vicsi, Peter Roach, Anne-Marie Öster, Zdravko Kacic, Peter Barczikay, I. Sinka:
SPECO - a multimedia multilingual teaching and training system for speech handicapped children. 859-862
Speech Recognition - Multilinguality
- S. M. Ahadi:
Recognition of continuous Persian speech using a medium-sized vocabulary speech corpus. 863-866 - Tibor Fegyó, Péter Tatai:
Multi-lingual speech recognition based on demi-syllable subword units. 867-870 - Pascale Fung, Ma Chi Yuen, Wai Kat Liu:
MAP-based cross-language adaptation augmented by linguistic knowledge: from English to Chinese. 871-874 - Stefan Grocholewski:
Analysis of HMM models in alphabet letters recognition. 875-878 - Keikichi Hirose, Jinsong Zhang:
Tone recognition of Chinese continuous speech using tone critical segments. 879-882 - Tai-Hsuan Ho, Chin-Jung Liu, Herman Sun, Ming-Yi Tsai, Lin-Shan Lee:
Phonetic state tied-mixture tone modeling for large vocabulary continuous Mandarin speech recognition. 883-886 - Bojan Imperl, Bogomir Horvat:
The clustering algorithm for the definition of multilingual set of context dependent speech models. 887-890 - Jian Liu, Xiaodong He, Fuyuan Mo, Tiecheng Yu:
Study on tone classification of Chinese continuous speech in speech recognition system. 891-894 - Yi Liu, Pascale Fung:
Decision tree-based triphones are robust and practical for Mandarin speech recognition. 895-898 - Karmele López de Ipiña, Amparo Varona, Inés Torres, Luis Javier Rodríguez:
Decision trees for inter-word context dependencies in Spanish continuous speech recognition tasks. 899-902 - Amin M. Nassar, Nemat S. Abdel Kader, Amr M. Refat:
End points detection for noisy speech using a wavelet based algorithm. 903-906 - Christoph Nieuwoudt, Elizabeth C. Botha:
Adaptation of acoustic models for multilingual recognition. 907-910 - Ulla Uebler, Manuela Boros:
Recognition of non-native German speech with multilingual recognizers. 911-914
Systems, Architectures, Interfaces
- Toomas Altosaar, J. Bruce Millar, Martti Vainio:
Relational vs. object-oriented models for representing speech: a comparison using ANDOSL data. 915-918 - Christoph Draxler, Robert Grudszus, Stephan Euler, Klaus Bengler:
First experiences of the German SpeechDat-Car database collection in mobile environments. 919-922 - Mike Edgington, David Attwater, Peter J. Durston:
OASIS - a framework for spoken language call steering. 923-926 - Eike Gegenmantel:
VOCAPI - small standard API for command & control. 927-930 - Christel Müller, Karsten Schröder:
Standardised speech interfaces - key for objective evaluation of recognition accuracy. 931-934 - Shoichi Matsunaga, Yoshiaki Noda, Katsutoshi Ohtsuki, Eiji Doi, Tomio Itoh:
A medical rehabilitation diagnoses transcription method that integrates continuous and isolated word recognition. 935-938 - Géza Németh, Csaba Zainkó, Gábor Olaszy, Gábor Prószéky:
Problems of creating a flexible e-mail reader for Hungarian. 939-942 - Gábor Olaszy, Géza Németh, Péter Olaszi, Géza Gordos:
Interactive, TTS supported speech message composer for large, limited vocabulary, but open information systems. 943-946 - Gerald Penn, Bob Carpenter:
ALE for speech: a translation prototype. 947-950 - Luis Javier Rodríguez, M. Inés Torres, José M. Alcaide, Amparo Varona, Karmele López de Ipiña, Mikel Peñagarikano, Germán Bordel:
An integrated system for Spanish CSR tasks. 951-954 - Angelien Sanderman, Ellen Bosgoed, Hans de Graaff, Peter van Splunder:
Use of speech synthesis in an application. 955-958 - Masatsune Tamura, Shigekazu Kondo, Takashi Masuko, Takao Kobayashi:
Text-to-audio-visual speech synthesis based on parameter generation from HMM. 959-962 - Johan Wouters, Brian Rundle, Michael W. Macon:
Authoring tools for speech synthesis using the SABLE markup standard. 963-966
Speaker Recognition - Scoring and Decision
- Aladdin M. Ariyaeeinia, P. Sivakumaran, M. Pawlewski, Martin J. Loomes:
Dynamic weighting of the distortion sequence in text-dependent speaker verification. 967-970 - Hakan Altinçay, Mübeccel Demirekler:
On the use of supra model information from multiple classifiers for robust speaker identification. 971-974 - Mounir El-Maliki, Andrzej Drygajlo:
Missing features detection and handling for robust speaker verification. 975-978 - Nikos Fakotakis, John Sirigos, George Kokkinakis:
High performance text-independent speaker recognition system based on voiced/unvoiced segmentation and multiple neural nets. 979-982 - Corinne Fredouille, Jean-François Bonastre, Téva Merlin:
Similarity normalization method based on world model and a posteriori probability for speaker verification. 983-986 - Toshihiro Isobe, Jun-ichi Takahashi:
Text-independent speaker verification using virtual speaker based cohort normalization. 987-990 - Juergen Luettin, S. Ben-Yacoub:
Robust person verification based on speech and facial images. 991-994 - M. Mathew, B. Yegnanarayana, R. Sundar:
A neural network-based text-dependent speaker verification system using suprasegmental features. 995-998 - Jason W. Pelecanos, Sridha Sridharan:
Modelling output probability distributions for enhancing speaker recognition. 999-1002 - Leandro Rodríguez Liñares, Carmen García-Mateo, José Luis Alba-Castro:
On the use of neural networks to combine utterance and speaker verification systems in a text-dependent speaker verification task. 1003-1006 - Belén Ruíz-Mezcua, R. Rodríguez-Galán, Luis A. Hernández Gómez, Paloma Domingo-García, Enrique Bailly-Baillicre Gutiérrez:
Genesys: a neural network model for speaker identification. 1007-1010 - Bogdan Sabac, Inge Gavat:
Speaker verification with growing cell structures. 1011-1014 - Chakib Tadj, Pierre Dumouchel, Mohamed Mihoubi, Pierre Ouellet:
Environment adaptation and long term parameters in speaker identification. 1015-1018 - Kenichi Yoshida, Kazuyuki Takagi, Kazuhiko Ozeki:
Speaker identification using subband HMMs. 1019-1022 - W. D. Zhang, Kwok-Kwong Yiu, Man-Wai Mak, C. K. Li, M. X. He:
A priori threshold determination for phrase-prompted speaker verification. 1023-1026
Speech Recognition - Broadcast News
- Matthew Harris, Xavier L. Aubert, Reinhold Haeb-Umbach, Peter Beyerlein:
A study of broadcast news audio stream segmentation and segment clustering. 1027-1030 - Daben Liu, Francis Kubala:
Fast speaker change detection for broadcast news transcription and indexing. 1031-1034 - David D. Palmer, Mari Ostendorf, John D. Burger:
Robust information extraction from spoken language data. 1035-1038 - Steve Renals, Yoshihiko Gotoh:
Integrated transcription and identification of named entities in broadcast speech. 1039-1042 - Philip C. Woodland, J. J. Odell, Thomas Hain, Gareth L. Moore, Thomas Niesler, Andreas Tuerk, Edward W. D. Whittaker:
Improvements in accuracy and speed in the HTK broadcast news transcription system. 1043-1046
Speech Generation and Synthesis - Acoustic Synthesis
- Alex Acero:
Formant analysis and synthesis using hidden Markov models. 1047-1050 - Gérard Bailly:
Accurate estimation of sinusoidal parameters in an harmonic+noise model for speech synthesis. 1051-1054 - Unto K. Laine:
Modal synthesis and modeling of vowels. 1055-1058 - Darragh O'Brien, Alex I. C. Monaghan:
Shape invariant pitch modification of speech using a harmonic model. 1059-1062 - Mark C. Beutnagel, Alistair Conkie:
Interaction of units in a unit selection database. 1063-1066
Disorders in Speech Production and/or Speech Perception
- Ramón García Gómez, Ricardo López Barquilla, José Ignacio Puertas Tera, José Parera Bermudez, Marie-Christine Haton, Jean Paul Haton, Pierre Alinat, Sofia Moreno, Wolfgang Hess, Ma Araceli Sanchez Raya, Eduardo Alberto Martínez Gual, Juan Luis Navas-Chaveli Daza, Christophe Antoine, Marie-Madeleine Durel, Genevieve Maugin, Silke Hohmann:
Speech training for deaf and hearing-impaired people. 1067-1070 - Shinichi Hoshino, Itaru Kaneko, Hideaki Kikuchi, Katsuhiko Shirai:
A post-processing of speech for the hearing impaired integrated into standard digital audio decoders. 1071-1074 - Setsuko Imatomi, Takayuki Arai, Yuko Mimura, Masako Kato:
Effects of hoarseness on hypernasality ratings. 1075-1078 - N. Rezaei-Aghbash, Sandra P. Whiteside, P. A. Cudd:
Cross-language analysis of voice onset time in stuttered speech. 1079-1082
Speech Recognition - Acoustic Modelling 1
- Gilles Boulianne, Julie Brousseau, Nathalie Talbot, Pierre Dumouchel:
Experiments in constrained maximum likelihood extraction of temporal features for speech recognition. - Scott Saobing Chen, Ramesh A. Gopinath:
Model selection in acoustic modeling. - Yiu Wing Wong, Ka-Fai Chow, Wai H. Lau, Wai Kit Lo, Tan Lee, Pak-Chung Ching:
Acoustic modeling and language modeling for Cantonese LVCSR. - Olivier Deroo, Christophe Ris, Stéphane Dupont:
Context dependent hybrid HMM/ANN systems for large vocabulary continuous speech recognition system. - V. Fischer, T. Roß:
Reduced Gaussian mixture models in a large vocabulary continuous speech recognizer. - J. Fritsch:
Mixture trees - hierarchically tied mixture densities for modeling HMM emission probabilities. - Akira Ichikawa, Tomoyuki Shimizu, Yasuo Horiuchi:
Reinforcement learning for phoneme recognition. - Paul M. McCourt, Naomi Harte, Saeed Vaseghi:
Combined temporal and spectral multi-resolution phonetic modelling. - Miroslav Novak, Michael Picheny:
Speed improvement of the time-asynchronous acoustic fast match. - John Sirigos, Nikos Fakotakis, George Kokkinakis:
A hybrid ANN/HMM syllable recognition module based on vowel spotting. - Michael L. Shire:
Data-driven modulation filter design under adverse acoustic conditions and using phonetic and syllabic units. - Wei Xu, Jacques Duchateau, Kris Demuynck, Ioannis Dologlou, Patrick Wambacq, Dirk Van Compernolle, Hugo Van hamme:
Accuracy versus complexity in context dependent phone modeling. - Jianlai Zhou, Xiaodong He, Tiecheng Yu, Fuyuan Mo:
A new hybrid structure of speech recognizer based on HMM and neural network. - Geoffrey Zweig, Mukund Padmanabhan:
Dependency modeling with Bayesian networks in a voicemail transcription system.
Dialogue 1
- Hideki Asoh, Toshihiro Matsui, John Fry, Futoshi Asano, Satoru Hayamizu:
A spoken dialog system for a mobile office robot. 1139-1142 - Linda Bell, Joakim Gustafson:
Interaction with an animated agent in a spoken dialogue system. 1143-1146 - Niels Ole Bernsen, Laila Dybkjær, Ulrich Heid:
Current practice in the development and evaluation of spoken language dialogue systems. 1147-1150 - Joakim Gustafson, Nikolaj Lindberg, Magnus Lundeberg:
The August spoken dialogue system. 1151-1154 - Olivier Grisvard, Bertrand Gaiffe:
An event-based dialogue model and its implementation in multidial2. 1155-1158 - Chao Huang, Peng Xu, Xin Zhang, Shubin Zhao, Taiyi Huang, Bo Xu:
LODESTAR: a Mandarin spoken dialogue system for travel information retrieval. 1159-1162 - Munehiko Sasajima, Takehide Yano, Yasuyuki Kono:
EUROPA: a generic framework for developing spoken dialogue systems. 1163-1166 - Mikio Nakano, Kohji Dohsaka, Noboru Miyazaki, Jun-ichi Hirasawa, Masafumi Tamoto, Masahito Kawamori, Akira Sugiyama, Takeshi Kawabata:
Handling rich turn-taking in spoken dialogue systems. 1167-1170 - Hannes Pirker, Georg Loderer, Harald Trost:
Thus spoke the user to the wizard. 1171-1174 - Andrew N. Pargellis, Hong-Kwang Jeff Kuo, Chin-Hui Lee:
Automatic dialogue generator creates user defined applications. 1175-1178 - José Relaño-Gil, Daniel Tapias, Juan Manuel Villar-Navarro, Maria C. Gancedo, Luis A. Hernández Gómez:
Flexible mixed-initiative dialogue for telephone services. 1179-1182 - Gert Veldhuijzen van Zanten:
User modelling in adaptive dialogue management. 1183-1186
Speaker Recognition and Topic Detection
- Gal Ashour, Isak Gath:
Characterization of speech during imitation. 1187-1190 - Evgeny I. Bovbel, Polina P. Tkachova, Igor E. Kheidorov:
The analysis of speaker individual features based on autoregressive hidden Markov models. 1191-1194 - Perrine Delacourt, David Kryze, Christian Wellekens:
Detection of speaker changes in an audio document. 1195-1198 - Axel Glaeser:
Dynamic test durations for text-independent speaker verification systems. 1199-1202 - Guido Kolano, Peter Regel-Brietzmann:
Combination of vector quantization and Gaussian mixture models for speaker verification with sparse training data. 1203-1206 - Qi Li, Augustine Tsai, Weon-Goo Kim:
A language-independent personal voice controller with embedded speaker verification. 1207-1210 - Johan Lindberg, Mats Blomberg:
Vulnerability in speaker verification - a study of technical impostor techniques. 1211-1214 - Jack McLaughlin, Douglas A. Reynolds, Terry P. Gleason:
A study of computation speed-ups of the GMM-UBM speaker recognition system. 1215-1218 - Stéphane H. Maes:
Conversational biometrics. 1219-1222 - Takashi Masuko, Takafumi Hitotsumatsu, Keiichi Tokuda, Takao Kobayashi:
On the security of HMM-based speaker verification systems against imposture using synthetic speech. 1223-1226 - Wojciech Majewski, Grazyna Mazur-Majewska:
Speech signal parametrization for speaker recognition under voice disguise conditions. 1227-1230 - Antonio Satué-Villar, Marcos Faúndez-Zanuy:
On the relevance of language in speaker recognition. 1231-1234 - Yoichi Yamashita:
Prediction of keyword spotting accuracy based on simulation. 1235-1238
Speech Recognition - Search
- María José Castro, David Llorens, Joan-Andreu Sánchez, Francisco Casacuberta, Pablo Aibar, Encarna Segarra:
A fast version of the ATROS system. 1239-1242 - Vaibhava Goel, William Byrne:
Task dependent loss functions in speech recognition: a* search over recognition lattices. 1243-1246 - Václav Hanzl:
Theory of structured cogitation in speech recognition. 1247-1250 - Andrej Ljolje, Fernando Pereira, Michael Riley:
Efficient general lattice generation and rescoring. 1251-1254 - Mingxing Xu, Fang Zheng, Wenhu Wu:
A fast and effective state decoding algorithm. 1255-1258
Systems, Architectures
- Philippe Jeanrenaud, Greg Cockroft, Allard VanderHeidjen:
A multimodal, multilingual telephone application: the Wildfire electronic assistant. 1259-1262 - Els den Os, Hans Jongebloed, Alice Stijsiger, Lou Boves:
Speaker verification as a user-friendly access for the visually impaired. 1263-1266 - Tony Robinson, Dave Abberley, David Kirby, Steve Renals:
Recognition, indexing and retrieval of British broadcast news with the THISL system. 1267-1270 - Stephanie Seneff, Raymond Lau, Joseph Polifroni:
Organization, communication, and control in the GALAXY-II conversational system. 1271-1274 - Claudia Pateras, Nicolas Chapados, Remi Kwan, Dominic Lavoie, Réal Tremblay:
A mixed-initiative natural dialogue system for conference room reservation. 1275-1278
Audio-Visual Speech
- Takaaki Kuratate, Kevin G. Munhall, Philip Rubin, Eric Vatikiotis-Bateson, Hani Yehia:
Audio-visual synthesis of talking faces from speech production correlates. 1279-1282 - John MacDonald, Soren Andersen, Talis Bachmann:
Hearing by eye: visual spatial degradation and the McGurk effect. 1283-1286 - Yoshihiko Nankaku, Keiichi Tokuda, Tadashi Kitamura:
Intensity- and location-normalized training for HMM-based visual speech recognition. 1287-1290 - Gerasimos Potamianos, Alexandros Potamianos:
Speaker adaptation for audio-visual speech recognition. 1291-1294 - Monique Radeau, Cécile Colin:
The role of spatial separation on ventriloquism and McGurk illusions. 1295-1298
Speech Recognition - Acoustic Modelling 2
- María José Castro, Francisco Casacuberta:
Hybrid connectionist-structural acoustical modeling in the ATROS system. - Lionel Delphin-Poulat, Jérôme Idier:
Path-dependent Kalman estimation of a cepstral bias. - Simon Dobrisek, France Mihelic, Nikola Pavesic:
Acoustical modelling of phone transitions: biphones and diphones - what are the differences? - Kris Demuynck, Jacques Duchateau, Dirk Van Compernolle:
Optimal feature sub-space selection based on discriminant analysis. - Mohamed Debyeche, Mohamed Afify, Jean Paul Haton:
Phoneme recognition system based on HMM with distributed VQ codebook. - Xiaodong He, Jian Liu, Jianlai Zhou, Tiecheng Yu:
Research on speech units modeling in continuous speech recognition. - Reinhold Haeb-Umbach, Marco Loog:
An investigation of cepstral parameterisations for large vocabulary speech recognition. - Thomas Hain, Philip C. Woodland:
Dynamic HMM selection for continuous speech recognition. - Li Jiang, Xuedong Huang:
Unified decoding and feature representation for improved speech recognition. - DongHwa Kim, Chaojun Liu, Xintian Wu, Yonghong Yan:
High accuracy acoustic modeling based on multi-stage decision tree. - Jeff Z. Ma, Li Deng:
Optimization of dynamic regimes in a statistical hidden dynamic model for conversational speech recognition. - José B. Mariño, Albino Nogueiras Rodríguez:
Top-down bottom-up hybrid clustering algorithm for acoustic-phonetic modeling of speech. - Atsushi Nakamura, Tomoko Matsui:
Acoustic modeling based on a generalized Laplacian distribution. - Klaus Reinhard, Mahesan Niranjan:
Diphone subspace models for phone-based HMM complementation. - Harald Singer, Atsushi Nakamura:
Unified framework for acoustic topology modelling: ML-SSS and question-based decision trees. - Bernard Doherty, Saeed Vaseghi, Paul M. McCourt:
Linear transformations in sub-band groups for speech recognition. - Peng-Ren Lu, Wei-Tyng Hong, Sheng-Lun Chiang, Yih-Ru Wang, Sin-Horng Chen:
A prototype of Mandarin speech telephone number inquiry system. - Silke M. Witt, Steve J. Young:
Off-line acoustic modelling of non-native accents. - Colin W. Wightman, Ted A. Harder:
Semi-supervised adaptation of acoustic models for large-volume dictation.
Dialogue 2
- Egbert Ammicht, Allen L. Gorin, Tirso Alonso:
Knowledge collection for natural language spoken dialog systems. 1375-1378 - Donna K. Byron:
Improving discourse management in TRIPS-98. 1379-1382 - Chung-Hsien Wu, Gwo-Lang Yan, Chien-Liang Lin:
Speech act modeling in a spoken dialogue system using fuzzy hidden Markov model and Bayes' decision criterion. 1383-1386 - Ute Ehrlich:
Task hierarchies representing sub-dialogs in speech dialog systems. 1387-1390 - Jun-ichi Hirasawa, Mikio Nakano, Takeshi Kawabata, Kiyoaki Aikawa:
Effects of system barge-in responses on user impressions. 1391-1394 - Ramón López-Cózar, Antonio J. Rubio, Pedro García-Teodoro, José C. Segura:
A new word-confidence threshold technique to enhance the performance of spoken dialogue systems. 1395-1398 - Carine-Alexia Lavelle, Martine de Calmès, Guy Perennou:
Confirmation strategies to improve correction rates in a telephonic inquiry dialogue system. 1399-1402 - Yasuhisa Niimi, Takuya Nishimoto:
Mathematical analysis of dialogue control strategies. 1403-1406 - Jana Ocelíková, Václav Matousek:
Processing of anaphoric and elliptic sentences in a spoken dialog system. 1407-1410 - Kishore Papineni, Salim Roukos, Todd Ward:
Free-flow dialog management using forms. 1411-1414 - Klaus Ries:
Towards the detection and description of textual meaning indicators in spontaneous conversations. 1415-1418 - Janienke Sturm, Els den Os, Lou Boves:
Dialogue management in the Dutch ARISE train timetable information system. 1419-1422 - Emiel Krahmer, Marc Swerts, Mariët Theune, Mieke F. Weegels:
Problem spotting in human-machine interaction. 1423-1426 - Bor-Shen Lin, Hsin-Min Wang, Lin-Shan Lee:
Consistent dialogue across concurrent topics based on an expert system model. 1427-1430
Speech Coding
- Thomas M. Chapman, Costas S. Xydeas:
Secondary codebook storage quantisation. 1431-1434 - William H. Edmondson, Dorota J. Iskra, P. Kienzle:
Pseudo-articulatory representations: promise, progress and problems. 1435-1438 - Ge Gao, P. C. Ching:
A 1.7KBPS waveform interpolation speech coder using decomposition of pitch cycle waveform. 1439-1442 - Oded Gottesman, Allen Gersho:
Enhanced analysis-by-synthesis waveform interpolative coding at 4 KBPS. 1443-1446 - Norbert Görtz:
Joint source-channel decoding by channel-coded optimal estimation (CCOE) for a CELP speech codec. 1447-1450 - Chunyan Li, Allen Gersho, Vladimir Cuperman:
Analysis-by-synthesis low-rate multimode harmonic speech coding. 1451-1454 - László Lois:
Variable length coding of transformed LSF coefficients. 1455-1458 - R. Mayrench, David Malah:
Low bit-rate speech coding using quantization of variable length segments. 1459-1462 - Rainer Martin, Hong-Goo Kang, Richard V. Cox:
Low delay analysis/synthesis schemes for joint speech enhancement and low bit rate speech coding. 1463-1466 - Oscar Oliva, Marcos Faúndez-Zanuy:
A comparative study of several ADPCM schemes with linear and nonlinear prediction. 1467-1470 - Hiroshi Ohmura, Kazuyo Tanaka:
Segmental feature extraction and coding for speech synthesis. 1471-1474 - Carmen Peláez-Moreno, Fernando Díaz-de-María:
Backward adaptive RBF-based hybrid predictors for CELP-type coders at medium bit-rates. 1475-1478 - Valentin V. Sercov, Alexander A. Petrovsky:
An improved speech model with allowance for time-varying pitch harmonic amplitudes and frequencies in low bit-rate MBE coders. 1479-1482 - Davor Petrinovic, Davorka Petrinovic:
Sparse vector linear prediction matrices with multidiagonal structure. 1483-1486 - Milos Stefanovic, Ahmet M. Kondoz:
Source-dependent variable rate speech coding below 3 KBPS. 1487-1490 - Xiaoping Chen, Yantao Song, Tiecheng Yu:
A novel speech coding approach based on half-wave vector quantization. 1491-1494 - Parham Zolfaghari, Tony Robinson:
Speech coding using mixture of Gaussians polynomial model. 1495-1498
Speech Recognition - Acoustic Modelling 1
- Li Deng, Jeff Z. Ma:
A statistical coarticulatory model for the hidden vocal-tract-resonance dynamics. 1499-1502 - Dario Albesano, Renato de Mori, Roberto Gemello, Franco Mana:
A study on the effect of adding new dimensions to trajectories in the acoustic space. 1503-1506 - Mark J. F. Gales, Peder A. Olsen:
Tail distribution modelling using the Richter and power exponential distributions. 1507-1510 - Qingwei Zhao, Zuoying Wang, Dajin Lu:
A study of duration in continuous speech recognition based on DDBHMM. 1511-1514 - Tzur Vaich, Arnon Cohen:
Comparison of continuous-density and semi-continuous HMM in isolated words recognition systems. 1515-1518
Dialogue
- Jennifer Chu-Carroll:
Form-based reasoning for mixed-initiative dialogue management in information-query systems. 1519-1522 - Nils Dahlbäck, Arne Jönsson:
Knowledge sources in spoken dialogue systems. 1523-1526 - Els den Os, Lou Boves, Lori Lamel, Paolo Baggia:
Overview of the ARISE project. 1527-1530 - Alexander I. Rudnicky, Eric H. Thayer, Paul C. Constantinides, Chris Tchou, R. Shern, Kevin A. Lenzo, W. Xu, Alice Oh:
Creating natural dialogs in the Carnegie Mellon Communicator system. 1531-1534 - Sophie Rosset, Samir Bennacef, Lori Lamel:
Design strategies for spoken language dialog systems. 1535-1538
Wideband and Perceptually Based Coding
- A. Amodio, Gang Feng:
A wideband speech coder based on harmonic coding at 16KBS. 1539-1542 - Alexis Bernard, Abeer Alwan:
Perceptually based and embedded wideband CELP coding of speech. 1543-1546 - Rudolf Földvári, László Gyimesi:
Very low bit rate voice coder based on a nonlinear hearing model. 1547-1550 - Marcos Perreau Guimaraes, Madeleine Bonnet, Nicolas Moreau:
Low complexity bit allocation algorithm with psychoacoustical optimisation. 1551-1554 - Wanggen Wan, Oscar C. Au, Cyan L. Keung, Chi H. Yim:
A novel approach of low bit-rate speech coding based on sinusoidal representation and auditory model. 1555-1558
Speech Recognition - Language Modelling
- Christel Beaujard, Michèle Jardino:
Language modeling based on automatic word concatenations. - Ciprian Chelba, Frederick Jelinek:
Recognition performance of a structured language model. - Vincent Chow, Dekai Wu:
On the use of right context in sense-disambiguating language models. - Paul Gerard Donnelly, Francis Jack Smith, Elvira I. Sicilia-Garcia, Ji Ming:
Language modelling with hierarchical domains. - Géraldine Damnati:
Integration of several information sources for robust class-based statistical language modelling. - Marcello Federico:
Efficient language model adaptation through MDI estimation. - Arnaud Gaudinat, Jean-Philippe Goldman, Eric Wehrli:
Syntax-based speech recognition: how a syntactic parser can help a recognition system. - Akinori Ito, Masaki Kohda, Mari Ostendorf:
A new metric for stochastic language model evaluation. - Hong-Kwang Jeff Kuo, Wolfgang Reichl:
Phrase-based language models for speech recognition. - Norihiko Kobayashi, Tetsunori Kobayashi:
Class-combined word n-gram for robust language modeling. - Ciro Martins, João Paulo Neto, Luís B. Almeida:
Using partial morphological analysis in language modeling estimation for large vocabulary Portuguese speech recognition. - Uwe Ohler, Stefan Harbeck, Heinrich Niemann:
Discriminative training of language model classifiers. - Shuwu Zhang, Harald Singer, Dekai Wu, Yoshinori Sagisaka:
Improving n-gram modeling using distance-related unit association maximum entropy language modeling.
Prosody - Study of Prosody for Speech Synthesis
- Aimin Chen, Shu Lian Wong, Saeed Vaseghi, Charles Ho:
Decision tree micro-prosody structures for text to speech synthesis. 1615-1618 - Ricardo de Córdoba, José A. Vallejo, Juan Manuel Montero, Juana M. Gutiérrez-Arriola, M. A. López, José Manuel Pardo:
Automatic modeling of duration in a Spanish text-to-speech system using neural networks. 1619-1622 - Robert A. J. Clark, Kurt E. Dusterhoff:
Objective methods for evaluating synthetic intonation. 1623-1626 - Kurt E. Dusterhoff, Alan W. Black, Paul Taylor:
Using decision trees within the tilt intonation model to predict F0 contours. 1627-1630 - Richard Esposito, Li-chiung Yang:
Levels of prosodic representation in spoken discourse: an empirical approach. 1631-1634 - Xavier Fernández Salgado, Eduardo Rodríguez Banga:
Segmental duration modelling in a text-to-speech system for the Galician language. 1635-1638 - Daniel Hirst:
The symbolic coding of segmental duration and tonal alignment: an extension to the INTSINT system. 1639-1642 - Yann Morlec, Gérard Bailly, Véronique Aubergé:
Training an application-dependent prosodic model corpus, model and evaluation. 1643-1646 - Hamid Sheikhzadeh, A. Eshkevari, M. Khayatian, Mohammad Reza Sadigh, Seyed Mohammad Ahadi:
Farsi language prosodic structure, research and implementation using a speech synthesizer. 1647-1650 - João Paulo Ramos Teixeira, Elisabete Rosa Paulo, Diamantino Freitas, Maria da Graca Pinto:
Acoustical characterisation of the accented syllable in Portuguese, a contribution to the naturalness of speech synthesis. 1651-1654 - Changfu Wang, Hiroya Fujisaki, Sumio Ohno, Tomohiro Kodama:
Analysis and synthesis of the four tones in connected speech of the standard Chinese based on a command-response model. 1655-1658 - Sandra Williams, Catherine I. Watson:
A profile of the discourse and intonational structures of route descriptions. 1659-1662
Speech Perception 1
- Shigeaki Amano, Tadahisa Kondo:
Neighborhood effects on spoken word recognition in Japanese. 1663-1666 - C. Chéreau, Pierre A. Hallé, Juan Segui:
Interference between surface form and abstract representation in spoken word perception. 1667-1670 - Cécile Colin, Monique Radeau:
Are the mcgurk illusions affected by left or right presentation of the speaker face? 1671-1674 - Emmanuel Dupoux, Takao Fushimi, Kazuhiko Kakehi, Jacques Mehler:
Prelexical locus of an illusory vowel effect in Japanese. 1675-1678 - Sergio Feijóo, Santiago Fernández, Nieves Barros, Ramón Balsa:
Acoustic and perceptual characteristics of the Spanish fricatives. 1679-1686 - Fredrik Karlsson, Anders Eriksson:
Difference limen for formant frequency discrimination at high fundamental frequencies. 1687-1690 - Eduardo Sá Marta, Luís Vieira de Sá:
Auditory features for human communication of stop consonants under full-band and low-pass conditions. 1691-1694 - Christina Widera, Thomas Portele:
Levels of reduction for German tense vowels. 1695-1698
Speech Recognition - Acoustic Modelling 2
- Peter V. de Souza, Bhuvana Ramabhadran, Yuqing Gao, Michael Picheny:
Enhanced likelihood computation using regression. 1699-1702 - Chaojun Liu, Xintian Wu, Yonghong Yan:
High accuracy acoustic modeling using two-level decision-tree based state-tying. 1703-1706 - Rita Singh, Bhiksha Raj, Richard M. Stern:
Domain adduced state tying for cross-domain acoustic modelling. 1707-1710 - Ananth Sankar, Venkata Ramana Rao Gadde:
Parameter tying and gaussian clustering for faster, better, and smaller speech recognition. 1711-1714 - Ralf Schlüter, Wolfgang Macherey, Boris Müller, Hermann Ney:
A combined maximum mutual information and maximum likelihood approach for mixture density splitting. 1715-1718
Multimodal Interaction
- Luc E. Julia, Adam Cheyer:
Is talking to virtual more realistic? 1719-1722 - Yosuke Matsusaka, Tsuyoshi Tojo, Sentaro Kubota, Kenji Furukawa, Daisuke Tamiya, Keisuke Hayata, Yuichiro Nakano, Tetsunori Kobayashi:
Multi-person conversation via multi-modal interface - a robot who communicate with multi-user -. 1723-1726 - Shrikanth S. Narayanan, Alexandros Potamianos, Haohong Wang:
Multimodal systems for children: building a prototype. 1727-1730 - Michio Okada, Noriko Suzuki, Masaaki Date:
Social bonding in talking with social autonomous creatures. 1731-1734 - A. Purson, Serge Santi, Roxane Bertrand, Isabelle Guaïtella, J. Boyer, Christian Cavé:
The relationships between voice and gesture: eyebrow movements and questioning. 1735-1738
Joint Source-Channel Coding
- Wen-Whei Chang, Heng-Iang Hsu, De-Yu Wang:
Robust vector quantization for channels with memory. 1739-1742 - Balázs Kövesi, Claude Lamblin, Catherine Quinquis, Philippe Thiérion, William Navarro:
A multi-rate codec family based on GSM EFR and ITU-T G.729. 1743-1746 - Jeng-Shyang Pan, Chin-Shiuh Shieh, T. F. Chiang:
A novel channel distortion measure for vector quantization and a fuzzy model for codebook index assignment. 1747-1750 - C. Sriratanaban, Ahmet M. Kondoz:
A full-rate GSM-AMR candidate. 1751-1754 - Stephane Villette, Milos Stefanovic, Ahmet M. Kondoz:
A multi-rate speech and channel codec: a GSM AMR half-rate candidate. 1755-1758
Speech Recognition - Language Modelling
- Gilles Adda, Michèle Jardino, Jean-Luc Gauvain:
Language modeling for broadcast news transcription. 1759-1762 - Frédéric Béchet, Alexis Nasr, Thierry Spriet, Renato de Mori:
Large Span statistical language models: application to homophone disambiguation for large vocabulary speech recognition in French. 1763-1766 - Paolo Baggia, Andreas Kellner, Guy Perennou, Cosmin Popovici, Janienke Sturm, Frank Wessel:
Language modelling and spoken dialogue systems - the ARISE experience. 1767-1770 - Laure Brieussel-Pousse, Guy Perennou:
Language model level vs. lexical level for modeling pronunciation variation in a French CSR. 1771-1774 - Roger Ho-Yin Leung, Chi-Yan Choy, Hong C. Leung:
Characteristics of Chinese language models for large vocabulary telephone speech. 1775-1778 - David Langlois, Kamel Smaïli:
A new based distance language model for a dictation machine: application to MAUD. 1779-1782 - Ludek Müller, Josef Psutka:
Using various language model smoothing techniques for the transcription of a weather forecast broadcasted by the czech radio. 1783-1786 - Don McAllaster, Larry Gillick:
Studies in acoustic training and language modeling using simulated speech data. 1787-1790 - Wolfgang Reichl:
Language model adaptation using minimum discrimination information. 1791-1794 - Kamel Smaïli, Armelle Brun, Imed Zitouni, Jean Paul Haton:
Automatic and manual clustering for large vocabulary speech recognition: a comparative study. 1795-1798 - Joan-Andreu Sánchez, José-Miguel Benedí:
Learning of stochastic context-free grammars by means of estimation algorithms. 1799-1802 - Hirofumi Yamamoto, Yoshinori Sagisaka:
Part-of-speech n-gram and word n-gram fused language model. 1803-1806 - Xiaojin Zhu, Stanley F. Chen, Ronald Rosenfeld:
Linguistic features for whole sentence maximum entropy language models. 1807-1810 - Imed Zitouni, Jean-François Mari, Kamel Smaïli, Jean Paul Haton:
Variable-length sequence language model for large vocabulary continuous dictation machine. 1811-1814 - Ruiqiang Zhang, Ezra Black, Andrew M. Finch:
Using detailed linguistic structure in language modelling. 1815-1818
Speech Generation and Synthesis - Prosody
- Ivan Bulyko, Mari Ostendorf:
Predicting gradient F0 variation: pitch range and accent prominence. 1819-1822 - Paul Deans, Andrew P. Breen, Peter Jackson:
CART-based duration modeling using a novel method of extracting prosodic features. 1823-1826 - Ki-Wan Eom, Jin-Young Kim, Sun-Mi Kim:
A primary study on the randomness control of the prosodic boundary index for natural synthetic speech. 1827-1830 - Attila Ferencz, István Nagy, Tunde-Csilla Kovács, Teodora Ratiu, Maria Ferencz:
On a hybrid time domain-LPC technique for prosody superimposing used for speech synthesis. 1831-1834 - Justin Fackrell, Halewijn Vereecken, Jean-Pierre Martens, Bert Van Coile:
Multilingual prosody modelling using cascades of regression trees and neural networks. 1835-1838 - Wentao Gu, Chilin Shih, Jan P. H. van Santen:
An efficient speaker adaptation method for TTS duration model. 1839-1842 - David House, Linda Bell, Kjell Gustafson, Linn Johansson:
Child-directed speech synthesis: evaluation of prosodic variation for an educational computer program. 1843-1846 - Mark A. Huckvale:
Representation and processing of linguistic structures for an all-prosodic synthesis system using XML. 1847-1850 - Won Park, Hyung-Bin Park, Myung-Jin Bae:
A study on a pitch alteration by using the formant and phase compensation technique. 1851-1854 - Tan Lee, Helen M. Meng, Wai H. Lau, Wai Kit Lo, P. C. Ching:
Micro-prosodic control in cantonese text-to-speech synthesis. 1855-1858 - Hansjörg Mixdorff, Dieter Mehnert:
Exploring the naturalness of several German high-quality-text-to-speech systems. 1859-1862 - Atsuhiro Sakurai, Hiromichi Kawanami, Keikichi Hirose:
Detecting accent sandhi in Japanese using a superpositional F0 model. 1863-1866 - Satoshi Kitagawa, Nick Campbell:
Focus detection by comparison of speech waveforms. 1867-1870 - Mark Tatham, Eric Lewis, Katherine Morton:
An advanced intonation model for synthesis. 1871-1874 - Satoshi Takano, Masanobu Abe:
A new F0 modification algorithm by manipulating harmonics of magnitude spectrum. 1875-1878
Speech Perception 2
- William A. Ainsworth:
Perception of overlapping syllables. 1883-1886 - Loredana Cerrato, Andrea Paoloni:
Are transcriptions of speech material recorded by means of bugs reliable? 1887-1890 - Vlasta Erdeljac, Damir Horga:
Influence of morphology on phoneme identification in spoken croatian. 1891-1894 - James J. Hant, Abeer Alwan:
Modeling the masking of formant transitions in noise. 1895-1898 - Toshio Irino, Roy D. Patterson:
Stabilised wavelet mellin transform: an auditory strategy for normalising sound-source size. 1899-1902 - Zdena Palková, Jitka Janíková:
Unintended preferences in the perceptive evaluation of rhythmical units in czech. 1903-1906 - Christophe Pallier, Núria Sebastián-Gallés, Angels Colomé:
Phonological representations and repetition priming. 1907-1910 - Klára Vicsi, Ferenc Csatári, Zsolt Bakcsi, Andras Tantos:
Distance score evaluation of the visualised speech spectra at audio-visual articulation training. 1911-1914 - David A. van Leeuwen, Michael de Louwere:
Objective and subjective evaluation of the acoustic models of a continuous speech recognition system. 1915-1918 - Sandra P. Whiteside, Rosemary A. Varley:
Verbo-motor priming in the phonetic encoding of real and non-words. 1919-1922
Speech Recognition - Language Modelling 1
- Langzhou Chen, Taiyi Huang:
An improved MAP method for language model adaptation. 1923-1926 - Philip Clarkson, Tony Robinson:
Towards improved language model evaluation measures. 1927-1930 - Taiyi Huang, Langzhou Chen:
A novel language model based on self-organized learning. 1931-1934 - Ute Kilian, Fritz Class:
Combining syntactical and statistical language constraints in context-dependent language models for interactive speech applications. 1935-1938 - Sven C. Martin, Christoph Hamacher, Jörg Liermann, Frank Wessel, Hermann Ney:
Assessment of smoothing methods and complex stochastic language modeling. 1939-1942
Speech and Noise
- Rolf Bippus, Alexander Fischer, Volker Stahl:
Domain adaptation for robust automatic speech recognition in car environments. 1943-1946 - Jun Huang, Yunxin Zhao, Stephen E. Levinson:
A DCT-based fast enhancement technique for robust speech recognition in automobile usage. 1947-1950 - Kris Hermus, Ioannis Dologlou, Patrick Wambacq, Dirk Van Compernolle:
Fully adaptive SVD-based noise removal for robust speech recognition. 1951-1954 - Martin Westphal, Alex Waibel:
Towards spontaneous speech recognition for on-board car navigation and information systems. 1955-1958 - Subrata K. Das, David M. Lubensky, Cheng Wu:
Towards robust speech recognition in the telephony network environment - cellular and landline conditions. 1959-1962
Text-Dependent Speaker Verification
- Frédéric Bimbot, Mats Blomberg, Lou Boves, Gérard Chollet, Cédric Jaboulet, Bruno Jacob, Jamal Kharroubi, Johan Koolwaaij, Johan Lindberg, Johnny Mariéthoz, Chafic Mokbel, Houda Mokbel:
An overview of the PICASSO project research activities in speaker verification for telephone applications. 1963-1966 - D. Charlet:
Integrating time-alignment information into the decision making for text-dependent HMM-based speaker verification. 1967-1970 - Dominique Genoud, Gérard Chollet:
Deliberate imposture: a challenge for automatic speaker verification systems. 1971-1974 - Håkan Melin, Johan Lindberg:
Variance flooring, scaling and tying for text-dependent speaker verification. 1975-1978 - Johnny Mariéthoz, Dominique Genoud, Frédéric Bimbot, Chafic Mokbel:
Client / world model synchronous alignment for speaker verification. 1979-1982
Speech Understanding - Miscellaneous Topics
- Manuela Boros, Paul Heisterkamp:
Linguistic phrase spotting in a simple application spoken dialogue system. 1983-1986 - Frank Deinzer, Julia Fischer, U. Ahlrichs, Elmar Nöth:
Learning of domain dependent knowledge in semantic networks. 1987-1990 - Dilek Hakkani-Tür, Gökhan Tür, Andreas Stolcke, Elizabeth Shriberg:
Combining words and prosody for information extraction from speech. 1991-1994 - Kai Ishikawa, Eiichiro Sumita:
Error correction translation using text corpora. 1995-1998 - Susanne Kronenberg, K. Skuplik:
Efficient sentence disambiguation by preferred constituent order. 1999-2002 - Yue-Shi Lee, Hsin-Hsi Chen:
Identifying linguistic segmentations in Chinese spoken dialogue. 2003-2006 - Tung-Hui Chiang, Yi-Chung Lin:
Error recovery for robust language understanding in spoken dialogue systems. 2007-2010 - Xiaohuo Liu, Pascale Fung, Chi Shun Cheung:
A monolingual semantic decoder based on word sense disambiguation for mixed language understanding. 2011-2014 - Helen M. Meng, Wai Lam, Carmen Wai:
To believe is to understand. 2015-2018 - Elmar Nöth, Jürgen Haas, Volker Warnke, Florian Gallwitz, Manuela Boros:
A hybrid approach to spoken dialogue understanding: prosody, statistics and partial parsing. 2019-2022 - Yasunari Obuchi, Atsuko Koizumi, Yoshinori Kitahara, Jun-Ichi Matsuda, Toshihisa Tsukada:
Portable speech interpreter which has voice input and sophisticated correction functions. 2023-2026 - Alexandros Potamianos, Giuseppe Riccardi, Shrikanth S. Narayanan:
Categorical understanding using statistical ngram models. 2027-2030 - Jörg Spilker, Hans Weber, Günther Görz:
Detection and correction of speech repairs in word lattices. 2031-2034 - Igor Schadle, Jean-Yves Antoine, Daniel Memmi:
Connectionist language models for speech understanding: the problem of word order variation. 2035-2038 - Kai-Chung Siu, Helen M. Meng:
Semi-automatic acquisition of domain-specific semantic structures. 2039-2042 - Toshiyuki Takezawa:
Transformation into language processing units by dividing and connecting utterance units. 2043-2046 - Aboy Wong, Dekai Wu:
Learning a lightweight robust deterministic parser. 2047-2050 - Dekai Wu, Zhifang Sui, Jun Zhao:
An information-based method for selecting feature types for word prediction. 2051-2054 - Ye-Yi Wang:
A robust parser for spoken language understanding. 2055-2058
Speech Generation and Synthesis - Systems, Linguistic Processing
- Plínio Almeida Barbosa, Fábio Violaro, Eleonora Cavalcante Albano, Flávio Simoes, Patrícia Aparecida Aquino, Sandra Madureira, Edson Françozo:
Aiuruete: a high-quality concatenative text-to-speech system for brazilian portuguese with demisyllabic analysis-based units and a hierarchical model of rhythm production. 2059-2062 - Dragos Burileanu, Claudius Dan, Mihai Sima, Corneliu Burileanu:
A parser-based text preprocessor for romanian language TTS synthesis. 2063-2066 - Alice Carlberger:
Nparse - a shallow n-gram-based grammatical-phrase parser. 2067-2070 - Evangelos Dermatas, George Kokkinakis:
A language-independent probabilistic model for automatic conversion between graphemic and phonemic transcription of words. 2071-2074 - Jerneja Gros, France Mihelic:
Acquisition of an extensive rule set for slovene grapheme-to-allophone transcription. 2075-2078 - Ching-Hsiang Ho, Saeed Vaseghi, Aimin Chen:
Voice conversion between UK and US accented English. 2079-2082 - Hideyuki Mizuno, Masanobu Abe, Shin'ya Nakajima:
Development of speech design tool "SESIGN99" to enhance synthesized speech. 2083-2086 - Horst-Udo Hain:
Automation of the training procedures for neural networks performing multi-lingual grapheme to phoneme conversion. 2087-2090 - Ilona Koutny:
Parsing hungarian sentences in order to determine their prosodic structures in a multilingual TTS system. 2091-2094 - Meelis Mihkla, Arvo Eek, Einar Meister:
Text-to-speech synthesis of estonian. 2095-2098 - Juan Manuel Montero, Juana M. Gutiérrez-Arriola, José Colás, Javier Macías Guarasa, Emilia Enríquez, José Manuel Pardo:
Development of an emotional speech synthesiser in Spanish. 2099-2102 - Nikola Pavesic, Jerneja Gros:
S5: the SQEL slovene speech synthesis system. 2103-2106 - Matej Rojc, Janez Stergar, Ralph Wilhelm, Horst-Udo Hain, Martin Holzapfel, Bogomir Horvat:
A multilingual text processing engine for the PAPAGENO text-to-speech synthesis system. 2107-2110 - Chang K. Suh, Takehiko Kagoshima, Masahiro Morita, Shigenobu Seto, Masami Akamine:
Toshiba English text-to-speech synthesizer (TESS). 2111-2114 - Frédérique Sannier, Véronique Aubergé:
Towards the generation of French phonetic inflected forms. 2115-2118 - Evelyne Tzoukermann, Lucie Ménard, Marise Ouellet:
Canadian French text-to-speech synthesis: modeling an optimal set of realizations for dialect markers. 2119-2122 - Bertjan Busser, Walter Daelemans, Antal van den Bosch:
Machine learning of word pronunciation: the case against abstraction. 2123-2126
Speech & the Internet
- Mark Ordowski, Neeraj Deshmukh, Aravind Ganapathiraju, Jonathan Hamaker, Joseph Picone:
A public domain speech-to-text system. - Klaus Fellbaum, Joerg Richter:
Human speech production - an internet-based interactive multimodal tutorial. - Dafydd Gibbon, Silke Kölsch, Inge Mertins, Michaela Schulte, Thorsten Trippel:
Terminology principles and support for spoken language system development. - Yasuo Horiuchi, Fujiwara Atsushi, Akira Ichikawa:
New WWW browser for visually impaired people using interactive voice technology. - Jirí Hanika, Petr Horák:
Text to speech control protocol. - Bojan Petek:
Multilinguality and human language technology courseware. - José Rouillard, Jean Caelen:
Multimodal information seeking dialogues on the world wide web. - Roger C. F. Tucker, Tony Robinson, James Christie:
Compression of acoustic features - are perceptual quality and recognition performance incompatible goals? - Dominique Vaufreydaz, José Rouillard, Mohammad Akbar:
A network architecture for building applications that use speech recognition and/or synthesis.
Speech Recognition - Language Modelling 2
- Jerome R. Bellegarda:
Context scope selection in multi-Span statistical language modeling. 2163-2166 - Daniel Gildea, Thomas Hofmann:
Topic-based language models using EM. 2167-2170 - Lucian Galescu, Eric K. Ringger:
Augmenting words with linguistic information for n-gram language models. 2171-2174 - Alexis Nasr, Yannick Estève, Frédéric Béchet, Thierry Spriet, Renato de Mori:
A language model combining n-grams and stochastic finite state automata. 2175-2178 - Jun Wu, Sanjeev Khudanpur:
Combining nonlocal, syntactic and n-gram dependencies in language modeling. 2179-2182
Speech Signal Processing
- Imre Kiss, Pekka Kapanen:
Robust feature vector compression algorithm for distributed speech recognition. 2183-2186 - Matti Karjalainen, Tero Tolonen:
Separation of speech signals using iterative multi-pitch analysis and prediction. 2187-2190 - Eluned S. Parris, Michael J. Carey, Harvey Lloyd-Thomas:
Feature fusion for music detection. 2191-2194 - Sarel van Vuuren, Hynek Hermansky:
Speech variability in the modulation spectral domain - SANOVA technique -. 2195-2198 - Dekun Yang, Georg F. Meyer, William A. Ainsworth:
Improving harmonic selection for speech intelligibility enhancement by the reassignment method. 2199-2202
Text-Independent Speaker Verification and Tracking
- Homayoon S. M. Beigi, Stéphane H. Maes, Upendra V. Chaudhari, Jeffrey S. Sorensen:
A hierarchical approach to large-scale speaker recognition. 2203-2206 - Jan Cernocký, Dijana Petrovska-Delacrétaz, Stéphane Pigeon, Patrick Verlinde, Gérard Chollet:
A segmental approach to text-independent speaker verification. 2207-2210 - Sue E. Johnson:
Who spoke when? - automatic segmentation and clustering for determining speaker turns. 2211-2214 - Mark A. Przybocki, Alvin F. Martin:
The 1999 NIST speaker recognition evaluation, using summed two-channel telephone data for speaker detection and speaker tracking. 2215-2218 - M. Kemal Sönmez, Larry P. Heck, Mitchel Weintraub:
Speaker tracking and detection with multiple speakers. 2219-2222
Corpora
- Demetrio Aiello, Loredana Cerrato, Cristina Delogu, Andrea Di Carlo:
The acquisition of a speech corpus for limited domain translation. - Yue-Shi Lee, Hsin-Hsi Chen:
Tagging spoken corpus. - Ferenc Csatári, Zsolt Bakcsi, Klára Vicsi:
A hungarian child database for speech processing applications. - Julie Carson-Berndsen:
A generic lexicon tool for word model definition in multimodal applications. - Steve Cassidy:
Compiling multi-tiered speech databases into the relational model: experiments with the emu system. - Kjell Elenius:
Two Swedish Speechdat databases - some experiences and results. - Satoru Hayamizu, Shigeki Nagaya, Keiko Watanuki, Masayuki Nakazawa, Shuichi Nobe, Takashi Yoshimura:
A multimodal database of gestures and speech. - Tomoko Matsui, Masaki Naito, Harald Singer, Atsushi Nakamura, Yoshinori Sagisaka:
Japanese spontaneous speech database with wide regional and age distribution. - Satoshi Nakamura, Kazuo Hiyane, Futoshi Asano, Takeshi Yamada, Takashi Endo:
Data collection in real acoustical environments for sound scene understanding and hands-free speech recognition. - Hiroaki Noguchi, Kazuhisa Kiriyama, Hiroshi Matsuda, Miki Taniguchi, Yasuharu Den, Yasuhiro Katagiri:
Automatic labeling of Japanese prosody using j-toBI style description. - Petr Pollák, Josef Vopička, Pavel Sovka:
Czech language database of car speech and environmental noise. - Akira Kurematsu, Atsushi Sukenori:
Language model selection based on the analysis of Japanese spontaneous speech on travel arrangement task. - Florian Schiel, Christoph Draxler, Phil Hoole, Hans G. Tillmann:
New resources at BAS: acoustic, multimodal, linguistic. - Eric Sanders, Henk van den Heuvel, Khalid Choukri:
Building speech databases for cellular networks. - Henk van den Heuvel, Jérôme Boudy, Robrecht Comeyne, Stephan Euler, Asunción Moreno, Gaël Richard:
The speechdat-car multilingual speech databases for in-car applications: some first validation results. - Briony Williams:
A welsh speech database: preliminary results.
Speech Generation and Synthesis - Acoustic Synthesis and Units
- Oscar C. Au, Wanggen Wan, Cyan L. Keung, Chi H. Yim:
Sinusoidal representation and auditory model-based parametric matching and smoothing and its application in speech analysis/synthesis. 2287-2290 - Marcello Balestri, Alberto Pacchiotti, Silvia Quazza, Pier Luigi Salza, Stefano Sandri:
Choose the best to modify the least: a new generation concatenative synthesis system. 2291-2294 - Fu-Chiang Chou, Chiu-yu Tseng, Lin-Shan Lee:
Selection of waveform units for corpus-based Mandarin speech synthesis based on decision trees and prosodic modification costs. 2295-2298 - Borja Etxebarria, Inmaculada Hernáez, I. Madariaga, Eva Navas, J. C. Rodríguez, R. Gándara:
Improving quality in a speech synthesizer based on the MBROLA algorithm. 2299-2302 - Yan Huang, Bo Xu:
A novel model TD-PSPTP for speech synthesis. 2303-2306 - David A. Kapilow, Yannis Stylianou, Juergen Schroeter:
Detection of non-stationarity in speech signals and its application to time-scaling. 2307-2310 - Takao Koyama, Jun-ichi Takahashi:
A v-CV waveform based speech synthesis using global minimization of pitch conversion and concatenation distortion in v-CV unit sequence. 2311-2314 - Iain Mann, Steve McLaughlin:
Stable speech synthesis using recurrent radial basis functions. 2315-2318 - Yoram Meron, Keikichi Hirose:
Efficient weight training for selection based synthesis. 2319-2322 - Jindrich Matousek:
Speech synthesis using HMM-based acoustic unit inventory. 2323-2326 - Michael W. Macon, Mark A. Clements:
An enhanced ABS/OLA sinusoidal model for waveform synthesis in TTS. 2327-2330 - Marise Ouellet, Evelyne Tzoukermann, Lucie Ménard:
High vowel /i y u/ in canadian and continental French: an analysis for a TTS system. 2331-2334 - Zbynek Tychtl, Josef Psutka:
Speech production based on the mel-frequency cepstral coefficients. 2335-2338 - Erhard Rank:
Exploiting improved parameter smoothing within a hybrid concatenative/LPC speech synthesizer. 2339-2342 - Yannis Stylianou:
Synchronization of speech frames based on phase data with application to concatenative speech synthesis. 2343-2346 - Takayoshi Yoshimura, Keiichi Tokuda, Takashi Masuko, Takao Kobayashi, Tadashi Kitamura:
Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis. 2347-2350
Speech and Noise 1
- Hervé Glotin, Frédéric Berthommier, Emmanuel Tessier:
A CASA-labelling model using the localisation cue for robust cocktail-party speech recognition. 2351-2354 - Aruna Bayya, B. Yegnanarayana:
Noise-invariant representation for speech signals. 2355-2358 - Khaled El-Maleh, Peter Kabal:
Natural-quality background noise coding using residual substitution. 2359-2362 - Julián Fernández, Eduardo Lleida, Enrique Masgrau:
Microphone array design for robust speech acquisition and recognition. 2363-2366 - Gwénaël Guilmin, Régine Le Bouquin-Jeannès, Philippe Gournay:
Study of the influence of noise pre-processing on the performance of a low bit rate parametric speech coder. 2367-2370 - Hemmo Haverinen, Petri Salmela, Juha Häkkinen, Mikko Lehtokangas, Jukka Saarinen:
MLP network for enhancement of noisy MFCC vectors. 2371-2374 - Juha Iso-Sipilä, Kari Laurila, Ramalingam Hariharan, Olli Viikki:
Hands-free voice activation in noisy car environment. 2375-2378 - Lamia Karray, Emmanuel Polard:
A wavelet denoising technique to improve endpoint detection in adverse conditions. 2379-2382 - Marcin Kuropatwinski, Dieter Leckschat, Kristian Kroschel, Andrzej Czyzewski, Chaz Hales:
Speech enhancement for linear-predictive-analysis-by-synthesis coders. 2383-2386 - Hiroshi Matsumoto, Hiroaki Ubukata:
Robust HMM to variation of noisy environments based on variance extension of noise models. 2387-2390 - Elias Nemer, Rafik A. Goubran, Samy Mahmoud:
The fourth-order cumulant of speech signals with application to voice activity detection. 2391-2394 - Woei-Chyang Shieh, Sen-Chia Chang:
The dependence of feature vectors under adverse noise. 2395-2398 - Jürgen Tchorz, Birger Kollmeier:
Speech detection and SNR prediction basing on amplitude modulation pattern recognition. 2399-2402 - Luis Vicente, Stephen J. Elliott, Enrique Masgrau:
Fast active noise control for robust speech acquisition. 2403-2406 - Ascension Vizinho, Phil D. Green, Martin Cooke, Ljubomir Josifovski:
Missing data theory, spectral subtraction and signal-to-noise estimation for robust ASR: an integrated study. 2407-2410 - Rolf Vetter, Nathalie Virag, Philippe Renevey, Jean-Marc Vesin:
Single channel speech enhancement using principal component analysis and MDL subspace selection. 2411-2414
Speech Translation
- Sergio Barrachina, Juan Miguel Vilar:
Automatically deriving categories for translation. 2415-2418 - Anna Corazza:
An inter-domain portable approach to interchange format construction. 2419-2422 - Gustavo A. Casañ, Maria Asunción Castaño:
Distributed representation of vocabularies in the RECONTRA neural translator. 2423-2426 - Norbert Reithinger:
Robust information extraction in a speech translation system. 2427-2430 - Fumiaki Sugaya, Toshiyuki Takezawa, Akio Yokoo, Seiichi Yamamoto:
End-to-end evaluation in ATR-MATRIX: speech translation system between English and Japanese. 2431-2434
Topic Detection and Tracking
- Satya Dharanipragada, Martin Franz, J. Scott McCarley, Salim Roukos, Todd Ward:
Story segmentation and topic detection for recognized speech. 2435-2438 - Hubert Jin, Richard M. Schwartz, Sreenivasa Sista, Frederick Walls:
Topic tracking for radio, TV broadcast, and newswire. 2439-2442 - Stephen A. Lowe:
The beta-binomial mixture model for word frequencies in documents with applications to information retrieval. 2443-2446 - Masayuki Nakazawa, Jianxin Zhang, Ryuichi Oka:
Topic spotting and its description of summary from spontaneous speech. 2447-2450 - Frederick Walls, Hubert Jin, Sreenivasa Sista, Richard M. Schwartz:
Topic detection in broadcast news. 2451-2454
Speech & the Internet
- Chris Bowerman, Anders Eriksson, Mark A. Huckvale, Mike Rosner, Mark Tatham, Maria Wolters:
Criteria for evaluating internet tutorials in speech communication sciences. 2455-2458 - Andrzej Drygajlo, Guy Delafontaine:
Javaspeechlab - interactive speech analysis laboratory on the world-wide web. 2459-2462 - Vassilios Digalakis, Stavros Tsakalidis, Leonardo Neumeyer:
Reviving discrete HMMs: the myth about the superiority of continuous HMMs. 2463-2466 - Hiroya Fujisaki, Hiroyuki Kameda, Sumio Ohno, Kenji Abe, Michio Iijima, Masayoshi Suzuki, Kazunari Taketa:
Principles and design of an intelligent system for information retrieval over the internet with a multimodal dialogue interface. 2467-2470 - Takuya Nishimoto, Hidehiro Yuki, Takehiko Kawahara, Yasuhisa Niimi:
An asynchronous virtual meeting system for bi-directional speech dialog. 2471-2474
Speech Recognition - Adaptation
- Antonis Botinis, Marios Fourakis, Irini Prinou:
Prosodic effects on segmental durations in greek. 2475-2478 - Mats Blomberg:
Within-utterance correlation for speech recognition. 2479-2482 - Philippe Gelin, Jean-Claude Junqua:
Techniques for robust speech recognition in the car environment. 2483-2486 - Diego Giuliani:
An on-line acoustic compensation technique for robust speech recognition. 2487-2490 - Wei-Wen Hung, Hsiao-Chuan Wang:
Using adaptive signal limiter together with noise-robust techniques for noisy speech recognition. 2491-2494 - Wei-Tyng Hong, Sin-Horng Chen:
A robust environment-effects suppression training algorithm for adverse Mandarin speech recognition. 2495-2498 - Mikko Harju, Petri Salmela, Olli Viikki, Mikko Lehtokangas, Jukka Saarinen:
Robust speaker adaptation of continuous density HMMS using multilayer perceptron network. 2499-2502 - Chengrong Li, Jingdong Chen, Bo Xu:
Regression class selection and speaker adaptation with MLLR in Mandarin continuous speech recognition. 2503-2506 - Guoqiang Li, Limin Du, Ziqiang Hou:
Regression transformation of prior means for speaker adaptation. 2507-2510 - Liu Feng, Chi-wei Che, Peng Yu, Zuoying Wang:
Linguistic tree based maximum likelihood model interpolation. 2511-2514 - Masaki Naito, Li Deng, Yoshinori Sagisaka:
Model-based speaker normalization methods for speech recognition. 2515-2518 - Patrick Nguyen, Christian Wellekens, Jean-Claude Junqua:
Maximum likelihood eigenspace and MLLR for speech recognition in noisy environments. 2519-2522 - Yoshio Ono, Maki Yamada, Masakatsu Hoshimi:
A study of speaker adaptation for speaker independent speech recognition method using phoneme similarity vector. 2523-2526 - Luís Felipe Uebel, Philip C. Woodland:
An investigation into vocal tract length normalisation. 2527-2530 - Zong Suk Yuk, James L. Flanagan, Mahesh Krishnamoorthy, Krishna Dayanidhi:
Adaptation to environment and speaker using maximum likelihood neural networks. 2531-2534 - Xiuyang Yu, Wayne H. Ward:
Corrective training for speaker adaptation. 2535-2538 - Radovan Obradovic, Darko Pekar, Srdjan Krco, Vlado Delic, Vojin Senk:
A robust speaker-independent CPU-based ASR system. 2881-2884
Enhancements, Echo Cancellation, and Quality Measures
- Rabih Abouchakra, Peter Kabal:
Delay estimation for transform domain acoustical echo cancellation. 2539-2542 - Christophe Beaugeant, Pascal Scalart:
Noise reduction using perceptual spectral change. 2543-2546 - Amir Hussain, Douglas R. Campbell:
Intelligibility improvements using diverse sub-band processing applied to noisy speech. 2547-2550 - Athanasios Koutras, Evangelos Dermatas, George Kokkinakis:
Recognizing simultaneous speech: a genetic algorithm approach. 2551-2554 - Krzysztof Bielawski, Alexander A. Petrovsky:
Speech enhancement system for hands-free telephone based on the psychoacoustically motivated filter bank with allpass frequency transformation. 2555-2558 - Paul W. Shields, Douglas R. Campbell:
Speech enhancement using a multi-microphone sub-band adaptive griffiths-jim noise canceller. 2559-2562 - Máté Szarvas, Tibor Fegyó, Péter Tatai, Géza Gordos:
Qualiphone-a: a perceptual speech quality evaluation system for analog mobile networks. 2563-2566 - Hiroshi Saruwatari, Shoji Kajita, Kazuya Takeda, Fumitada Itakura:
Speech enhancement using nonlinear microphone array under nonstationary noise conditions. 2567-2570 - Ruhi Sarikaya, John H. L. Hansen:
Auditory masking threshold estimation for broadband noise sources with application to speech enhancement. 2571-2574 - Masashi Unoki, Masato Akagi:
Segregation of vowel in background noise using the model of segregating two acoustic sources based on auditory scene analysis. 2575-2578 - Christophe Veaux, Pascal Scalart, André Gilloire:
Analysis and on-line detection of audible distortions in GSM telephony. 2579-2582 - Wen Rong Ru, Shih-Chen Lin, Po-Cheng Chen, Chun-Hung Kuo:
A parameter-based 2-talker detection apparatus for echo cancellation. 2583-2586 - Kuan-Chieh Yen, Jun Huang, Yunxin Zhao:
Co-channel speech separation in the presence of correlated and uncorrelated noises. 2587-2590
Speech and Noise 2
- David Burshtein, Sharon Gannot:
Speech enhancement using a mixture-maximum model. 2591-2594 - Joaquin Gonzalez-Rodriguez, Santiago Cruz-Llanas, Javier Ortega-Garcia:
Concurrent speakers separation through binaural processing of stereo recordings. 2595-2598 - Harald Gustafsson, Sven Nordholm, Ingvar Claesson:
Spectral subtraction with adaptive averaging of the gain function. 2599-2602 - François Gaillard, Frédéric Berthommier, Gang Feng, Jean-Luc Schwartz:
A reliability criterion for time-frequency labeling based on periodicity in an auditory scene. 2603-2606 - Serguei Koval, Mikhail Stolbov, Mikhail Khitrov:
Broadband noise cancellation systems: new approach to working performance optimization. 2607-2610 - Klaus Linhard, Tim Haulick:
Noise subtraction with parametric recursive gain curves. 2611-2614 - Enrique Masgrau, Luis Aguilar, Eduardo Lleida:
Performance comparison of several adaptive schemes for microphone array beamforming. 2615-2618 - Mitsunori Mizumachi, Masato Akagi:
An objective distortion estimator for hearing aids and its application to noise reduction. 2619-2622 - Elias Nemer, Rafik A. Goubran, Samy Mahmoud:
Speech enhancement using fourth-order cumulants and time-domain optimal filters. 2623-2626 - Philippe Renevey, Andrzej Drygajlo:
Missing feature theory and probabilistic estimation of clean speech components for robust speech recognition. 2627-2630 - Josep M. Salavedra, Xavier Bou:
Distortion effects of several cumulant-based wiener filtering algorithms. 2631-2634 - Milan Svoboda, Pavel Sovka, Petr Pollák:
Combined noise suppression system for monaural cochlear implants. 2635-2638 - Sander J. van Wijngaarden, Herman J. M. Steeneken:
Objective prediction of speech intelligibility at high ambient noise levels using the speech transmission index. 2639-2642 - Eric A. Wan, Rudolph van der Merwe:
Noise-regularized adaptive filtering for speech enhancement. 2643-2646 - F. Zarubin, Alexander Kovtonyuk, K. Zadiraka:
Speech enhancement using karhunen-loève transformation and wiener filtering in critical bands. 2647-2650
Spoken Dialogue Systems
- Tom Brøndsted:
The CPK NLP suite for spoken language understanding. 2651-2654 - Grace Chung, Stephanie Seneff, I. Lee Hetherington:
Towards multi-domain speech understanding using a two-stage recognizer. 2655-2658 - Ivo Ipsic, France Mihelic, Simon Dobrisek, Jerneja Gros, Nikola Pavesic:
A slovenian spoken dialog system for air flight inquiries. 2659-2662 - Ganesh N. Ramaswamy, Jan Kleindienst, Daniel M. Coffman, Ponani S. Gopalakrishnan, Chalapathy Neti:
A pervasive conversational interface for information interaction. 2663-2666 - B. Vromans, Robert J. van Vark, Bernhard Rueber, Andreas Kellner:
Extending the SUSI system with negative knowledge. 2667-2670
Speech Perception
- Olivier Crouzet, Nicole Bacri:
Phonological constraints in speech segmentation processes: investigating levels of implementation. 2671-2674 - Robert I. Damper, Steve R. Gunn:
Learning phonetic distinctions from speech signals. 2675-2678 - Elliott Moreton, Shigeaki Amano:
Phonotactics in the perception of Japanese vowel length: evidence for long-distance dependencies. 2679-2682 - Sharon Peperkamp, Emmanuel Dupoux, Núria Sebastián-Gallés:
Perception of stress by French, Spanish, and bilingual subjects. 2683-2686 - Rosaria Silipo, Steven Greenberg, Takayuki Arai:
Temporal constraints on speech intelligibility as deduced from exceedingly sparse spectral representations. 2687-2690
Corpora
- Khalid Choukri, Valérie Mapelli, Jeffrey Allen:
New developments within the european language resources association (ELRA). 2691-2694 - Maxine Eskénazi, Alexander I. Rudnicky, Karin Gregory, Paul C. Constantinides, Robert Brennan, Christina L. Bennett, Jwan Allen:
Data collection and processing in the carnegie mellon communicator. 2695-2698 - Harald Höge, Christoph Draxler, Henk van den Heuvel, Finn Tore Johansen, Eric Sanders, Herbert S. Tropf:
Speechdat multilingual speech databases for teleservices: across the finish line. 2699-2702 - Andreas Mengel, Ulrich Heid:
Enhancing reusability of speech corpora by hyperlinked query output. 2703-2706 - Kim E. A. Silverman, Victoria Anderson, Jerome R. Bellegarda, Kevin A. Lenzo, Devang Naik:
Design and collection of a corpus of polyphones and prosodic contexts for speech synthesis research and development. 2707-2708
Speech Recognition - Training
- Sen-Chia Chang, Shih-Chieh Chien, Woei-Chyang Shieh:
Mandarin telephone speech recognition using MCE/GPD-based speaker cluster HMM. 2709-2712 - José A. R. Fonollosa, Eloi Batlle:
Combining length restrictions and n-best techniques in multiple-pass search strategies. 2713-2716 - Cecile Gelin-Huet, Kenneth Rose, Ajit V. Rao:
The deterministic annealing approach for discriminative continuous HMM design. 2717-2720 - Qiang Huo, Bin Ma:
On-line adaptive learning of CDHMM parameters based on multiple-stream prior evolution and posterior pooling. 2721-2724 - Thomas Kemp, Alex Waibel:
Unsupervised training of a speech recognizer: recent experiments. 2725-2728 - Cristina Chesta, Pietro Laface, Mario Nigra:
Piecewise HMM discriminative training. 2729-2732 - Fabrice Lefèvre, Claude Montacié, Marie-José Caraty:
A MLE algorithm for the k-NN HMM system. 2733-2736 - John W. McDonough, William Byrne:
Single-pass adapted training with all-pass transforms. 2737-2740 - Albino Nogueiras Rodríguez, José B. Mariño:
Minimum confusibility training of context dependent demiphones. 2741-2744 - Algimantas Rudzionis, Vytautas Rudzionis:
Phoneme recognition in fixed context using regularized discriminant analysis. 2745-2748 - Dat Tran, Michael Wagner:
Hidden Markov models using fuzzy estimation. 2749-2752 - Claudio Vair, Massimiliano Mercogliano, Luciano Fissore:
Incremental training of CDHMMs using bayesian learning. 2753-2756 - Daniel Willett, Stefan Müller, Gerhard Rigoll:
A discriminative training procedure based on language model and dictionary for LVCSR. 2757-2760 - Jian Wu, Qing Guo:
A novel discriminative method for HMM in automatic speech recognition. 2761-2764
Speech Analysis and Segmentation
- J. V. Avadhanulu, M. Mathew, Thippur V. Sreenivas:
EARLYZER: perceptually motivated robust TFR of speech. 2765-2768 - C. M. Aguilera, A. Navas, Rafael Urquiza de la Rosa, Alfonso Gago:
Frequency lowering using a discrete exponential transform. 2769-2772 - Joseph Di Martino, Yves Laprie:
An efficient F0 determination algorithm based on the implicit calculation of the autocorrelation of the temporal excitation signal. 2773-2776 - Andrew Wilson Howitt:
Vowel landmark detection. 2777-2780 - Hideki Kawahara, Haruhiro Katayose, Alain de Cheveigné, Roy D. Patterson:
Fixed point analysis of frequency to instantaneous frequency mapping for accurate estimation of F0 and periodicity. 2781-2784 - Bob Lawlor, Anthony D. Fagan:
A novel high quality efficient algorithm for time-scale modification of speech. 2785-2788 - Minkyu Lee, Jan P. H. van Santen, Bernd Möbius, Joseph P. Olive:
Formant tracking using segmental phonemic information. 2789-2792 - John McKenna, Stephen Isard:
Tailoring kalman filtering towards speaker characterisation. 2793-2796 - Ariel Salomon, Carol Y. Espy-Wilson:
Automatic detection of manner events based on temporal parameters. 2797-2800 - Mouhamadou Seck, Frédéric Bimbot, Didier Zugaj, Bernard Delyon:
Two-class signal segmentation for speech/music detection in audio tracks. 2801-2804 - Vu Ngoc Tuan, Christophe d'Alessandro:
Robust glottal closure detection using the wavelet transform. 2805-2808 - Jan P. H. van Santen, Richard Sproat:
High-accuracy automatic segmentation. 2809-2812 - Ilija Zeljkovic, Yannis Stylianou:
Single complex sinusoid and ARHE model based pitch extractors. 2813-2816
Speech and Noise 3
- Agustín Álvarez, Rafael Martínez, Pedro Gómez, Victor Nieto Lluis, M. M. Pérez:
A robust isolated word recognizer for highly non-stationary environments. recognition results. 2817-2820 - Mohamed Afify:
Sequential bias compensation for robust speech recognition. 2821-2824 - Tarcisio Coianiz, Daniele Falavigna, Roberto Gretter, Marco Orlandi:
Use of simulated data for robust telephone speech recognition. 2825-2828 - Y. Hauptman, Yuval Bistritz:
On the use of time alignments for noisy speech recognition. 2829-2832 - Juha Häkkinen, Janne Suontausta, Ramalingam Hariharan, Marcel Vasilache, Kari Laurila:
Improved feature vector normalization for noise robust connected speech recognition. 2833-2836 - Ljubomir Josifovski, Martin Cooke, Phil D. Green, Ascension Vizinho:
State based imputation of missing data for robust speech recognition and speech enhancement. 2837-2840 - Christopher Kermorvant, Andrew C. Morris:
A comparison of two strategies for ASR in additive noise: missing data and spectral subtraction. 2841-2844 - Ben Milner, Mark Farrell:
A comparison of techniques for tone compensation in payphone-based speech recognition. 2845-2848 - Xavier Menéndez-Pidal, Ruxin Chen, Duanpei Wu, Mick Tanaka:
Front-end improvements to reduce stationary & variable channel and noise distortions in continuous speech recognition tasks. 2849-2852 - George Nokas, Evangelos Dermatas:
Speech recognition in noisy reverberant rooms using a frequency domain blind deconvolution method. 2853-2856 - Volker Schless, Fritz Class, Peter Sandl:
Optimization of a speech recognizer for aircraft environments. 2857-2860 - Néstor Becerra Yoma, Lee Luan Ling, Sandra Dotto Stump:
Temporal constraints in viterbi alignment for speech recognition in noise. 2861-2864 - Kazumasa Yamamoto, Seiichi Nakagawa:
HMM composition of segmental unit input HMM for noisy speech recognition. 2865-2868 - Néstor Becerra Yoma, Lee Luan Ling, Sandra Dotto Stump:
Robust connected word speech recognition using weighted viterbi algorithm and context-dependent temporal constraints. 2869-2872 - Kaisheng Yao, Bertram E. Shi, Pascale Fung, Zhigang Cao:
Liftered forward masking procedure for robust digits recognition. 2873-2876 - Yunxin Zhao:
Channel identification and spectrum estimation for robust automatic speech recognition. 2877-2880