INTERSPEECH 2000: Beijing, China
- Sixth International Conference on Spoken Language Processing, ICSLP 2000 / INTERSPEECH 2000, Beijing, China, October 16-20, 2000. ISCA 2000
Volume 1
Speech Production Control (Special Session)
- Johan Liljencrants, Gunnar Fant, Anita Kruckenberg:
Subglottal pressure and prosody in Swedish. 1-4 - Kiyoshi Honda, Shinobu Masaki, Yasuhiro Shimada:
Observation of laryngeal control for voicing and pitch change by magnetic resonance imaging technique. 5-8 - Hiroya Fujisaki, Ryou Tomana, Shuichi Narusawa, Sumio Ohno, Changfu Wang:
Physiological mechanisms for fundamental frequency control in standard Chinese. 9-12 - René Carré:
On vocal tract asymmetry/symmetry. 13-16 - Olov Engwall:
Are static MRI measurements representative of dynamic speech? results from a comparative study using MRI, EPG and EMA. 17-20 - Shinan Lu, Lin He, Yufang Yang, Jianfen Cao:
Prosodic control in Chinese TTS system. 21-24 - Yuqing Gao, Raimo Bakis, Jing Huang, Bing Xiang:
Multistage coarticulation model combining articulatory, formant and cepstral features. 25-28 - Osamu Fujimura:
Rhythmic organization and signal characteristics of speech. 29-35 - Sven E. G. Öhman:
Oral culture in the 21st century: the case of speech processing. 36-41 - Jintao Jiang, Abeer Alwan, Lynne E. Bernstein, Patricia A. Keating, Edward T. Auer:
On the correlation between facial movements, tongue movements and speech acoustics. 42-45
Linguistics, Phonology, Phonetics, and Psycholinguistics 1, 2
- Sandra P. Whiteside, E. Rixon:
Coarticulation patterns in identical twins: an acoustic case study. 46-49 - Philip Hanna, Darryl Stewart, Ji Ming, Francis Jack Smith:
Improved lexicon formation through removal of co-articulation and acoustic recognition errors. 50-53 - Anders Lindström, Anna Kasaty:
A two-level approach to the handling of foreign items in Swedish speech technology applications. 54-57 - Yasuharu Den, Herbert H. Clark:
Word repetitions in Japanese spontaneous speech. 58-61 - Allard Jongman, Corinne B. Moore:
The role of language experience in speaker and rate normalization processes. 62-65 - Achim F. Müller, Jianhua Tao, Rüdiger Hoffmann:
Data-driven importance analysis of linguistic and phonetic information. 66-69 - Hiroya Fujisaki, Katsuhiko Shirai, Shuji Doshita, Seiichi Nakagawa, Keikichi Hirose, Shuichi Itahashi, Tatsuya Kawahara, Sumio Ohno, Hideaki Kikuchi, Kenji Abe, Shinya Kiriyama:
Overview of an intelligent system for information retrieval based on human-machine dialogue through spoken language. 70-73 - Li-chiung Yang:
The expression and recognition of emotions through prosody. 74-77 - Marc Swerts, Miki Taniguchi, Yasuhiro Katagiri:
Prosodic marking of information status in Tokyo Japanese. 78-81 - Britta Wrede, Gernot A. Fink, Gerhard Sagerer:
Influence of duration on static and dynamic properties of German vowels in spontaneous speech. 82-85 - Bo Zheng, Bei Wang, Yufang Yang, Shinan Lu, Jianfen Cao:
The regular accent in Chinese sentences. 86-89 - Odile Mella, Dominique Fohr, Laurent Martin, Andreas J. Carlen:
A tool for the synchronization of speech and mouth shapes: LIPS. 90-93 - Mohamed-Zakaria Kurdi:
Semantic tree unification grammar: a new formalism for spoken language processing. 94-97
Discourse and Dialogue 1, 2
- Akira Kurematsu, Yousuke Shionoya:
Identification of utterance intention in Japanese spontaneous spoken dialogue by use of prosody and keyword information. 98-101 - Sherif M. Abdou, Michael S. Scordilis:
Improved speech understanding using dialogue expectation in sentence parsing. 102-105 - Helen M. Meng, Carmen Wai, Roberto Pieraccini:
The use of belief networks for mixed-initiative dialog modeling. 106-109 - Michael F. McTear, Susan Allen, Laura Clatworthy, Noelle Ellison, Colin Lavelle, Helen McCaffery:
Integrating flexibility into a structured dialogue model: some design considerations. 110-113 - Yasuhisa Niimi, Tomoki Oku, Takuya Nishimoto, Masahiro Araki:
A task-independent dialogue controller based on the extended frame-driven method. 114-117 - Wei Xu, Alex Rudnicky:
Language modeling for dialog system. 118-121 - Kallirroi Georgila, Nikos Fakotakis, George Kokkinakis:
Building stochastic language model networks based on simultaneous word/phrase clustering. 122-125 - Li-chiung Yang, Richard Esposito:
Prosody and topic structuring in spoken dialogue. 126-129 - Stéphane H. Maes:
Elements of conversational computing - a paradigm shift. 130-133 - Ludek Müller, Filip Jurcícek, Lubos Smídl:
Rejection and key-phrase spotting techniques using a mumble model in a Czech telephone dialog system. 134-137 - Tim Paek, Eric Horvitz, Eric K. Ringger:
Continuous listening for unconstrained spoken dialog. 138-141 - Stefanie Shriver, Alan W. Black, Ronald Rosenfeld:
Audio signals in speech interfaces. 142-145 - Péter Pál Boda:
Visualisation of spoken dialogues. 146-149 - Mary Zajicek:
The construction of speech output to support elderly visually impaired users starting to use the internet. 150-153
Recognition and Understanding of Spoken Language 1, 2
- Kazuyuki Takagi, Rei Oguro, Kazuhiko Ozeki:
Effects of word string language models on noisy broadcast news speech recognition. 154-157 - Xiaoqiang Luo, Martin Franz:
Semantic tokenization of verbalized numbers in language modeling. 158-161 - Kazuomi Kato, Hiroaki Nanjo, Tatsuya Kawahara:
Automatic transcription of lecture speech using topic-independent language modeling. 162-165 - Rocio Guillén, Randal Erman:
Extending grammars based on similar-word recognition. 166-169 - Edward W. D. Whittaker, Philip C. Woodland:
Particle-based language modelling. 170-173 - Wing Nin Choi, Yiu Wing Wong, Tan Lee, P. C. Ching:
Lexical tree decoding with a class-based language model for Chinese speech recognition. 174-177 - Karthik Visweswariah, Harry Printz, Michael Picheny:
Impact of bucketing on performance of linearly interpolated language models. 178-181 - Shuwu Zhang, Hirofumi Yamamoto, Yoshinori Sagisaka:
An embedded knowledge integration for hybrid language modelling. 182-185 - Lucian Galescu, James F. Allen:
Hierarchical statistical language models: experiments on in-domain adaptation. 186-189 - Hirofumi Yamamoto, Kouichi Tanigaki, Yoshinori Sagisaka:
A language model for conversational speech recognition using information designed for speech translation. 190-193 - Bob Carpenter, Sol Lerner, Roberto Pieraccini:
Optimizing BNF grammars through source transformations. 194-197 - Jian Wu, Fang Zheng:
On enhancing Katz-smoothing based back-off language model. 198-201 - Wei Xu, Alex Rudnicky:
Can artificial neural networks learn language models? 202-205 - Guergana Savova, Michael Schonwetter, Sergey V. Pakhomov:
Improving language model perplexity and recognition accuracy for medical dictations via within-domain interpolation with literal and semi-literal corpora. 206-209 - Karl Weilhammer, Günther Ruske:
Placing structuring elements in a word sequence for generating new statistical language models. 210-213 - Yannick Estève, Frédéric Béchet, Renato de Mori:
Dynamic selection of language models in a dialogue system. 214-217 - Magne Hallstein Johnsen, Trym Holter, Torbjørn Svendsen, Erik Harborg:
Stochastic modeling of semantic content for use in a spoken dialogue system. 218-221 - Tomio Takara, Eiji Nagaki:
Spoken word recognition using the artificial evolution of a set of vocabulary. 222-225 - Eric Horvitz, Tim Paek:
Deeplistener: harnessing expected utility to guide clarification dialog in spoken language systems. 226-229 - Yunbin Deng, Bo Xu, Taiyi Huang:
Chinese spoken language understanding across domain. 230-233 - Sven C. Martin, Andreas Kellner, Thomas Portele:
Interpolation of stochastic grammar and word bigram models in natural language understanding. 234-237 - Satoru Kogure, Seiichi Nakagawa:
A portable development tool for spoken dialogue systems. 238-241 - Yi-Chung Lin, Huei-Ming Wang:
Error-tolerant language understanding for spoken dialogue systems. 242-245 - Akinori Ito, Chiori Hori, Masaharu Katoh, Masaki Kohda:
Language modeling by stochastic dependency grammar for Japanese speech recognition. 246-249 - Ruiqiang Zhang, Ezra Black, Andrew M. Finch, Yoshinori Sagisaka:
A tagger-aided language model with a stack decoder. 250-253 - Julia Hirschberg, Diane J. Litman, Marc Swerts:
Generalizing prosodic prediction of speech recognition errors. 254-257 - Jerome R. Bellegarda, Kim E. A. Silverman:
Toward unconstrained command and control: data-driven semantic inference. 258-261 - Ken Hanazawa, Shinsuke Sakai:
Continuous speech recognition with parse filtering. 262-265 - Martine Adda-Decker, Gilles Adda, Lori Lamel:
Investigating text normalization and pronunciation variants for German broadcast transcription. 266-269 - Mirjam Wester, Eric Fosler-Lussier:
A comparison of data-derived and knowledge-based modeling of pronunciation variation. 270-273 - Judith M. Kessens, Helmer Strik, Catia Cucchiarini:
A bottom-up method for obtaining information about pronunciation variation. 274-277 - Jiyong Zhang, Fang Zheng, Mingxing Xu, Ditang Fang:
Semi-continuous segmental probability modeling for continuous speech recognition. 278-281 - Christos Andrea Antoniou, T. Jeff Reynolds:
Acoustic modelling using modular/ensemble combinations of heterogeneous neural networks. 282-285 - Hsiao-Wuen Hon, Shankar Kumar, Kuansan Wang:
Unifying HMM and phone-pair segment models. 286-289 - Ming Li, Tiecheng Yu:
Multi-group mixture weight HMM. 290-292 - Tetsuro Kitazoe, Tomoyuki Ichiki, Makoto Funamori:
Application of pattern recognition neural network model to hearing system for continuous speech. 293-296 - Nathan Smith, Mahesan Niranjan:
Data-dependent kernels in SVM classification of speech patterns. 297-300 - Srinivasan Umesh, Richard C. Rose, Sarangarajan Parthasarathy:
Exploiting frequency-scaling invariance properties of the scale transform for automatic speech recognition. 301-304 - Masahiro Fujimoto, Jun Ogata, Yasuo Ariki:
Large vocabulary continuous speech recognition under real environments using adaptive sub-band spectral subtraction. 305-308 - Liang Gu, Kenneth Rose:
Perceptual harmonic cepstral coefficients as the front-end for speech recognition. 309-312 - Yik-Cheung Tam, Brian Kan-Wing Mak:
Optimization of sub-band weights using simulated noisy speech in multi-band speech recognition. 313-316 - Robert Faltlhauser, Thilo Pfau, Günther Ruske:
On the use of speaking rate as a generalized feature to improve decision trees. 317-320 - Jun Toyama, Masaru Shimbo:
Syllable recognition using glides based on a non-linear transformation. 321-324 - M. Kemal Sönmez, Madelaine Plauché, Elizabeth Shriberg, Horacio Franco:
Consonant discrimination in elicited and spontaneous speech: a case for signal-adaptive front ends in ASR. 325-328 - Khalid Daoudi, Dominique Fohr, Christophe Antoine:
A new approach for multi-band speech recognition based on probabilistic graphical models. 329-332 - Hervé Glotin, Frédéric Berthommier:
Test of several external posterior weighting functions for multiband full combination ASR. 333-336 - Kenji Okada, Takayuki Arai, Noburu Kanederu, Yasunori Momomura, Yuji Murahara:
Using the modulation wavelet transform for feature extraction in automatic speech recognition. 337-340 - Qifeng Zhu, Abeer Alwan:
AM-demodulation of speech spectra and its application to noise robust speech recognition. 341-344 - Astrid Hagen, Andrew C. Morris:
Comparison of HMM experts with MLP experts in the full combination multi-band approach to robust ASR. 345-348 - Astrid Hagen, Hervé Bourlard:
Using multiple time scales in the framework of multi-stream speech recognition. 349-352 - Hua Yu, Alex Waibel:
Streamlining the front end of a speech recognizer. 353-356 - Bhiksha Raj, Michael L. Seltzer, Richard M. Stern:
Reconstruction of damaged spectrographic features for robust speech recognition. 357-360 - Janienke Sturm, Hans Kamperman, Lou Boves, Els den Os:
Impact of speaking style and speaking task on acoustic models. 361-364 - Shubha Kadambe, Ron Burns:
Encoded speech recognition accuracy improvement in adverse environments by enhancing formant spectral bands. 365-368 - Jon Barker, Ljubomir Josifovski, Martin Cooke, Phil D. Green:
Soft decisions in missing data techniques for robust automatic speech recognition. 373-376 - Jian Liu, Tiecheng Yu:
New tone recognition methods for Chinese continuous speech. 377-380 - Bo Zhang, Gang Peng, William S.-Y. Wang:
Reliable bands guided similarity measure for noise-robust speech recognition. 381-384 - Tsuneo Nitta, Masashi Takigawa, Takashi Fukuda:
A novel feature extraction using multiple acoustic feature planes for HMM-based speech recognition. 385-388 - Fang Zheng, Guoliang Zhang:
Integrating the energy information into MFCC. 389-392 - Omar Farooq, Sekharjit Datta:
Speaker independent phoneme recognition by MLP using wavelet features. 393-396 - Laurent Couvreur, Christophe Couvreur, Christophe Ris:
A corpus-based approach for robust ASR in reverberant environments. 397-400 - Issam Bazzi, James R. Glass:
Modeling out-of-vocabulary words for robust speech recognition. 401-404 - Bojana Gajic, Richard C. Rose:
Hidden Markov model environmental compensation for automatic speech recognition on hand-held mobile devices. 405-408 - Andrew C. Morris, Ljubomir Josifovski, Hervé Bourlard, Martin Cooke, Phil D. Green:
A neural network for classification with incomplete data: application to robust ASR. 409-412 - Shigeki Matsuda, Mitsuru Nakai, Hiroshi Shimodaira, Shigeki Sagayama:
Feature-dependent allophone clustering. 413-416 - Qian Yang, Jean-Pierre Martens:
Data-driven lexical modeling of pronunciation variations for ASR. 417-420 - Dat Tran, Michael Wagner:
Fuzzy entropy hidden Markov models for speech recognition. 421-424 - Carl Quillen:
Adjacent node continuous-state HMM's. 425-428 - Janienke Sturm, Eric Sanders:
Modelling phonetic context using head-body-tail models for connected digit recognition. 429-432 - Issam Bazzi, Dina Katabi:
Using support vector machines for spoken digit recognition. 433-436 - Jiping Sun, Xing Jing, Li Deng:
Data-driven model construction for continuous speech recognition using overlapping articulatory features. 437-440 - Marcel Vasilache:
Speech recognition using HMMs with quantized parameters. 441-444 - Yingyong Qi, Jack Xin:
A perception and PDE based nonlinear transformation for processing spoken words. 445-448 - Reinhard Blasig, Georg Rose, Carsten Meyer:
Training of isolated word recognizers with continuous speech. 449-452
Production of Spoken Language
- Shu-Chuan Tseng:
Repair patterns in spontaneous Chinese dialogs: morphemes, words, and phrases. 453-456 - Jianwu Dang, Kiyoshi Honda:
Improvement of a physiological articulatory model for synthesis of vowel sequences. 457-460 - Kunitoshi Motoki, Xavier Pelorson, Pierre Badin, Hiroki Matsuzaki:
Computation of 3-d vocal tract acoustics based on mode-matching technique. 461-464 - Lucie Ménard, Louis-Jean Boë:
Exploring vowel production strategies from infant to adult by means of articulatory inversion of formant data. 465-468 - Gavin Smith, Tony Robinson:
Segmentation of a speech waveform according to glottal open and closed phases using an autoregressive-HMM. 469-472 - Rosemary Orr, Bert Cranen, Felix de Jong, Lou Boves:
Comparison of inverse filtering of the flow signal and microphone signal. 473-476 - Markus Iseli, Abeer Alwan:
Inter- and intra-speaker variability of glottal flow derivative using the LF model. 477-480
Linguistics, Phonology, Phonetics, and Psycholinguistics 3
- Philippe Blache, Daniel Hirst:
Multi-level annotation for spoken language corpora. 481-484 - Aijun Li, Fang Zheng, William Byrne, Pascale Fung, Terri Kamm, Yi Liu, Zhanjiang Song, Umar Ruhi, Veera Venkataramani, Xiaoxia Chen:
CASS: a phonetically transcribed corpus of Mandarin spontaneous speech. 485-488 - Kazuhide Yamamoto, Eiichiro Sumita:
Multiple decision-tree strategy for input-error robustness: a simulation of tree combinations. 489-492 - Zheng Chen, Kai-Fu Lee, Mingjing Li:
Discriminative training on language model. 493-496 - Jianfeng Gao, Mingjing Li, Kai-Fu Lee:
N-gram distribution based language model adaptation. 497-500 - Francisco Palou, Paolo Bravetti, Ossama Emam, Volker Fischer, Eric Janke:
Towards a common phone alphabet for multilingual speech recognition. 501-504 - Robert S. Belvin, Ron Burns, Cheryl Hein:
What's next: a case study in the multidimensionality of a dialog system. 504-507
Dialogue Systems and Speech Input
- Masanobu Higashida, Kumiko Ohmori:
A new dialogue control method based on human listening process to construct an interface for ascertaining a user's inputs. 508-511 - Xianfang Wang, Limin Du:
Spoken language understanding in a Chinese spoken dialogue system engine. 512-515 - Satya Dharanipragada, Martin Franz, J. Scott McCarley, Kishore Papineni, Salim Roukos, Todd Ward, Wei-Jing Zhu:
Statistical methods for topic segmentation. 516-519 - Berlin Chen, Hsin-Min Wang, Lin-Shan Lee:
Retrieval of Mandarin broadcast news using spoken queries. 520-523 - John H. L. Hansen, Jay P. Plucienkowski, Stephen Gallant, Bryan L. Pellom, Wayne H. Ward:
"CU-move": robust speech processing for in-vehicle speech systems. 524-527 - Ji-Hwan Kim, Philip C. Woodland:
A rule-based named entity recognition system for speech input. 528-531 - Mohammad Reza Sadigh, Hamid Sheikhzadeh, Mohammad Reza Jahangir, Arash Farzan:
A rule-based approach to Farsi language text-to-phoneme conversion. 532-535 - Allard Jongman, Yue Wang, Joan A. Sereno:
Acoustic and perceptual properties of English fricatives. 536-539 - Stefanie Shattuck-Hufnagel, Nanette Veilleux:
The special phonological characteristics of monosyllabic function words in English. 540-543 - Karmele López de Ipiña, Inés Torres, Lourdes Oñederra, Amparo Varona, Luis Javier Rodríguez:
Selection of sublexical units for continuous speech recognition of Basque. 544-547 - Madelaine Plauché, M. Kemal Sönmez:
Machine learning techniques for the identification of cues for stop place. 548-551 - Christina Widera:
Strategies of vowel reduction - a speaker-dependent phenomenon. 552-555 - Michelle A. Fox:
Syllable-final /s/ lenition in the LDC's CALLHOME Spanish corpus. 556-559 - Akira Kurematsu, Takeaki Nakazaki:
Meaning extraction based on frame representation for Japanese spoken dialogue. 560-563 - Johanneke Caspers:
Pitch accents, boundary tones and turn-taking in Dutch map task dialogues. 565-568 - Yoichi Yamashita, Michiyo Murai:
An annotation scheme of spoken dialogues with topic break indexes. 569-572 - Nanette Veilleux:
Application of the centering framework in spontaneous dialogues. 573-576 - Hiroki Mori, Hideki Kasuya:
Automatic lexicon generation and dialogue modeling for spontaneous speech. 577-580 - Maria Wolters, Hansjörg Mixdorff:
Evaluating radio news intonation - autosegmental versus superpositional modelling. 581-584 - Daniele Falavigna, Roberto Gretter, Marco Orlandi:
A mixed language model for a dialogue system over the telephone. 585-588 - Linda Bell, Joakim Gustafson:
Positive and negative user feedback in a spoken dialogue corpus. 589-592 - Anne Cutler, Mariëtte Koster:
Stress and lexical activation in Dutch. 593-596 - Safa Nasser Eldin, Hanna Abdel Nour, Rajouani Abdenbi:
Automatic modeling and implementation of intonation for the Arabic language in TTS systems. 597-600 - Venkata Ramana Rao Gadde:
Modeling word durations. 601-604 - Jennifer J. Venditti, Jan P. H. van Santen:
Japanese intonation synthesis using superposition and linear alignment models. 605-608 - Toshimitsu Minowa, Ryo Mochizuki, Hirofumi Nishimura:
Improving the naturalness of synthetic speech by utilizing the prosody of natural speech. 609-612 - Sin-Horng Chen, Chen-Chung Ho:
A hybrid statistical/RNN approach to prosody synthesis for Taiwanese TTS. 613-616 - Nobuaki Minematsu, Yukiko Fujisawa, Seiichi Nakagawa:
Performance comparison among HMM, DTW, and human abilities in terms of identifying stress patterns of word utterances. 617-620 - Juan Manuel Montero, Ricardo de Córdoba, José A. Vallejo, Juana M. Gutiérrez-Arriola, Emilia Enríquez, José Manuel Pardo:
Restricted-domain female-voice synthesis in Spanish: from database design to ANN prosodic modeling. 621-624 - Xavier Fernández Salgado, Eduardo Rodríguez Banga:
A hierarchical intonation model for synthesising F0 contours in Galician language. 625-628 - Ted H. Applebaum, Nick Kibre, Steve Pearson:
Features for F0 contour prediction. 629-632 - Zhenglai Gu, Hiroki Mori, Hideki Kasuya:
Prosodic variation of focused syllables of disyllabic word in Mandarin Chinese. 633-636 - Stephen M. Chu, Thomas S. Huang:
Automatic head gesture learning and synthesis from prosodic cues. 637-640 - Martti Vainio, Toomas Altosaar, Stefan Werner:
Measuring the importance of morphological information for Finnish speech synthesis. 641-644 - Oliver Jokisch, Hansjörg Mixdorff, Hans Kruschke, Ulrich Kordon:
Learning the parameters of quantitative prosody models. 645-648 - Shuichi Narusawa, Hiroya Fujisaki, Sumio Ohno:
A method for automatic extraction of parameters of the fundamental frequency contour. 649-652 - Tetsuro Kitazoe, Sung-Ill Kim, Yasunari Yoshitomi, Tatsuhiko Ikeda:
Recognition of emotional states using voice, face image and thermal image of face. 653-656 - Keiko Watanuki, Susumu Seki, Hideo Miyoshi:
Turn taking and multimodal information in two-people dialog. 657-660 - Hamid Reza Abutalebi, Mahmood Bijankhan:
Implementation of a text-to-speech system for Farsi language. 661-664 - Richard Huber, Anton Batliner, Jan Buckow, Elmar Nöth, Volker Warnke, Heinrich Niemann:
Recognition of emotion in a realistic dialogue scenario. 665-668 - Johanna Barry, Peter J. Blamey, Kathy Lee, Dilys Cheung:
Differentiation in tone production in Cantonese-speaking hearing-impaired children. 669-672 - Martine van Zundert, Jacques M. B. Terken:
Learning effects for phonetic properties of synthetic speech. 673-676 - Laura Mayfield Tomokiyo, Le Wang, Maxine Eskénazi:
An empirical study of the effectiveness of speech-recognition-based pronunciation training. 677-680 - Olivier Deroo, Christophe Ris, Sofie Gielen, Johan Vanparys:
Automatic detection of mispronounced phonemes for language learning tools. 681-684 - Horacio Meza Escalona, Ingrid Kirschning, Ofelia Cervantes Villagómez:
Estimation of duration models for phonemes in Mexican speech synthesis. 685-688 - Xiaoru Wu, Ren-Hua Wang, Guoping Hu:
Special text processing based external descriptor rule. 689-692 - Zhenli Yu, Shangcui Zeng:
Articulatory synthesis using a vocal-tract model of variable length. 693-696 - Philippe Boula de Mareüil:
Linguistic-prosodic processing for text-to-speech synthesis in Italian. 697-700 - Matthias Eichner, Matthias Wolff, Rüdiger Hoffmann:
A unified approach for speech synthesis and speech recognition using stochastic Markov graphs. 701-704 - Andrew P. Breen, James Salter:
Using F0 within a phonologically motivated method of unit selection. 705-708 - Christophe Blouin, Paul C. Bagshaw:
Analysis of the degradation of French vowels induced by the TD-PSOLA algorithm, in text-to-speech context. 709-712 - Artur Janicki:
Automatic construction of acoustic inventory for the concatenative speech synthesis for Polish. 713-716 - Diane Hirschfeld, Matthias Wolff:
Universal and multilingual unit selection for DRESS. 717-720 - Davis Pan, Brian Heng, Shiufun Cheung, Ed Chang:
Improving speech synthesis for high intelligibility under adverse conditions. 721-724 - Nobuyuki Nishizawa, Nobuaki Minematsu, Keikichi Hirose:
Development of a formant-based analysis-synthesis system and generation of high quality liquid sounds of Japanese. 725-728 - Oliver Jokisch, Matthias Eichner:
Synthesizing and evaluating an artificial language: Klingon. 729-732 - Craig Olinsky, Alan W. Black:
Non-standard word and homograph resolution for Asian language text analysis. 733-736 - Zhang Sen, Katsuhiko Shirai:
Re-estimation of LPC coefficients in the sense of l∞ criterion. 737-740 - Sung-Kyo Jung, Yong-Soo Choi, Young-Cheol Park, Dae Hee Youn:
An efficient codebook search algorithm for EVRC. 741-744 - Jong Kuk Kim, Jeong-Jin Kim, Myung-Jin Bae:
The reduction of the search time by the pre-determination of the grid bit in the g.723.1 MP-MLQ. 745-749 - Sebastian Möller, Hervé Bourlard:
Real-time telephone transmission simulation for speech recognizer and dialogue system evaluation and improvement. 750-753 - Rathinavelu Chengalvarayan, David L. Thomson:
HMM-based echo and announcement modeling approaches for noise suppression avoiding the problem of false triggers. 754-757 - Fangxin Chen:
Speaker information enhancement. 758-761 - Hans Dolfing:
Exhaustive search for lower-bound error-rates in vocal tract length normalization. 762-765 - Dusan Macho, Climent Nadeu:
Use of voicing information to improve the robustness of the spectral parameter set. 766-769 - Kaisheng Yao, Bertram E. Shi, Satoshi Nakamura, Zhigang Cao:
Residual noise compensation by a sequential EM algorithm for robust speech recognition in nonstationary noise. 770-773 - Hui Ye, Pascale Fung, Taiyi Huang:
Principal mixture speaker adaptation for improved continuous speech recognition. 774-777 - Toomas Altosaar, Martti Vainio:
Reduced impedance mismatch in speech database access. 778-781 - Jiapeng Tian, Jouji Miwa:
Internet training system for listening and pronunciation of Chinese stop consonants. 782-785 - Carlos Toshinori Ishi, Keikichi Hirose, Nobuaki Minematsu:
Identification of Japanese double-mora phonemes considering speaking rate for the use in CALL systems. 786-790
Volume 2
Speech Perception, Comprehension, and Production (Special Session)
- Roy D. Patterson, Stefan Uppenkamp, Dennis Norris, William D. Marslen-Wilson, Ingrid S. Johnsrude, Emma Williams:
Phonological processing in the auditory system: a new class of stimuli and advances in fMRI techniques. 1-4 - Itaru F. Tatsumi, Michio Senda, Kenji Ishii, Masahiro Mishina, Masashi Oyama, Hinako Toyama, Keiichi Oda, Masayuki Tanaka, Yasuyuki Gondo:
Brain regions responsible for word retrieval, speech production and deficient word fluency in elderly people: a PET activation study. 5-10 - Paavo Alku, Hannu Tiitinen, Kalle J. Palomäki, Päivi Sivonen:
MEG-measurements of brain activity reveal the link between human speech production and perception. 11-14 - Karalyn Patterson, Matthew A. Lambon Ralph, Helen Bird, John R. Hodges, James L. McClelland:
Normal and impaired processing in quasi-regular domains of language: the case of English past-tense verbs. 15-19 - Nadine Martin, Eleanor M. Saffran, Gary S. Dell, Myrna F. Schwartz, Prahlad Gupta:
Neuropsychological and computational evidence for a model of lexical processing, verbal short-term memory and learning. 20-25 - Takao Fushimi, Mutsuo Ijuin, Naoko Sakuma, Masayuki Tanaka, Tadahisa Kondo, Shigeaki Amano, Karalyn Patterson, Itaru F. Tatsumi:
Normal and impaired reading of Japanese kanji and kana. 26-31 - Mutsuo Ijuin, Takao Fushimi, Karalyn Patterson, Naoko Sakuma, Masayuki Tanaka, Itaru F. Tatsumi, Tadahisa Kondo, Shigeaki Amano:
A connectionist approach to naming disorders of Japanese in dyslexic patients. 32-37 - Taeko Nakayama Wydell, Takako Shinkai:
Impaired pronunciations of kanji words by Japanese CVA patients. 38-41 - Akira Uno, M. Kaneko, N. Haruhara, M. Kaga:
Disability of phonological versus visual information processes in Japanese dyslexic children. 42-44 - Xiaolin Zhou, Yanxuan Qu:
Lexical tone in the spoken word recognition of Chinese. 45-50
Prosody 1, 2
- Xiaolin Zhou, Jie Zhuang:
Lexical tone in the speech production of Chinese words. 51-54 - Yu Hu, Qingfeng Liu, Ren-Hua Wang:
Prosody generation in Chinese synthesis using the template of quantified prosodic unit and base intonation contour. 55-58 - Yiqiang Chen, Wen Gao, Tingshao Zhu, Jiyong Ma:
Multi-strategy data mining on Mandarin prosodic patterns. 59-62 - Werner Verhelst, Dirk Van Compernolle, Patrick Wambacq:
A unified view on synchronized overlap-add methods for prosodic modifications of speech. 63-66 - Chilin Shih, Greg Kochanski:
Chinese tone modeling with stem-ML. 67-70 - Colin W. Wightman, Ann K. Syrdal, Georg Stemmer, Alistair Conkie, Mark C. Beutnagel:
Perceptually based automatic prosody labeling and prosodically enriched unit selection improve concatenative text-to-speech synthesis. 71-74 - Achim F. Müller, Jianhua Tao, Rüdiger Hoffmann:
Data-driven importance analysis of linguistic and phonetic information. 75-78 - Zhiqiang Li, Degif Petros Banksira:
Tonal structure of yes-no question intonation in Chaha. 79-82 - Chao Wang, Stephanie Seneff:
Improved tone recognition by normalizing for coarticulation and intonation effects. 83-86 - Jinsong Zhang, Satoshi Nakamura, Keikichi Hirose:
Discriminating Chinese lexical tones by anchoring F0 features. 87-90 - Carlos Gussenhoven, Aoju Chen:
Universal and language-specific effects in the perception of question intonation. 91-94 - Chiu-yu Tseng, Da-De Chen:
The interplay and interaction between prosody and syntax: evidence from Mandarin Chinese. 95-97 - Hansjörg Mixdorff, Hiroya Fujisaki:
A quantitative description of German prosody offering symbolic labels as a by-product. 98-101
Speech Interface and Dialogue Systems
- Roni Rosenfeld, Xiaojin Zhu, Arthur R. Toth, Stefanie Shriver, Kevin A. Lenzo, Alan W. Black:
Towards a universal speech interface. 102-105 - Dale Russell:
A domain model centered approach to spoken language dialog systems. 106-109 - Georges Fafiotte, Jianshe Zhai:
From multilingual multimodal spoken language acquisition towards on-line assistance to intermittent human interpreting: SIM*, a versatile environment for SLP. 110-113 - Matthias Denecke:
Informational characterization of dialogue states. 114-117 - Kenji Abe, Kazushige Kurokawa, Kazunari Taketa, Sumio Ohno, Hiroya Fujisaki:
A new method for dialogue management in an intelligent system for information retrieval. 118-121 - Esther Levin, Shrikanth S. Narayanan, Roberto Pieraccini, Konstantin Biatov, Enrico Bocchieri, Giuseppe Di Fabbrizio, Wieland Eckert, Sungbok Lee, A. Pokrovsky, Mazin G. Rahim, P. Ruscitti, Marilyn A. Walker:
The AT&T-DARPA communicator mixed-initiative spoken dialog system. 122-125
Multimodal, Translingual, and Dialogue Systems
- Srinivas Bangalore, Michael Johnston:
Integrating multimodal language processing with speech recognition. 126-129 - Alexander I. Rudnicky, Christina L. Bennett, Alan W. Black, Ananlada Chotimongkol, Kevin A. Lenzo, Alice Oh, Rita Singh:
Task and domain specific modelling in the Carnegie Mellon communicator system. 130-134 - Joakim Gustafson, Linda Bell, Jonas Beskow, Johan Boye, Rolf Carlson, Jens Edlund, Björn Granström, David House, Mats Wirén:
Adapt - a multimodal conversational dialogue system in an apartment domain. 134-137 - Kuansan Wang:
Implementation of a multimodal dialog system using extended markup languages. 138-141 - Stephanie Seneff, Chian Chuu, D. Scott Cyphers:
ORION: from on-line interaction to off-line delegation. 142-145 - Lei Duan, Alexander Franz, Keiko Horiguchi:
Practical spoken language translation using compiled feature structure grammars. 146-149 - Helen M. Meng, Shuk Fong Chan, Yee Fong Wong, Tien Ying Fung, Wai Ching Tsui, Tin Hang Lo, Cheong Chat Chan, Ke Chen, Lan Wang, Ting-Yao Wu, Xiaolong Li, Tan Lee, Wing Nin Choi, Yiu Wing Wong, P. C. Ching, Huisheng Chi:
ISIS: A multilingual spoken dialog system developed with CORBA and KQML agents. 150-153 - Jun-ichi Hirasawa, Noboru Miyazaki, Mikio Nakano, Kiyoaki Aikawa:
New feature parameters for detecting misunderstandings in a spoken dialogue system. 154-157
Production of Spoken Language (Poster)
- Parham Mokhtari, Frantz Clermont, Kazuyo Tanaka:
Toward an acoustic-articulatory model of inter-speaker variability. 158-161 - Pascal Perrier, Joseph S. Perkell, Yohan Payan, Majid Zandipour, Frank H. Guenther, Ali Khalighi:
Degrees of freedom of tongue movements in speech may be constrained by biomechanics. 162-165 - Béatrice Vaxelaire, Rudolph Sock, Pascal Perrier:
Gestural overlap, place of articulation and speech rate - an x-ray investigation. 166-169 - Masaaki Honda, Akinori Fujino:
Articulatory compensation and adaptation for unexpected palate shape perturbation. 170-173 - Takuya Niikawa, Masafumi Matsumura, Takashi Tachimura, Takeshi Wada:
Modeling of a speech production system based on MRI measurement of three-dimensional vocal tract shapes during fricative consonant phonation. 174-177 - Slim Ouni, Yves Laprie:
Improving acoustic-to-articulatory inversion by using hypercube codebooks. 178-181 - Wael Hamza, Mohsen A. Rashwan:
Concatenative arabic speech synthesis using large speech database. 182-185 - Dong Chen, Jingming Kuang, Yan Zhang:
A new speech classifier based on Yinyang compensatory soft computing theory. 186-189 - Sebastian Möller, Ute Jekosch, Alexander Raake:
New models predicting conversational effects of telephone transmission on speech communication quality. 190-193 - Jinyu Li, Xin Luo, Ren-Hua Wang:
A novel search algorithm for LSF VQ. 194-197 - Stéphane H. Maes, Dan Chazan, Gilad Cohen, Ron Hoory:
Conversational networking: conversational protocols for transport, coding, and control. 198-201 - Hiroshi Ohmura, Akira Sasou, Kazuyo Tanaka:
A low bit rate speech coding method using a formant-articulatory parameter nomogram. 202-205 - Ning Li, Derek J. Molyneux, Meau Shin Ho, Barry M. G. Cheetham:
Variable bit-rate sinusoidal transform coding using variable order spectral estimation. 206-209 - Yong-Soo Choi, Seung-Kyun Ryu, Young-Cheol Park, Dae Hee Youn:
Efficient harmonic-CELP based hybrid coding of speech at low bit rates. 210-213 - Jesper Jensen, John H. L. Hansen:
Speech enhancement based on a constrained sinusoidal model. 214-217 - Sang-Wook Park, Seung-Kyun Ryu, Young-Cheol Park, Dae Hee Youn:
A bark coherence function for perceived speech quality estimation. 218-221 - Jinyu Kiang, Kun Deng, Ronghuai Huang:
A high-efficiency scheme for secure speech transmission using spatiotemporal chaos synchronization. 222-225
Speaker, Dialect, and Language Recognition (Poster)
- Leandro Rodríguez Liñares, Carmen García-Mateo:
Application of speaker authentication technology to a telephone dialogue system. 226-229 - Michel Dutat, Ivan Magrin-Chagnolleau, Frédéric Bimbot:
Language recognition using time-frequency principal component analysis and acoustic modeling. 230-233 - Chularat Tanprasert, Varin Achariyakulporn:
Comparative study of GMM, DTW, and ANN on Thai speaker identification system. 234-237 - Ludwig Schwardt, Johan A. du Preez:
Efficient mixed-order hidden Markov model inference. 238-241 - Olivier Thyes, Roland Kuhn, Patrick Nguyen, Jean-Claude Junqua:
Speaker identification and verification using eigenvoices. 242-245 - Arun C. Surendran, Chin-Hui Lee:
A priori threshold selection for fixed vocabulary speaker verification systems. 246-249 - Qin Jin, Alex Waibel:
Application of LDA to speaker recognition. 250-253 - Ludwig Schwardt, Johan A. du Preez:
Automatic language identification using mixed-order HMMs and untranscribed corpora. 254-257 - Johan Lindberg, Mats Blomberg:
On the potential threat of using large speech corpora for impostor selection in speaker verification. 258-261 - Javier Ortega-Garcia, Joaquin Gonzalez-Rodriguez, Daniel Tapias Merino:
Phonetic consistency in Spanish for PIN-based speaker verification system. 262-265 - Zhimin Liu, Xihong Wu, Bin Zhen, Huisheng Chi:
An auditory feature extraction method based on forward-masking and its application in robust speaker identification and speech recognition. 266-269 - S. Douglas Peters, Matthieu Hébert, Daniel Boies:
Transition-oriented hidden Markov models for speaker verification. 270-273 - Pang Kuen Tsoi, Pascale Fung:
An LLR-based technique for frame selection for GMM-based text-independent speaker identification. 274-277 - Jiyong Ma, Wen Gao:
Robust speaker recognition based on high order cumulant. 278-281 - Luo Si, Qixiu Hu:
Two-stage speaker identification system based on VQ and NBDGMM. 282-285 - Johnny Mariéthoz, Johan Lindberg, Frédéric Bimbot:
A MAP approach, with synchronous decoding and unit-based normalization for text-dependent speaker verification. 286-289 - Zhibin Pan, Koji Kotani, Tadahiro Ohmi:
A fast search method of speaker identification for large population using pre-selection and hierarchical matching. 290-293 - Lan Wang, Ke Chen, Huisheng Chi:
Optimal fusion of diverse feature sets for speaker identification: an alternative method. 294-297 - Upendra V. Chaudhari, Jirí Navrátil, Stéphane H. Maes, Ramesh A. Gopinath:
Transformation enhanced multi-grained modeling for text-independent speaker recognition. 298-301 - Takashi Masuko, Keiichi Tokuda, Takao Kobayashi:
Imposture using synthetic speech against speaker verification based on spectrum and pitch. 302-305 - Shahla Parveen, Abdul Qadeer, Phil D. Green:
Speaker recognition with recurrent neural networks. 306-309 - Yoshiroh Itoh, Jun Toyama, Masaru Shimbo:
Speaker feature extraction from pitch information based on spectral subtraction for speaker identification. 310-313 - Wei-Ho Tsai, Chiwei Che, Wen-Whei Chang:
Text-independent speaker identification using Gaussian mixture bigram models. 314-317 - Hassan Ezzaidi, Jean Rouat:
Comparison of MFCC and pitch synchronous AM, FM parameters for speaker identification. 318-321 - Marcos Faúndez-Zanuy, Adam Slupinski:
Speaker verification in mismatch training and testing conditions. 322-325 - Toshiaki Uchibe, Shingo Kuroiwa, Norio Higuchi:
Determination of threshold for speaker verification using speaker adaptation gain in likelihood during training. 326-329 - Mingkuan Liu, Bo Xu:
Accent-specific Mandarin adaptation based on pronunciation modeling technology. 330-333
Prosody and Paralinguistics (Special Session)
- Hyun-Bok Lee:
In search of paralinguistic features. 334-340 - Gunnar Fant, Anita Kruckenberg:
A prominence based model of Swedish intonation. 341-344 - Hideki Kasuya, Masanori Yoshizawa, Kikuo Maekawa:
Roles of voice source dynamics as a conveyer of paralinguistic features. 345-348 - Kikuo Maekawa, Takayuki Kagomiya:
Influence of paralinguistic information on segmental articulation. 349-352 - Sumio Ohno, Yoshimitsu Sugiyama, Hiroya Fujisaki:
Analysis and modeling of the effect of paralinguistic information upon the local speech rate. 353-356 - Jianfen Cao:
Rhythm of spoken Chinese - linguistic and paralinguistic evidences -. 357-360 - Sanae Eda:
Identification and discrimination of syntactically and pragmatically contrasting intonation patterns by native and non-native speakers of standard Japanese. 361-364 - Donna Erickson, Arthur Abramson, Kikuo Maekawa, Tokihiko Kaburagi:
Articulatory characteristics of emotional utterances in spoken English. 365-368 - Keikichi Hirose, Nobuaki Minematsu, Hiromichi Kawanami:
Analytical and perceptual study on the role of acoustic features in realizing emotional speech. 369-372 - Sylvie J. L. Mozziconacci, Dik J. Hermes:
Expression of emotion and attitude through temporal speech variations. 373-378 - Klaus R. Scherer:
A cross-cultural investigation of emotion inferences from voice and speech: implications for speech technology. 379-382 - Bong-Seok Kang, Chul-Hee Han, Sang-Tae Lee, Dae Hee Youn, Chungyong Lee:
Speaker dependent emotion recognition using speech signals. 383-386
Generation and Synthesis of Spoken Language 1, 2
- Edmilson Morais, Paul Taylor, Fábio Violaro:
Concatenative text-to-speech synthesis based on prototype waveform interpolation (a time frequency approach). 387-390 - Ren-Hua Wang, Zhongke Ma, Wei Li, Donglai Zhu:
A corpus-based Chinese speech synthesis with contextual dependent unit selection. 391-394 - Geert Coorman, Justin Fackrell, Peter Rutten, Bert Van Coile:
Segment selection in the L&H RealSpeak laboratory TTS system. 395-398 - Ren-Yuan Lyu, Zhen-hong Fu, Yuang-Chin Chiang, Hui-mei Liu:
A Taiwanese (Min-nan) text-to-speech (TTS) system based on automatically generated synthetic units. 399-402 - Masayuki Yamada, Yasuo Okutani, Toshiaki Fukada, Takashi Aso, Yasuhiro Komori:
Puretalk: a high quality Japanese text-to-speech system. 403-406 - Ka Man Law, Tan Lee:
Using cross-syllable units for Cantonese speech synthesis. 407-410 - Alan W. Black, Kevin A. Lenzo:
Limited domain synthesis. 411-414 - Christine H. Nakatani, Jennifer Chu-Carroll:
Coupling dialogue and prosody computation in spoken dialogue generation. 415-418 - Tomio Takara, Kazuto Izumi, Keiichi Funaki:
A study on the pitch pattern of a singing voice synthesis system based on the cepstral method. 419-422 - Steve Pearson, Roland Kuhn, Steven Fincke, Nick Kibre:
Automatic methods for lexical stress assignment and syllabification. 423-426 - Olga Goubanova, Paul Taylor:
Using Bayesian belief networks for model duration in text-to-speech systems. 427-430 - Diane Hirschfeld:
Comparing static and dynamic features for segmental cost function calculation in concatenative speech synthesis. 435-438 - Pratibha Jain, Hynek Hermansky:
Temporal patterns of critical-band spectrum for text-to-speech. 439-441
Speaker, Dialect, and Language Recognition 1, 2
- Eric H. C. Choi, Jianming Song:
Successive cohort selection (SCS) for text-independent speaker verification. 442-445 - Dat Tran, Michael Wagner:
Fuzzy normalisation methods for speaker verification. 446-449 - Yong Gu, Hans Jongebloed, Dorota J. Iskra, Els den Os, Lou Boves:
Speaker verification in operational environments - monitoring for improved service operation. 450-453 - Larry P. Heck, Nikki Mirghafori:
On-line unsupervised adaptation in speaker verification. 454-457 - P. Sivakumaran, Aladdin M. Ariyaeeinia, Jill A. Hewitt:
Multiple sub-band systems for speaker verification. 458-461 - Xiaoxing Liu, Baosheng Yuan, Yonghong Yan:
An orthogonal GMM based speaker verification system. 462-465 - Qin Jin, Alex Waibel:
A naïve de-lambing method for speaker identification. 466-469 - Douglas A. Reynolds, Robert B. Dunn, Jack McLaughlin:
The lincoln speaker recognition system: NIST eval2000. 470-473 - Aaron E. Rosenberg, Sarangarajan Parthasarathy, Julia Hirschberg, Stephen Whittaker:
Foldering voicemail messages by caller using text independent speaker recognition. 474-478 - Claude Montacié, Marie-José Caraty:
Structural framework for combining speaker recognition methods. 479-482 - Walter D. Andrews, Joseph P. Campbell, Douglas A. Reynolds:
Bootstrapping for speaker recognition. 483-486 - Bin Zhen, Xihong Wu, Zhimin Liu, Huisheng Chi:
On the importance of components of the MFCC in speech and speaker recognition. 487-490 - Thomas F. Quatieri, Robert B. Dunn, Douglas A. Reynolds:
On the influence of rate, pitch, and spectrum on automatic speaker recognition performance. 491-494 - Remco Teunen, Ben Shahshahani, Larry P. Heck:
A model-based transformational approach to robust speaker recognition. 495-498
Linguistics, Phonology, Phonetics, and Psycholinguistics (Poster)
- Amanda Miller-Ockhuizen, Bonny E. Sands:
Contrastive lateral clicks and variation in click types. 499-502 - Tomoko Matsui, Masaki Naito, Yoshinori Sagisaka, Kozo Okuda, Satoshi Nakamura:
Analysis of acoustic models trained on a large-scale Japanese speech database. 503-506 - Mahmood Bijankhan:
Farsi vowel compensatory lengthening: an experimental approach. 507-510 - Yue Wang, Joan A. Sereno, Allard Jongman, Joy Hirsch:
Cortical reorganization associated with the acquisition of Mandarin tones by American learners: an fMRI study. 511-514 - Sandra P. Whiteside, Rosemary A. Varley, T. Phillips, H. Garety:
The production of real and non-words in adult stutterers and non-stutterers: an acoustic study. 515-518 - Masaaki Shimizu, Masatake Dantsuji:
A new proposal of laryngeal features for the tonal system of Vietnamese. 519-522 - Hong Zhang, Bo Xu, Taiyi Huang:
How to choose training set for language modeling. 523-526 - Piero Cosi, John-Paul Hosom:
High performance "general purpose" phonetic recognition for Italian. 527-530 - Miren Karmele López de Ipiña, Inés Torres, Lourdes Oñederra, Amparo Varona, Nerea Ezeiza, Mikel Peñagarikano, M. Hernández, Luis Javier Rodríguez:
First approach to the selection of lexical units for continuous speech recognition of Basque. 531-534 - David W. Gow Jr.:
Assimilation, ambiguity, and the feature parsing problem. 535-538 - Sachin S. Kajarekar, Hynek Hermansky:
Optimization of units for continuous-digit recognition task. 539-542 - Ioana Vasilescu, François Pellegrino, Jean-Marie Hombert:
Perceptual features for the identification of Romance languages. 543-546 - Dawn M. Behne, Peter E. Czigler, Kirk P. H. Sullivan:
Perception of Swedish vowel quantity: tracing late stages of development. 547-550 - Ananlada Chotimongkol, Alan W. Black:
Statistically trained orthographic to sound models for Thai. 551-554 - Janice Fon, Keith Johnson:
Speech timing patterning as an indicator of discourse and syntactic boundaries. 555-558 - Amalia Arvaniti, Georgios Tserdanelis:
On the phonetics of geminates: evidence from Cypriot Greek. 559-562 - Hanny den Ouden, Carel van Wijk, Marc Swerts:
A simple procedure to clarify the relation between text and prosody. 563-566 - Kimiko Tsukada:
Effects of consonantal voicing on English diphthongs: a comparison of L1 and L2 production. 567-570 - Nigel Ward:
The challenge of non-lexical speech sounds. 571-574 - Yousif A. El-Imam:
A method to synthesize Arabic from short phonetic. 575-578 - Mauricio C. Schramm, Luis Felipe R. Freitas, Adriano Zanuz, Dante Barone:
A Brazilian Portuguese language corpus development. 579-582 - Cécile Colin, Monique Radeau, Didier Demolin, Alain Soquet:
Visual lipreading of voicing for French stop consonants. 583-586 - Yang Chen, Michael Robb:
Acoustic features of vowel production in Mandarin speakers of English. 587-590 - Robert S. Belvin, Ron Burns, Cheryl Hein:
Spoken language navigation systems for drivers. 591-594 - Fang Chen, Baozong Yuan:
An approach to intelligent Chinese dialogue system. 595-598 - Huei-Ming Wang, Yi-Chung Lin:
Goal-oriented table-driven design for dialogue manager. 599-602 - Alexandros Potamianos, Egbert Ammicht, Hong-Kwang Jeff Kuo:
Dialogue management in the Bell Labs communicator system. 603-606 - Jiang Han, Yong Wang:
Dialogue management based on a hierarchical task structure. 607-610 - Johanneke Caspers:
Melodic characteristics of backchannels in Dutch map task dialogues. 611-614 - Marc Swerts, Diane J. Litman, Julia Hirschberg:
Corrections in spoken dialogue systems. 615-618 - John Fry:
F0 correlates of topic and subject in spontaneous Japanese speech. 619-622 - Mutsuko Tomokiyo, Solange Hollard:
Specification of communicative acts of utterances based on dialogue corpus analysis. 623-627 - Hiroaki Noguchi, Yasuhiro Katagiri, Yasuharu Den:
An experimental verification of the prosodic/lexical effects on the occurrence of backchannels. 628-631 - Tsutomu Sato, John A. Maidment:
The acoustic characteristics of Japanese identical vowel sequences in connected speech. 632-635
Spoken and Multi-Modal Dialogue Systems
- Shrikanth S. Narayanan, Giuseppe Di Fabbrizio, Candace A. Kamm, James Hubbell, Bruce Buntschuh, P. Ruscitti, Jerry H. Wright:
Effects of dialog initiative and multi-modal presentation strategies on large directory information access. 636-639 - William Thompson, Harry Bliss:
A declarative framework for building compositional dialog modules. 640-643 - Kuansan Wang:
A plan-based dialog system with probabilistic inferences. 644-647 - Kazunori Komatani, Tatsuya Kawahara:
Generating effective confirmation and guidance using two-level confidence measures for dialogue systems. 648 - Nikko Ström, Stephanie Seneff:
Intelligent barge-in in conversational systems. 652-655 - Andrew P. Breen, Barry Eggleton, Gavin E. Churcher, Paul Deans, Simon Downey:
A system for the research into multi-modal man-machine communication within a virtual environment. 656-659 - Fabio Brugnara, Mauro Cettolo, Marcello Federico, Diego Giuliani:
Advances in automatic transcription of Italian broadcast news. 660-663 - Shui-Lung Chuang, Hsiao-Tieh Pu, Wen-Hsiang Lu, Lee-Feng Chien:
Live thesaurus construction for interactive voice-based web search. 664-667 - Yoshimi Suzuki, Fumiyo Fukumoto, Yoshihiro Sekiguchi:
Selecting TV news stories and newswire articles related to a target article of newswire using SVM. 668-671 - Kenney Ng:
Towards an integrated approach for spoken document retrieval. 672-675 - Beth Logan, Pedro J. Moreno, Jean-Manuel Van Thong, Edward W. D. Whittaker:
An experimental study of an audio indexing system for the web. 676-679 - Rong Jin, Alexander G. Hauptmann:
Title generation for spoken broadcast news using a training corpus. 680-683 - Manfred Weber, Thomas Kemp:
Evaluating different information retrieval algorithms on real-world data. 684-687 - Konstantinos Koumpis, Steve Renals:
Transcription and summarization of voicemail speech. 688-691 - W. C. Tsai, Y. C. Chu:
Robust rejection for embedded systems. 692-695 - Sharon L. Oviatt:
Multimodal signal processing in naturalistic noisy environments. 696-699 - Joyce Yue Chai, Sylvie Levesque, Malgorzata Budzikowska, Veronika Horvath, Nanda Kambhatla, Nicolas Nicolov, Wlodek Zadrozny:
A multi-modal dialog system for business transactions. 700-703 - Jiang Han, Yonghong Yan, Zhiwei Lin, Yong Wang, Jian Liu, Danjun Liu, Zhihui Wang:
Office message center - a spoken dialogue system. 704-706 - Noboru Miyazaki, Jun-ichi Hirasawa, Mikio Nakano, Kiyoaki Aikawa:
A new method for understanding sequences of utterances by multiple speakers. 707-710 - Hideaki Kikuchi, Katsuhiko Shirai:
Improvement of dialogue efficiency by dialogue control model according to performance of processes. 711-714 - Chao Wang, D. Scott Cyphers, Xiaolong Mou, Joseph Polifroni, Stephanie Seneff, Jon Yi, Victor Zue:
MUXING: a telephone-access Mandarin conversational system. 715-718 - Markku Turunen, Jaakko Hakulinen:
Jaspis - a framework for multilingual adaptive speech applications. 719-722 - Bryan L. Pellom, Wayne H. Ward, Sameer S. Pradhan:
The CU communicator: an architecture for dialogue systems. 723-726 - Vildan Bilici, Emiel Krahmer, Saskia te Riele, Raymond N. J. Veldhuis:
Preferred modalities in dialogue systems. 727-730 - Frédéric Béchet, Elisabeth den Os, Lou Boves, Jürgen Sienel:
Introduction to the IST-HLT project speech-driven multimodal automatic directory assistance (SMADA). 731-734 - Crusoe Mao, Tony Tuo, Danjun Liu:
Using HPSG to represent multi-modal grammar in multi-modal dialogue. 735-738 - Kohji Dohsaka, Norihito Yasuda, Noboru Miyazaki, Mikio Nakano, Kiyoaki Aikawa:
An efficient dialogue control method under system's limited knowledge. 739-742 - Ying Cheng, Anurag Gupta, Raymond H. Lee:
A distributed spoken user interface based on open agent architecture (OAA). 743-746
Speech, Facial Expression, and Gesture
- Stephen M. Chu, Thomas S. Huang:
Bimodal speech recognition using coupled hidden Markov models. 747-750 - Jiyong Ma, Wen Gao:
A parallel multi-stream model for sign language recognition. 751-754 - Lionel Revéret, Gérard Bailly, Pierre Badin:
MOTHER: a new generation of talking heads providing a flexible articulatory control for video-realistic speech animation. 755-758 - Steve Minnis, Andrew P. Breen:
Modeling visual coarticulation in synthetic talking heads using a lip motion unit inventory with concatenative synthesis. 759-762
Generation and Synthesis of Spoken Language 3
- Hua Wu, Taiyi Huang, Bo Xu:
A generation system for Chinese texts. 763-766 - Stephanie Seneff, Joseph Polifroni:
Formal and natural language generation in the Mercury conversational system. 767-770 - Takashi Saito, Masaharu Sakamoto:
A method of creating a new speaker's voicefont in a text-to-speech system. 771-774 - Jun Huang, Stephen E. Levinson, Mark Hasegawa-Johnson:
Signal approximation in Hilbert space and its application on articulatory speech synthesis. 775-778 - Nobuaki Minematsu, Seiichi Nakagawa:
Quality improvement of PSOLA analysis-synthesis using partial zero-phase conversion. 779-782 - Hanna Lindgren, Jessica Granberg:
A machine learning approach to Swedish word pronunciation. 783-786 - Takahiro Ohtsuka, Hideki Kasuya:
An improved speech analysis-synthesis algorithm based on the autoregressive with exogenous input speech production model. 787-790
Speaker, Dialect, and Language Recognition 3
- Kuo-Hwei Yuo, Tai-Hwei Hwang, Hsiao-Chuan Wang:
Combination of temporal trajectory filtering and projection measure for robust speaker identification. 791-794 - Yunxin Zhao, Xiao Zhang, Xiaodong He, Laura Schopp:
A combined adaptive and decision tree based speech separation technique for telemedicine applications. 795-798 - Olivier Bellot, Driss Matrouf, Téva Merlin, Jean-François Bonastre:
Additive and convolutional noises compensation for speaker recognition. 799-802 - Frédéric Beaugendre, Tom Claes, Hugo Van hamme:
Dialect adaptation for Mandarin Chinese speech recognition. 803-806 - Klaus R. Scherer, Tom Johnstone, Gudrun Klasmeyer, Thomas Bänziger:
Can automatic speaker verification be improved by training the algorithms on emotional speech? 807-810 - Zhong-Hua Wang, Cheng Wu, David M. Lubensky:
New distance measures for text-independent speaker identification. 811-814
Miscellaneous Topics 2 [M, J]
- Fengguang Zhao, Prabhu Raghavan, Sunil K. Gupta, Ziyi Lu, Wentao Gu:
Automatic speech recognition in Mandarin for embedded platforms. 815-818 - Husheng Li, Jia Liu, Runsheng Liu:
Confidence measure based unsupervised speaker adaptation. 819-822 - Javier Macías Guarasa, Javier Ferreiros, José Colás, Ascensión Gallardo-Antolín, José Manuel Pardo:
Improved variable preselection list length estimation using NNs in a large vocabulary telephone speech recognition system. 823-826 - Ascensión Gallardo-Antolín, Javier Ferreiros, Javier Macías Guarasa, Ricardo de Córdoba, José Manuel Pardo:
Incorporating multiple-HMM acoustic modeling in a modular large vocabulary speech recognition system in telephone environment. 827-830 - Janne Suontausta, Juha Häkkinen:
Decision tree based text-to-phoneme mapping for speech recognition. 831-834 - Jeff Meunier:
Reduced traceback matrix storage for small footprint model alignment. 835-838 - Claudio Vair, Luciano Fissore, Pietro Laface:
Dynamic adaptation of vocabulary independent HMMs to an application environment. 839-842 - Roberto Gemello, Loreta Moisa, Pietro Laface:
Synergy of spectral and perceptual features in multi-source connectionist speech recognition. 843-846 - Ramalingam Hariharan, Olli Viikki:
High performance connected digit recognition through gender-dependent acoustic modelling and vocal tract length normalisation. 847-850 - Ellen Eide, Benoît Maison, Dimitri Kanevsky, Peder A. Olsen, Scott Saobing Chen, Lidia Mangu, Mark J. F. Gales, Miroslav Novak, Ramesh A. Gopinath:
Transcription of broadcast news with a time constraint: IBM's 10xRT HUB4 system. 851-854 - Geoffrey Zweig, Mukund Padmanabhan:
Exact alpha-beta computation in logarithmic space with application to MAP word graph construction. 855-858 - Kazumasa Yamamoto, Seiichi Nakagawa:
Relationship among speaking style, inter-phoneme's distance and speech recognition performance. 859-862 - Rubén San Segundo, José Colás, Javier Ferreiros, Javier Macías Guarasa, Juan Miguel Pardo:
Spanish recogniser of continuously spelled names over the telephone. 863-866 - Frank Seide, Nick J.-C. Wang:
Two-stream modeling of Mandarin tones. 867-870 - Seyyed Ali Seyyed Salehi:
A neural network speech recognizer based on both acoustic steady portions and transitions. 871-874 - Marc Hofmann, Manfred K. Lang:
Belief networks for a syntactic and semantic analysis of spoken utterances for speech understanding. 875-878 - Jiping Sun, Roberto Togneri, Li Deng:
A robust speech understanding system using conceptual relational grammar. 879-882 - Wai H. Lau, Tan Lee, Yiu Wing Wong, P. C. Ching:
Incorporating tone information into Cantonese large-vocabulary continuous speech recognition. 883-886 - Janez Kaiser, Bogomir Horvat, Zdravko Kacic:
A novel loss function for the overall risk criterion based discriminative training of HMM models. 887-890 - Mirjam Sepesy Maucec, Zdravko Kacic, Bogomir Horvat:
Looking for topic similarities of highly inflected languages for language model adaptation. 891-894 - David Janiszek, Frédéric Béchet, Renato de Mori:
Integrating MAP and linear transformation for language model adaptation. 895-898 - Beng Tiong Tan, Yong Gu, Trevor Thomas:
Utterance verification based speech recognition system. 899-902 - Rathinavelu Chengalvarayan:
Use of linear extrapolation based linear predictive cepstral features (LE-LPCC) for Tamil speech recognition. 903-906 - Yoshinori Atake, Toshio Irino, Hideki Kawahara, Jinlin Lu, Satoshi Nakamura, Kiyohiro Shikano:
Robust fundamental frequency estimation using instantaneous frequencies of harmonic components. 907-910 - Amparo Varona, Inés Torres, Miren Karmele López de Ipiña, Luis Javier Rodríguez:
Integrating different acoustic and syntactic language models in a continuous speech recognition system. 911-914 - Holger Schwenk, Jean-Luc Gauvain:
Combining multiple speech recognizers using voting and language model information. 915-918 - Keisuke Watanabe, Yasushi Ishikawa:
Dialogue management based on inferred behavioral goal - improving the accuracy of understanding by dialogue context -. 919-922 - Ralf Schlüter, Frank Wessel, Hermann Ney:
Speech recognition using context conditional word posterior probabilities. 923-926 - Hugo Meinedo, João Paulo Neto:
The use of syllable segmentation information in continuous speech recognition hybrid systems applied to the Portuguese language. 927-930 - Hugo Meinedo, João Paulo Neto:
Combination of acoustic models in continuous speech recognition hybrid systems. 931-934 - David A. van Leeuwen, Sander J. van Wijngaarden:
Automatic speech recognition of non-native speakers using consonant-vowel-consonant (CVC) words. 935-938 - Gang Zhao, Hong Xu:
Understanding Chinese in spoken dialogue systems. 939-942 - Frédéric Berthommier, Hervé Glotin, Emmanuel Tessier:
A front-end using the harmonicity cue for speech enhancement in loud noise. 943-946 - Qiru Zhou, Sergey Kosenko:
Lucent automatic speech recognition: a speech recognition engine for internet and telephony service applications. 947-950 - Todd A. Stephenson, Hervé Bourlard, Samy Bengio, Andrew C. Morris:
Automatic speech recognition using dynamic bayesian networks with both acoustic and articulatory variables. 951-954 - Subrata K. Das, David M. Lubensky:
Towards robust telephony speech recognition in office and automobile environments. 955-958 - Hiroaki Kojima, Kazuyo Tanaka:
Extracting phonological chunks based on piecewise linear segment lattices. 959-962 - Lucian Galescu, James F. Allen:
Evaluating hierarchical hybrid statistical language models. 963-966 - Jun Ogata, Yasuo Ariki:
An efficient lexical tree search for large vocabulary continuous speech recognition. 967-970 - Bin Jia, Xiaoyan Zhu, Yupin Luo, Dongcheng Hu:
Reliability evaluation of speech recognition in acoustic modeling. 971-974 - Ching X. Xu:
Using GMM for voiced/voiceless segmentation and tone decision in Mandarin continuous speech recognition. 975-978 - Chi H. Yim, Oscar C. Au, Wanggen Wan, Cyan L. Keung, Carrson C. Fung:
Auditory spectrum based features (ASBF) for robust speech recognition. 979-982 - Eric Chang, Jian-Lai Zhou, Shuo Di, Chao Huang, Kai-Fu Lee:
Large vocabulary Mandarin speech recognition with different approaches in modeling tones. 983-986 - Kallirroi Georgila, Kyriakos N. Sgarbas, Nikos Fakotakis, George Kokkinakis:
Fast very large vocabulary recognition based on compact DAWG-structured language models. 987-990 - Robert Eklund:
Crosslinguistic disfluency modeling: a comparative analysis of Swedish and Tok Pisin human-human ATIS dialogues. 991-994 - Shiro Terashima, Kazuya Takeda, Fumitada Itakura:
Vector space representation of language probabilities through SVD of n-gram matrix. 995-998 - Yoshihide Kato, Shigeki Matsubara, Katsuhiko Toyama, Yasuyoshi Inagaki:
Spoken language parsing based on incremental disambiguation. 999-1002 - Hiroshi Shimodaira, Yutaka Kato, Toshihiko Akae, Mitsuru Nakai, Shigeki Sagayama:
Jacobian adaptation of HMM with initial model selection for noisy speech recognition. 1003-1006 - Han Shu, Chuck Wooters, Owen Kimball, Thomas Colthurst, Fred Richardson, Spyros Matsoukas, Herbert Gish:
The BBN Byblos 2000 conversational Mandarin LVCSR system. 1007-1010 - Thomas Colthurst, Owen Kimball, Fred Richardson, Han Shu, Chuck Wooters, Rukmini Iyer, Herbert Gish:
The 2000 BBN Byblos LVCSR system. 1011-1014 - Langzhou Chen, Lori Lamel, Gilles Adda, Jean-Luc Gauvain:
Broadcast news transcription in Mandarin. 1015-1018 - Yang Li, Tong Zhang, Stephen E. Levinson:
Word concept model: a knowledge representation for dialogue agents. 1019-1022 - Chiyomi Miyajima, Keiichi Tokuda, Tadashi Kitamura:
Audio-visual speech recognition using MCE-based HMMs and model-dependent stream weights. 1023-1026 - Hiroaki Nanjo, Akinobu Lee, Tatsuya Kawahara:
Automatic diagnosis of recognition errors in large vocabulary continuous speech recognition systems. 1027-1030 - Yuang-Chin Chiang, Zhi-Siang Yang, Ren-Yuan Lyu:
Taiwanese corpus collection via continuous speech recognition tool. 1031-1034 - Baosheng Yuan, Qingwei Zhao, Qing Guo, Xiangdong Zhang, Zhiwei Lin:
Optimal maximum likelihood on phonetic decision tree acoustic model for LVCSR. 1035-1038 - Konstantin P. Markov, Satoshi Nakamura:
Frame level likelihood transformations for ASR and utterance verification. 1038-1041 - Timothy J. Hazen, Theresa Burianek, Joseph Polifroni, Stephanie Seneff:
Integrating recognition confidence scoring with language understanding and dialogue modeling. 1042-1045 - Yibiao Yu, Heming Zhao:
Speech recognition based on estimation of mutual information. 1046-1049 - Qing Guo, Yonghong Yan, Zhiwei Lin, Baosheng Yuan, Qingwei Zhao, Jian Liu:
Keyword spotting in auto-attendant system. 1050-1052 - Weimin Ren, Chengfa Wang, Wen Gao, Jinpei Xu:
A new approach for modeling OOV words. 1053-1056 - Rachida El Méliani, Douglas D. O'Shaughnessy:
Speech recognition using error spotting. 1057-1060 - Chung-Ho Yang, Ming-Shiun Hsieh:
Robust endpoint detection for in-car speech recognition. 1061-1064 - Jouji Miwa, Masaru Kumagai:
Internet speech analysis system using e-mail and web technology. 1065-1068 - Marco Loog, Reinhold Haeb-Umbach:
Multi-class linear dimension reduction by generalized Fisher criteria. 1069-1072 - Wendy J. Holmes:
Improving the representation of time structure in front-ends for automatic speech recognition. 1073-1076 - Katrin Kirchhoff:
Speech analysis by rule extraction from trained artificial neural networks. 1077-1080 - Jaishree Venugopal, Stephen A. Zahorian, Montri Karnjanadecha:
Minimum mean square error spectral peak envelope estimation for automatic vowel classification. 1081-1084 - Cyan L. Keung, Oscar C. Au, Chi H. Yim, Carrson C. Fung:
Probabilistic compensation of unreliable feature components for robust speech recognition. 1085-1087 - Congxiu Wang, Qihu Li, Guoying Zhao, Li Yin, Shuai Hao, Da Meng:
A new tone conversion method for Mandarin by an adaptive linear prediction analysis. 1088-1091
Volume 3
Trans-Modal and Multi-Modal Human-Computer Interaction (Special Session)
- Sharon L. Oviatt:
Multimodal interface research: a science without borders. 1-6 - Kevin G. Munhall, Christian Kroos, Takaaki Kuratate, J. Lucero, Michel Pitermann, Eric Vatikiotis-Bateson, Hani Yehia:
Studies of audiovisual speech perception using production-based animation. 7-10 - Chalapathy Neti, Giridharan Iyengar, Gerasimos Potamianos, Andrew W. Senior, Benoît Maison:
Perceptual interfaces for information interaction: joint processing of audio and visual information for human-computer interaction. 11-14 - Wen Gao, Jiyong Ma, Rui Wang, Hongxun Yao:
Towards robust lipreading. 15-19 - Satoshi Nakamura, Hidetoshi Ito, Kiyohiro Shikano:
Stream weight optimization of speech and lip image sequence for audio-visual speech recognition. 20-24 - Shinji Sako, Keiichi Tokuda, Takashi Masuko, Takao Kobayashi, Tadashi Kitamura:
HMM-based text-to-audio-visual speech synthesis. 25-28 - Jill A. Hewitt, Andi Bateman, Andrew Lambourne, Aladdin M. Ariyaeeinia, P. Sivakumaran:
Real-time speech-generated subtitles: problems and solutions. 29-32 - Xuedong Huang, Alex Acero, Ciprian Chelba, Li Deng, Doug Duchene, Joshua Goodman, Hsiao-Wuen Hon, Derek Jacoby, Li Jiang, Ricky Loynd, Milind Mahajan, Peter Mau, Scott Meredith, Salman Mughal, Salvado Neto, Mike Plumpe, Kuansan Wang, Ye-Yi Wang:
Mipad: a next generation PDA prototype. 33-36 - Fei Huang, Jie Yang, Alex Waibel:
Dialogue management for multimodal user registration. 37-40 - Lynne E. Bernstein:
Segmental optical phonetics for human and machine speech processing. 43-46 - Umavasee Thathong, Somchai Jitapunkul, Visarut Ahkuputra, Ekkarit Maneenoi, Boonchai Thampanitchawong:
Classification of Thai consonant naming using Thai tone. 47-50
Signal Analysis, Processing, and Feature Extraction 1, 2
- Qi Li, Frank K. Soong, Olivier Siohan:
A high-performance auditory feature for robust speech recognition. 51-54 - Kun Xia, Carol Y. Espy-Wilson:
A new strategy of formant tracking based on dynamic programming. 55-58 - Xugang Lu, Gang Li, Lipo Wang:
Dominant subspace analysis for auditory spectrum. 59-62 - Ilyas Potamitis, Nikos Fakotakis, George Kokkinakis:
Spectral and cepstral projection bases constructed by independent component analysis. 63-66 - Sacha Krstulovic:
Relating LPC modeling to a factor-based articulatory model. 67-70 - Michael L. Shire, Barry Y. Chen:
On data-derived temporal processing in speech feature extraction. 71-74 - George Saon, Mukund Padmanabhan:
Minimum Bayes error feature selection. 75-78 - Daniel P. W. Ellis, Jeff A. Bilmes:
Using mutual information to design feature combinations. 79-82 - Seungjin Choi, Heonseok Hong, Hervé Glotin, Frédéric Berthommier:
Multichannel signal separation for cocktail party speech recognition: a dynamic recurrent network. 83-86 - V. Kamakshi Prasad, Hema A. Murthy:
An automatic algorithm for segmenting and labelling a connected digit sequence. 87-90 - Hui Yan, Xuegong Zhang, Yanda Li, Liqin Shen, Weibin Zhu:
The signal reconstruction of speech by KPCA. 91-93 - Hiroshi Saruwatari, Satoshi Kurita, Kazuya Takeda, Fumitada Itakura, Kiyohiro Shikano:
Blind source separation based on subband ICA and beamforming. 94-97 - Claudio Estienne, Patricia A. Pelle:
A synchrony front-end using phase-locked-loop techniques. 98-101 - Javier Hernando:
On the use of filter-bank energies driven from the autocorrelation sequence for noisy speech recognition. 102-105
Language Modeling
- Rens Bod:
Combining semantic and syntactic structure for language modeling. 106-109 - Joshua Goodman, Jianfeng Gao:
Language model size reduction by pruning and clustering. 110-113 - Jun Wu, Sanjeev Khudanpur:
Efficient training methods for maximum entropy language modeling. 114-118 - Sabine Deligne:
Statistical language modeling with a class based n-multigram model. 119-122 - Koichi Tanigaki, Hirofumi Yamamoto, Yoshinori Sagisaka:
A hierarchical language model incorporating class-dependent word models for OOV words recognition. 123-126 - Fang Zheng, Jian Wu, Wenhu Wu:
Input Chinese sentences using digits. 127-130
Acoustic Modeling
- Matthew Richardson, Jeff A. Bilmes, Chris Diorio:
Hidden-articulator Markov models: performance improvements and robustness to noise. 131-134 - Eric D. Sandness, I. Lee Hetherington:
Keyword-based discriminative training of acoustic models. 135-138 - Vaibhava Goel, Shankar Kumar, William Byrne:
Segmental minimum Bayes-risk ASR voting strategies. 139-142 - Harriet J. Nock, Steve J. Young:
Loosely coupled HMMs for ASR. 143-146 - Katrin Weber, Samy Bengio, Hervé Bourlard:
HMM2 - a novel approach to HMM emission probability estimation. 147-150 - Rita Singh, Bhiksha Raj, Richard M. Stern:
Structured redefinition of sound units by merging and splitting for improved speech recognition. 151-154 - Vincent Arsigny, Gérard Chollet, Guillaume Gravier, Marc Sigelle:
Speech modeling with state constrained Markov fields over frequency bands. 155-158
Prosody (Poster)
- Weibin Zhu, Liqin Shen, Xiaochuan Miu:
Duration modeling for Chinese synthesis from C-toBI labeled corpus. 159-162 - Bei Wang, Bo Zheng, Shinan Lu, Jianfen Cao, Yufang Yang:
The pitch movement of word stress in Chinese. 163-166 - Michiko Watanabe, Carlos Toshinori Ishi:
The distribution of fillers in lectures in the Japanese language. 167-170 - Huhe Harnud, Yuling Zheng, Jiayou Chen:
Research on stress in bisyllabic words of Mongolian. 171-174 - Kazunori Imoto, Masatake Dantsuji, Tatsuya Kawahara:
Modelling of the perception of English sentence stress for computer-assisted language learning. 175-178 - Jeska Buhmann, Halewijn Vereecken, Justin Fackrell, Jean-Pierre Martens, Bert Van Coile:
Data driven intonation modelling of 6 languages. 179-182 - Laurent Blin, Mike Edgington:
Prosody prediction using a tree-structure similarity metric. 183-186 - Carlos Teixeira, Horacio Franco, Elizabeth Shriberg, Kristin Precoda, M. Kemal Sönmez:
Prosodic features for automatic text-independent evaluation of degree of nativeness for language learners. 187-190 - Nobuaki Minematsu, Seiichi Nakagawa:
Instantaneous estimation of prosodic pronunciation habits for Japanese students to learn English pronunciation. 191-194 - Jinfu Ni, Keikichi Hirose:
Synthesis of fundamental frequency contours of standard Chinese sentences from tone sandhi and focus conditions. 195-198 - Yiqing Zu, Xiaoxia Chan, Aijun Li, Wu Hua, Guohua Sun:
Syllable duration and its functions in standard Chinese discourse. 199-202 - Bleicke Holm, Gérard Bailly:
Generating prosody by superposing multi-parametric overlapping contours. 203-206 - Raymond N. J. Veldhuis:
Consistent pitch marking. 207-210 - Sun-Ah Jun, Sook-Hyang Lee, Keeho Kim, Yong-Ju Lee:
Labeler agreement in transcribing Korean intonation with K-ToBI. 211-214 - Yukiyoshi Hirose, Kazuhiko Ozeki, Kazuyuki Takagi:
Effectiveness of prosodic features in syntactic analysis of read Japanese sentences. 215-218 - Mieko Banno:
A study of F0 declination in Japanese: towards a discourse model of prosodic structure. 219-222 - Atsuhiro Sakurai, Nobuaki Minematsu, Keikichi Hirose:
Data-driven intonation modeling using a neural network and a command response model. 223-226 - Çaglayan Erdem, Martin Holzapfel, Rüdiger Hoffmann:
Natural F0 contours with a new neural-network-hybrid approach. 227-230 - Justin Fackrell, Halewijn Vereecken, Jeska Buhmann, Jean-Pierre Martens, Bert Van Coile:
Prosodic variation with text type. 231-234 - Ann K. Syrdal, Julia Tevis McGory:
Inter-transcriber reliability of ToBI prosodic labeling. 235-238 - Greg Kochanski, Chilin Shih:
Stem-ML: language-independent prosody description. 239-242 - Minghui Dong, Kim-Teng Lua:
Using prosody database in Chinese speech synthesis. 243-246 - Donna Erickson, Kikuo Maekawa, Michiko Hashi, Jianwu Dang:
Some articulatory and acoustic changes associated with emphasis in spoken English. 247-250 - Esther Janse, Anke Sennema, Anneke W. Slis:
Fast speech timing in Dutch: durational correlates of lexical stress and pitch accent. 251-254 - Makoto Hiroshige, Kantaro Suzuki, Kenji Araki, Koji Tochinai:
On perception of word-based local speech rate in Japanese without focusing attention. 255-258 - Atsuhiro Sakurai, Koji Iwano, Keikichi Hirose:
Modeling and generation of accentual phrase F0 contours based on discrete HMMs synchronized at mora-unit transitions. 259-262 - Philippa H. Louw, Justus C. Roux, Elizabeth C. Botha:
Synthesizing prosody for commands in a Xhosa TTS system. 263-266
Generation and Synthesis of Spoken Language (Poster)
- Costas Christogiannis, Yiannis Stavroulas, Yiannis Vamvakoulas, Theodora A. Varvarigou, Agatha Zappa, Chilin Shih, Amalia Arvaniti:
Design and implementation of a Greek text-to-speech system based on concatenative synthesis. 267-270 - Lauren Baptist, Stephanie Seneff:
GENESIS-II: a versatile system for language generation in conversational system applications. 271-274 - Eun-Kyoung Kim, Yung-Hwan Oh:
New analysis method for harmonic plus noise model based on time-domain periodicity score. 275-278 - Tomoki Toda, Jinlin Lu, Hiroshi Saruwatari, Kiyohiro Shikano:
Straight-based voice conversion algorithm based on Gaussian mixture model. 279-282
- Marion Libossek, Florian Schiel:
Syllable-based text-to-phoneme conversion for German. 283-286
- Horst-Udo Hain:
A hybrid approach for grapheme-to-phoneme conversion based on a combination of partial string matching and a neural network. 291-294 - Hans G. Tillmann, Hartmut R. Pfitzinger:
Parametric high definition (PHD) speech synthesis-by-analysis: the development of a fundamentally new system creating connected speech by modifying lexically-represented language units. 295-297 - Chul Hong Kwon, Minkyu Lee, Joseph P. Olive:
A new synthesis algorithm using phase information for TTS systems. 298-301 - Johan Wouters, Michael W. Macon:
Unit fusion for concatenative speech synthesis. 302-305 - Kevin A. Lenzo, Alan W. Black:
Diphone collection and synthesis. 306-309 - Thomas Portele:
Natural language generation for spoken dialogue. 310-313 - Alistair Conkie, Mark C. Beutnagel, Ann K. Syrdal, Philip E. Brown:
Preselection of candidate units in a unit selection-based text-to-speech synthesis system. 314-317 - Kåre Jean Jensen, Søren Riis:
Self-organizing letter code-book for text-to-phoneme neural network model. 318-321 - Jon R. W. Yi, James R. Glass, I. Lee Hetherington:
A flexible, scalable finite-state transducer architecture for corpus-based concatenative speech synthesis. 322-325 - Changfu Wang, Hiroya Fujisaki, Ryou Tomana, Sumio Ohno:
Analysis of fundamental frequency contours of standard Chinese in terms of the command-response model and its application to synthesis by rule of intonation. 326-329 - Toshio Hirai, Seiichi Tenpaku, Kiyohiro Shikano:
Manipulating speech pitch periods according to optimal insertion/deletion position in residual signal for intonation control in speech synthesis. 330-333 - Pradit Mittrapiyanuruk, Chatchawarn Hansakunbuntheung, Virongrong Tesprasit, Virach Sornlertlamvanich:
Improving naturalness of Thai text-to-speech synthesis by prosodic rule. 334-337 - Dawei Xu, Hiroki Mori, Hideki Kasuya:
Word-level F0 range in Mandarin Chinese and its application to inserting words into a sentence. 338-341 - Mitsuaki Isogai, Kimihito Tanaka, Satoshi Takano, Hideyuki Mizuno, Masanobu Abe, Shin'ya Nakajima:
A new Japanese TTS system based on speech-prosody database and speech modification. 342-345 - Rubén San Segundo, Juan Manuel Montero, Ricardo de Córdoba, Juana M. Gutiérrez-Arriola:
Stress assignment in Spanish proper names. 346-349 - Zhengyu Niu, Peiqi Chai:
Segmentation of prosodic phrases for improving the naturalness of synthesized Mandarin Chinese speech. 350-353 - Xiaohu Liu, Douglas D. O'Shaughnessy:
Practical language modeling: an interpolating method. 354-357 - Gongjun Li, Na Dong, Toshiro Ishikawa:
Combination of different n-grams based on their different assumptions. 358-361 - Nobuo Kawaguchi, Shigeki Matsubara, Hiroyuki Iwa, Shoji Kajita, Kazuya Takeda, Fumitada Itakura, Yasuyoshi Inagaki:
Construction of speech corpus in moving car environment. 362-365 - Yue-Shi Lee, Hsin-Hsi Chen:
Parsing spoken dialogues. 366-369 - Børge Lindberg, Finn Tore Johansen, Narada D. Warakagoda, Gunnar Lehtinen, Zdravko Kacic, Andrej Zgank, Kjell Elenius, Giampiero Salvi:
A Noise Robust Multilingual Reference Recogniser Based on Speechdat(II). 370-373 - Muhua Lv, Lianhong Cai:
The design and application of a speech database for Chinese TTS system. 378-381 - Rathinavelu Chengalvarayan:
Use of multiple classifiers for speech recognition in wireless CDMA network environments. 382-385 - Alexander Franz, Keiko Horiguchi, Lei Duan:
An imperative programming language for spoken language translation. 386-389 - Yumi Wakita, Kenji Matsui, Yoshinori Sagisaka:
Fine keyword clustering using a thesaurus and example sentences for speech translation. 390-393 - Junlan Feng, Xianfang Wang, Limin Du:
Data collection and processing in a Chinese spontaneous speech corpus IIS_CSS. 394-397 - Yasuyuki Aizawa, Shigeki Matsubara, Nobuo Kawaguchi, Katsuhiko Toyama, Yasuyoshi Inagaki:
Spoken language corpus for machine interpretation research. 398-401
Rules and Corpora (Special Session)
- Jan P. H. van Santen, Michael W. Macon, Andrew Cronk, John-Paul Hosom, Alexander Kain, Vincent Pagel, Johan Wouters:
When will synthetic speech sound human: role of rules and data. 402-409 - Ann K. Syrdal, Colin W. Wightman, Alistair Conkie, Yannis Stylianou, Marc C. Beutnagel, Juergen Schroeter, Volker Strom, Ki-Seung Lee, Matthew J. Makashay:
Corpus-based techniques in the AT&T NextGen synthesis system. 410-415 - Nick Campbell:
Limitations to concatenative speech synthesis. 416-419 - Hisashi Kawai, Seiichi Yamamoto, Norio Higuchi, Tohru Shimizu:
A design method of speech corpus for text-to-speech synthesis taking account of prosody. 420-425 - Richard Sproat:
Corpus-based methods and hand-built methods. 426-428 - Michael A. Picheny:
Heredity and environment in speech recognition: the role of a priori information vs. data. 429-433 - Haruo Kubozono:
A constraint-based analysis of compound accent in Japanese. 438-441 - Naoto Iwahashi:
Language acquisition through a human-robot interface. 442-447 - Yoshinori Sagisaka, Hirofumi Yamamoto, Minoru Tsuzaki, Hiroaki Kato:
Rules, but what for? - rule description as efficient and robust abstraction of corpora and optimal fitting to applications -. 448-451
Perception and Comprehension of Spoken Language 1, 2
- Veronika Makarova:
Cross-linguistic aspects of intonation perception. 452-453 - Haruo Kubozono, Shosuke Haraguchi:
Visual information and the perception of prosody. 454-457 - Masato Akagi, Hironori Kitakaze:
Perception of synthesized singing voices with fine fluctuations in their fundamental frequency contours. 458-461 - Kalle J. Palomäki, Paavo Alku, Ville Mäkinen, Patrick J. C. May, Hannu Tiitinen:
Neuromagnetic study on localization of speech sounds. 462-465 - Yukiyoshi Hirose, Kazuhiko Kakehi:
Perception of identical vowel sequences in Japanese conversational speech. 466-469 - Santiago Fernández, Sergio Feijóo:
Acoustic cues to perception of vowel quality. 470-473 - Esther Klabbers, Raymond N. J. Veldhuis, Kim Koppen:
A solution to the reduction of concatenation artefacts in speech synthesis. 474-477 - Jhing-Fa Wang, Hsien-Chang Wang, Kin-Nan Lee, Chieh-Yi Huang:
Domain-unconstrained language understanding based on CKIP-auto tag, how-net, and ART. 478-481 - Chris Powell, Mary Zajicek, David Duce:
The generation of representations of word meanings from dictionaries. 482-485 - Po-Chui Luk, Helen M. Meng, Filung Wang:
Grammar partitioning and parser composition for natural language understanding. 486-489 - Jennifer Lai, Omer Tsimhoni, Paul A. Green:
Comprehension of synthesized speech while driving and in the lab. 490-493 - Michael D. Tyler, Denis K. Burnham:
Orthographic influences on initial phoneme addition and deletion tasks: the effect of lexical status. 494-497 - Parham Zolfaghari, Yoshinori Atake, Kiyohiro Shikano, Hideki Kawahara:
Investigation of analysis and synthesis parameters of straight by subjective evaluation. 498-501
Spoken Language Processing
- Andrew N. Pargellis, Alexandros Potamianos:
Cross-domain classification using generalized domain acts. 502-505 - Ganesh N. Ramaswamy, Jan Kleindienst:
Hierarchical feature-based translation for scalable natural language understanding. 506-509 - Alexandros Potamianos, Hong-Kwang Jeff Kuo:
Statistical recursive finite state machine parsing for speech understanding. 510-513 - Chaojun Liu, Yonghong Yan:
Speaker change detection using minimum message length criterion. 514-517 - Sadaoki Furui, Kikuo Maekawa, Hitoshi Isahara, Takahiro Shinozaki, Takashi Ohdaira:
Toward the realization of spontaneous speech recognition - introduction of a Japanese priority program and preliminary results -. 518-521 - Toshiyuki Takezawa, Fumiaki Sugaya, Masaki Naito, Seiichi Yamamoto:
A comparative study on acoustic and linguistic characteristics using speech from human-to-human and human-to-machine conversations. 522-525 - Néstor Becerra Yoma:
Speaker dependent temporal constraints combined with speaker independent HMM for speech recognition in noise. 526-529
Acoustic Features for Robust Speech Recognition
- Yoshihiro Ito, Hiroshi Matsumoto, Kazumasa Yamamoto:
Forward masking on a generalized logarithmic scale for robust speech recognition. 530-533 - Heidi Christensen, Børge Lindberg, Ove Andersen:
Noise robustness of heterogeneous features employing minimum classification error feature space transformations. 534-537 - Michael L. Seltzer, Bhiksha Raj, Richard M. Stern:
Classifier-based mask estimation for missing feature methods of robust speech recognition. 538-541 - Kris Hermus, Werner Verhelst, Patrick Wambacq:
Optimized subspace weighting for robust speech recognition in additive noise environments. 542-545 - Ji Ming, Peter Jancovic, Philip Hanna, Darryl Stewart, Francis Jack Smith:
Robust feature selection using probabilistic union models. 546-549 - Ramalingam Hariharan, Imre Kiss, Olli Viikki, Jilei Tian:
Multi-resolution front-end for noise robust speech recognition. 550-553 - Douglas D. O'Shaughnessy, Marcel Gabrea:
Recognition of digit strings in noisy speech with limited resources. 554-557
Prosody, Acquisition, and Learning
- Keiichi Tajima, Donna Erickson, Kyoko Nagao:
Factors affecting native Japanese speakers' production of intrusive (epenthetic) vowels in English words. 558-561 - Imed Zitouni, Kamel Smaïli, Jean Paul Haton:
Beyond the conventional statistical language models: the variable-length sequences approach. 562-565 - Yasushi Tsubota, Masatake Dantsuji, Tatsuya Kawahara:
Computer-assisted English vowel learning system for Japanese speakers using cross language formant structures. 566-569 - Trym Holter, Erik Harborg, Magne Hallstein Johnsen, Torbjørn Svendsen:
ASR-based subtitling of live TV-programs for the hearing impaired. 570-573 - Chung-Hsien Wu, Yu-Hsien Chiu, Chi-Shiang Guo:
Natural language processing for Taiwanese sign language to speech conversion. 574-577 - Jouji Miwa, Hiroshi Sasaki, Kazunori Tanno:
Japanese spoken language learning system using Java information technology. 578-581 - Helmer Strik, Catia Cucchiarini, Diana Binnenpoorte:
L2 pronunciation quality in read and spontaneous speech. 582-585 - Tomoko Kitamura, Keisuke Kinoshita, Takayuki Arai, Akiko Kusumoto, Yuji Murahara:
Designing modulation filters for improving speech intelligibility in reverberant environments. 586-589 - Lei Zhang, Jiqing Han, Chengguo Lv, Chengfa Wang:
An environment model-based robust speech recognition. 590-593 - Jaco Vermaak, Christophe Andrieu, Arnaud Doucet:
Particle filtering for non-stationary speech modelling and enhancement. 594-597 - Martin Graciarena:
Maximum likelihood noise HMM estimation in model-based robust speech recognition. 598-601 - Qingsheng Zeng, Douglas D. O'Shaughnessy:
Microphone array within a handset or face mask for speech enhancement. 602-605 - Chengfa Wang, Qiusheng Wang:
Embedding visually recognizable watermarks into digital audio signals. 606-609 - Mamoru Iwaki:
Auditory perception of amplitude modulated sinusoid using a pure tone and band-limited noises as modulation signals. 610-613 - Masoud Geravanchizadeh:
Spectral voice conversion based on unsupervised clustering of acoustic space. 614-617 - Hartmut R. Pfitzinger:
Removing hum from spoken language resources. 618-621 - Ingunn Amdal, Filipp Korkmazskiy, Arun C. Surendran:
Joint pronunciation modelling of non-native speakers using data-driven methods. 622-625 - Linda Bell, Robert Eklund, Joakim Gustafson:
A comparison of disfluency distribution in a unimodal and a multimodal speech interface. 626-629 - Yi Liu, Pascale Fung:
Modelling pronunciation variations in spontaneous Mandarin speech. 630-633 - Tadashi Suzuki, Jun Ishii, Kunio Nakajima:
A method of generating English pronunciation dictionary for Japanese English recognition systems. 634-637 - Hélène Bonneau-Maynard, Laurence Devillers:
A framework for evaluating contextual understanding. 638-641 - Yonggang Deng, Taiyi Huang, Bo Xu:
Towards high performance continuous Mandarin digit string recognition. 642-645 - Matthew P. Aylett:
Stochastic suprasegmentals: relationships between redundancy, prosodic structure and care of articulation in spontaneous speech. 646-649 - Masaharu Sakamoto, Takashi Saitoh:
An automatic pitch-marking method using wavelet transform. 650-653 - Keiichi Takamaru, Makoto Hiroshige, Kenji Araki, Koji Tochinai:
A proposal of a model to extract Japanese voluntary speech rate control. 654-657 - Veronika Makarova:
Acoustic characteristics of surprise in Russian questions. 658-661 - Yonggang Deng, Yang Cao, Bo Xu:
Neural network based integration of multiple confidence measures for OOV detection. 662-665 - Yi Xu, Xuejing Sun:
How fast can we really change pitch? maximum speed of pitch change revisited. 666-669 - Esther Klabbers, Jan P. H. van Santen:
Predicting segmental durations for Dutch using the sums-of-products approach. 670-673 - Yang Cao, Taiyi Huang, Bo Xu, Chengrong Li:
A stochastic polynomial tone model for continuous Mandarin speech. 674-677 - Marcel Gabrea, Douglas D. O'Shaughnessy:
Detection of filled pauses in spontaneous conversational speech. 678-681 - Bertil Lyberg, Sonia Sangari:
Some observations on different strategies for the timing of fundamental frequency events. 682-685 - Zhiyong Wu, Lianhong Cai, Tongchun Zhou:
Research on dynamic characters of Chinese pitch contours. 686-689
Adaptation and Acquisition in Spoken Language Processing (Poster)
- Bing Zhao, Bo Xu:
Incorporating HMM-state sequence confusion for rapid MLLR adaptation to new speakers. 690-693 - Zhipeng Zhang, Sadaoki Furui:
An online incremental speaker adaptation method using speaker-clustered initial models. 694-697 - Guoqiang Li, Limin Du, Ziqiang Hou:
Prior parameter transformation for unsupervised speaker adaptation. 698-701 - Ruhi Sarikaya, John H. L. Hansen:
Improved Jacobian adaptation for fast acoustic model adaptation in noisy speech recognition. 702-705 - Keiko Fujita, Yoshio Ono, Yoshihisa Nakatoh:
A study of vocal tract length normalization with generation-dependent acoustic models. 706-709 - Shaojun Wang, Yunxin Zhao:
Optimal on-line Bayesian model selection for speaker adaptation. 710-713 - Bowen Zhou, John H. L. Hansen:
Unsupervised audio stream segmentation and clustering via the Bayesian information criterion. 714-717 - Satoru Tsuge, Toshiaki Fukada, Kenji Kita:
Frame-period adaptation for speaking rate robust speech recognition. 718-721 - Christoph Nieuwoudt, Elizabeth C. Botha:
Cross-language use of acoustic information for automatic speech recognition. 722-725 - Shoei Sato, Toru Imai, Hideki Tanaka, Akio Ando:
Selective training of HMMs by using two-stage clustering. 726-729 - Ángel de la Torre, Dominique Fohr, Jean Paul Haton:
Compensation of noise effects for robust speech recognition in car environments. 730-733 - Dong Kook Kim, Nam Soo Kim:
Bayesian speaker adaptation based on probabilistic principal component analysis. 734-737 - Wai Kat Liu, Pascale Fung:
MLLR-based accent model adaptation without accented data. 738-741 - Kuan-Ting Chen, Wen-Wei Liau, Hsin-Min Wang, Lin-Shan Lee:
Fast speaker adaptation using eigenspace-based maximum likelihood linear regression. 742-745 - Gerasimos Potamianos, Chalapathy Neti:
Stream confidence estimation for audio-visual speech recognition. 746-749 - Masahiko Komatsu, Won Tokuma, Shinichi Tokuma, Takayuki Arai:
The effect of reduced spectral information on Japanese consonant perception: comparison between L1 and L2 listeners. 750-753 - Valter Ciocca, Rani Aisha, Alexander L. Francis, Lena Wong:
Can Cantonese children with cochlear implants perceive lexical tones? 754-757 - Michael C. W. Yip:
Recognition of spoken words in the continuous speech: effects of transitional probability. 758-761 - Ariel Salomon, Carol Y. Espy-Wilson:
Detection of speech landmarks using temporal cues. 762-765 - Takashi Otake, Anne Cutler:
A set of Japanese word cohorts rated for relative familiarity. 766-769 - Kimiko Yamakawa, Hiromitsu Miyazono, Ryoji Baba:
The phonetic value of the devocalized vowel in Japanese - in case of velar plosive. 770-773 - James M. McQueen, Anne Cutler, Dennis Norris:
Positive and negative influences of the lexicon on phonemic decision-making. 778-781 - Andrea Weber:
Phonotactic and acoustic cues for word segmentation in English. 782-785 - Esther Janse:
Intelligibility of time-compressed speech: three ways of time-compression. 786-789 - Hartmut Traunmüller:
Evidence for demodulation in speech perception. 790-793
Large Vocabulary Continuous Speech Recognition
- Jean-Luc Gauvain, Lori Lamel:
Fast decoding for indexation of broadcast data. 794-797 - Sheng Gao, Bo Xu, Hong Zhang, Bing Zhao, Chengrong Li, Taiyi Huang:
Update progress of Sinohear: advanced Mandarin LVCSR system at NLPR. 798-801 - Xavier L. Aubert, Reinhard Blasig:
Combined acoustic and linguistic look-ahead for one-pass time-synchronous decoding. 802-805 - Li Deng, Alex Acero, Mike Plumpe, Xuedong Huang:
Large-vocabulary speech recognition under adverse acoustic environments. 806-809 - Volker Fischer, Siegfried Kunzmann:
Acoustic language model classes for a large vocabulary continuous speech recognizer. 810-813 - Franz Kummert, Gernot A. Fink, Gerhard Sagerer:
A hybrid speech recognizer combining HMMs and polynomial classification. 814-817 - Chao Huang, Eric Chang, Jianlai Zhou, Kai-Fu Lee:
Accent modeling based on pronunciation dictionary adaptation for large vocabulary Mandarin speech recognition. 818-821
Speech Coding and Transmission
- Jinzhong Zhang, Yingmin He, Renshu Yu:
A mixed and code excitation LPC vocoder at 1.76 kb/s. 822-825 - Minoru Kohata, Ikuya Mitsuya, Motoyuki Suzuki, Shozo Makino:
Efficient segment quantization of LSP parameters for very low bit speech coding. 826-829 - Carlos M. Ribeiro, Isabel Trancoso, Diamantino Caseiro:
Phonetic vocoder assessment. 830-833 - Hongtao Hu, Limin Du:
A new low bit rate speech coder based on intraframe waveform interpolation. 834-837 - Rathinavelu Chengalvarayan, David L. Thomson:
Discriminatively derived HMM-based announcement modeling approach for noise control avoiding the problem of false alarms. 838-841 - Juan M. Huerta, Richard M. Stern:
Instantaneous-distortion based weighted acoustic modeling for robust recognition of coded speech. 842-845
Acoustic Model Adaptation
- Nitendra Rajput, L. Venkata Subramaniam, Ashish Verma:
Adapting phonetic decision trees between languages for continuous speech recognition. 850-852 - Stephen Cox:
Speaker normalization in the MFCC domain. 853-856 - Reinhold Haeb-Umbach:
Data-driven phonetic regression class tree estimation for MLLR adaptation. 857-860 - Mohamed Afify, Olivier Siohan:
Constrained maximum likelihood linear regression for speaker adaptation. 861-864 - Woo-Yong Choi, Hyung Soon Kim:
Predictive speaker adaptation based on least squares method. 865-868 - Alex Acero, Li Deng, Trausti T. Kristjansson, Jerry Zhang:
HMM adaptation using vector Taylor series for noisy speech recognition. 869-872 - Dimitra Vergyri, Stavros Tsakalidis, William Byrne:
Minimum risk acoustic clustering for multilingual acoustic model combination. 873-876
Miscellaneous 3 [D, E, F, I, P, N, R, S, U, W, Y, Z]
- Sharon L. Oviatt:
Talking to thimble jellies: children's conversational speech with animated characters. 877-880 - Robert D. Rodman, David F. McAllister, Donald L. Bitzer, D. Chappell:
A high-resolution glottal pulse tracker. 881-884 - Paavo Alku, Jan G. Svec, Erkki Vilkman, Frantisek Sram:
Analysis of voice production in breathy, normal and pressed phonation by comparing inverse filtering and videokymography. 885-888 - Takayuki Ito, Hiroaki Gomi, Masaaki Honda:
Model of the mechanical linkage of the upper lip-jaw for the articulatory coordination. 889-892 - Masafumi Matsumura, Takuya Niikawa, Taku Torii, Hitoshi Yamasaki, Hisanaga Hara, Takashi Tachimura, Takeshi Wada:
Measurement of palatolingual contact pressure and tongue force using a force-sensor-mounted palatal plate. 893-896 - Olov Engwall:
A 3D tongue model based on MRI data. 901-904 - Jae-Hyun Bae, Heo-Jin Byeon, Yung-Hwan Oh:
Speech quality improvement in TTS system using ABS/OLA sinusoidal model. 905-908 - Marielle Bruyninckx, Bernard Harmegnies:
A study of palatal segments' production by Danish speakers. 909-912 - Bhuvana Ramabhadran, Yuqing Gao, Michael Picheny:
Dynamic selection of feature spaces for robust speech recognition. 913-916 - Santiago Fernández, Sergio Feijóo:
A probabilistic model of integration of acoustic cues in FV syllables. 917-920 - Jeff A. Bilmes, Katrin Kirchhoff:
Directed graphical models of classifier combination: application to phone recognition. 921 - Ea-Ee Jan, Jaime Botella Ordinas, George Saon, Salim Roukos:
Real-time multilingual HMM training robust to channel variations. 925-928 - Sander J. van Wijngaarden, Herman J. M. Steeneken:
The intelligibility of German and English speech to Dutch listeners. 929-932 - Bin Zhen, Xihong Wu, Zhimin Liu, Huisheng Chi:
On the use of bandpass liftering in speaker recognition. 933-936 - René Carré, Liliane Sprenger-Charolles, Souhila Messaoud-Galusi, Willy Serniclaes:
On auditory-phonetic short-term transformation. 937-940 - James J. Hant, Abeer Alwan:
Predicting the perceptual confusion of synthetic plosive consonants in noise. 941-944 - Martha A. Larson, Daniel Willett, Joachim Köhler, Gerhard Rigoll:
Compound splitting and lexical unit recombination for improved performance of a speech recognition system for German parliamentary speeches. 945-948 - Martine van Zundert, Jacques M. B. Terken:
Learning and transfer of learning for synthetic speech. 949-952 - Yang Zhang, Patricia K. Kuhl, Toshiaki Imada, Paul Iverson, John Pruitt, Makoto Kotani, Erica Stevens:
Neural plasticity revealed in perceptual training of a Japanese adult listener to learn American /l-r/ contrast: a whole-head magnetoencephalography study. 953-956 - Akiyo Joto:
The effect of consonantal context and acoustic characteristics on the discrimination between the English vowel /i/ and /e/ by Japanese learners. 957-960 - Li Zhao, Wei Lu, Ye Jiang, Zhenyang Wu:
A study on emotional feature recognition in speech. 961-964 - Juan Ignacio Godino-Llorente, Santiago Aguilera-Navarro, Pedro Gómez Vilda:
LPC, LPCC and MFCC parameterisation applied to the detection of voice impairments. 965-968 - Benjamin Ka-Yin T'sou, Tom B. Y. Lai:
A complementary approach to computer-aided transcription: synergy of statistical-based and knowledge discovery paradigms. 969-972 - Marie-José Caraty, Claude Montacié:
Teraspeech'2000: a 10,000 speakers database. 973-976 - Laila Dybkjær, Niels Ole Bernsen:
The MATE workbench - a tool in support of spoken dialogue annotation and information extraction. 977-980 - Armelle Brun, David Langlois, Kamel Smaïli, Jean Paul Haton:
Discarding impossible events from statistical language models. 981-984 - Yves Lepage, Nicolas Auclerc, Satoshi Shirai:
A tool to build a treebank for conversational Chinese. 985-988 - Roland Auckenthaler, Michael J. Carey, John Mason:
Parameter reduction in a text-independent speaker verification system. 989-992 - Yong Gu, Trevor Thomas:
Advances on HMM-based text-dependent speaker verification. 993-996 - Robert P. Stapert, John S. D. Mason, Roland Auckenthaler:
Optimisation of GMM in speaker recognition. 997-1000 - Ran D. Zilca, Yuval Bistritz:
Distance-based Gaussian mixture model for speaker recognition over the telephone. 1001-1004 - Jun-Hui Liu, Ke Chen:
Pruning abnormal data for better making a decision in speaker verification. 1005-1008 - Louis ten Bosch:
ASR, dialects, and acoustic/phonological distances. 1009-1012 - Masafumi Nishida, Yasuo Ariki:
Speaker verification by integrating dynamic and static features using subspace method. 1013-1016 - Su-Hyun Kim, Gil-Jin Jang, Yung-Hwan Oh:
Improvement of speaker recognition system by individual information weighting. 1017-1020 - Néstor Becerra Yoma, Tarciano Facco Pegoraro:
Speaker verification in noise using temporal constraints. 1021-1024 - Bogdan Sabac, Inge Gavat, Zica Valsan:
Speaker identification using discriminative features selection. 1025-1028 - Ivan Magrin-Chagnolleau, Guillaume Gravier, Mouhamadou Seck, Olivier Boëffard, Raphaël Blouet, Frédéric Bimbot:
A further investigation on speech features for speaker characterization. 1029-1032 - Jyotsana Balleda, Hema A. Murthy, T. Nagarajan:
Language identification from short segments of speech. 1033-1036 - Susanne Kronenberg, Franz Kummert:
Generation of utterances based on visual context information. 1037-1040 - Mazin G. Rahim, Roberto Pieraccini, Wieland Eckert, Esther Levin, Giuseppe Di Fabbrizio, Giuseppe Riccardi, Candace A. Kamm, Shrikanth S. Narayanan:
A spoken dialogue system for conference/workshop services. 1041-1044 - Gavin E. Churcher, Peter J. Wyard:
Developing robust, user-centred multimodal spoken language systems: the MUeSLI project. 1045-1048 - Magne Hallstein Johnsen, Torbjørn Svendsen, Tore Amble, Trym Holter, Erik Harborg:
TABOR - a norwegian spoken dialogue system for bus travel information. 1049-1052 - Yinfei Huang, Fang Zheng, Mingxing Xu, Pengju Yan, Wenhu Wu:
Language understanding component for Chinese dialogue system. 1053-1056 - Kazumi Aoyama, Izumi Hirano, Hideaki Kikuchi, Katsuhiko Shirai:
Designing a domain independent platform of spoken dialogue system. 1057-1060 - Qiru Zhou, Antoine Saad, Sherif M. Abdou:
An enhanced BLSTIP dialogue research platform. 1061-1064 - Weidong Qu, Katsuhiko Shirai:
Using machine learning method and subword unit representations for spoken document categorization. 1065-1068 - Litza A. Stark, Steve Whittaker, Julia Hirschberg:
ASR satisficing: the effects of ASR accuracy on speech retrieval. 1069-1072 - Hiromitsu Nishizaki, Seiichi Nakagawa:
A system for retrieving broadcast news speech documents using voice input keywords and similarity between words. 1073-1076 - Yu-Sheng Lai, Kuen-Lin Lee, Chung-Hsien Wu:
Intention extraction and semantic matching for internet FAQ retrieval using spoken language query. 1077-1080 - Robert J. van Vark, Jelle K. de Haan, Léon J. M. Rothkrantz:
A domain-independent model to improve spelling in a web environment. 1081-1084 - Seiichi Takao, Jun Ogata, Yasuo Ariki:
Expanded vector space model based on word space in cross media retrieval of news speech data. 1085-1088 - John H. L. Hansen, Bowen Zhou, Murat Akbacak, Ruhi Sarikaya, Bryan L. Pellom:
Audio stream phrase recognition for a national gallery of the spoken word: "one small step". 1089-1092 - Hideharu Nakajima, Yoshinori Sagisaka, Hirofumi Yamamoto:
Pronunciation variants description using recognition error modeling with phonetic derivation hypotheses. 1093-1096 - Wataru Tsukahara, Nigel Ward:
Evaluating responsiveness in spoken dialog systems. 1097-1100 - Nobuhiko Kitawaki, Futoshi Asano, Takeshi Yamada:
Characteristics of spoken language required for objective quality evaluation of echo cancellers. 1101-1104 - Fumiaki Sugaya, Toshiyuki Takezawa, Akio Yokoo, Yoshinori Sagisaka, Seiichi Yamamoto:
Evaluation of the ATR-matrix speech translation system with a pair comparison method between the system and humans. 1105-1108 - Ichiro Maruyama, Yoshiharu Abe, Terumasa Ehara, Katsuhiko Shirai:
An automatic timing detection method for superimposing closed captions of TV programs. 1109-1112 - Marcel Ogner, Zdravko Kacic:
Normalized time-frequency speech representation in articulation training systems. 1113-1116 - Shinichi Torihara, Katashi Nagao:
Semantic transcoding: making the handicapped and the aged free from their barriers in obtaining information on the web. 1117-1120 - Rathinavelu Chengalvarayan:
The use of nonlinear energy transformation for Tamil connected-digit speech recognition. 1121-1124 - Aimin Chen, Saeed Vaseghi:
State based sub-band Wiener filters for speech enhancement in car environments. 1125-1128 - Kris Hermus, Werner Verhelst, Patrick Wambacq, Philippe Lemmerling:
Total least squares based subband modelling for scalable speech representations with damped sinusoids. 1129-1132 - Joon-Hyuk Chang, Nam Soo Kim:
Speech enhancement: new approaches to soft decision. 1133-1136
Volume 4
Language Resources and Technology Evaluation (Special Session)
- James R. Glass, Joseph Polifroni, Stephanie Seneff, Victor Zue:
Data collection and performance evaluation of spoken dialogue systems: the MIT experience. 1-4 - Lori Lamel, Sophie Rosset, Jean-Luc Gauvain:
Considerations in the design and evaluation of spoken language dialog systems. 5-8 - Martin Heckmann, Frédéric Berthommier, Christophe Savario, Kristian Kroschel:
Labeling audio-visual speech corpora and training an ANN/HMM audio-visual speech recognition system. 9-12 - Aijun Li, Maocan Lin, Xiaoxia Chen, Yiqing Zu, Guohua Sun, Wu Hua, Zhigang Yin, Jingzhu Yan:
Speech corpus of Chinese discourse and the phonetic research. 13-18 - Jonathan G. Fiscus, George R. Doddington:
Results of the 1999 topic detection and tracking evaluation in Mandarin and English. 19-24 - Satoshi Nakamura, Keiko Watanuki, Toshiyuki Takezawa, Satoru Hayamizu:
Multimodal corpora for human-machine interaction research. 25-28 - David Pearce, Hans-Günter Hirsch:
The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions. 29-32 - Hans G. Tillmann, Florian Schiel, Christoph Draxler, Phil Hoole:
The bavarian archive for speech signals - serving the speech community. 33-36 - J. Bruce Millar:
The development of spoken language resources in Oceania. 37-40 - Frank K. Soong, Eric A. Woudenberg:
Hands-free human-machine dialogue - corpora, technology and evaluation. 41-44
Acquisition and Learning of Spoken Language 1, 2
- Giuseppe Riccardi:
On-line learning of acoustic and lexical units for domain-independent ASR. 45-48 - Tomoyosi Akiba, Katsunobu Itou:
Semi-automatic language model acquisition without large corpora. 49-52 - Dijana Petrovska-Delacrétaz, Allen L. Gorin, Jerry H. Wright, Giuseppe Riccardi:
Detecting acoustic morphemes in lattices for spoken language understanding. 53-56 - Mitsunori Mizumachi, Masato Akagi, Satoshi Nakamura:
Design of robust subtractive beamformer for noisy speech recognition. 57-60 - Hamid Sheikhzadeh, Rassoul Amirfattahi:
Objective long-term assessment of speech quality changes in pre-lingual cochlear implant children. 61-64 - Elmar Nöth, Heinrich Niemann, Tino Haderlein, M. Decher, Ulrich Eysholdt, Frank Rosanowski, Thomas Wittenberg:
Automatic stuttering recognition using hidden Markov models. 65-68 - Deb Roy:
Grounded speech communication. 69-72 - Sun-Ah Jun, Mira Oh:
Acquisition of second language intonation. 73-76 - Man-Hung Siu, Ka-Ming Wong, Man-Yan Ching, Mei-Sum Lau:
Computer-aided Mandarin pronunciation learning system. 77-80 - Michael F. McTear, Norma Conn, Nicola Phillips:
Speech recognition software: a tool for people with dyslexia. 81-84 - H. Timothy Bunnell, Debra Yarrington, James B. Polikoff:
STAR: articulation training for young children. 85-88
Acoustics of Spoken Language 1, 2
- Takayoshi Nakai, Keizo Ishida, Hisayoshi Suzuki:
Sound pressure distributions and propagation paths in the vocal tract with the pyriform fossa and the larynx. 89-92 - László Czap:
Lip representation by image ellipse. 93-96 - R. J. J. H. van Son, Barbertje M. Streefkerk, Louis C. W. Pols:
An acoustic profile of speech efficiency. 97-100 - Helen M. Meng, Wai Kit Lo, Yuk-Chi Li, P. C. Ching:
Multi-scale audio indexing for Chinese spoken document retrieval. 101-104 - Hagen Soltau, Alex Waibel:
Phone dependent modeling of hyperarticulated effects. 105-108 - Qing Guo, Yonghong Yan, Baosheng Yuan, Xiangdong Zhang, Ying Jia, Xiaoxing Liu:
Vocabulary-based acoustic model trim down and task adaptation. 109-112 - Willa S. Chen, Abeer Alwan:
Place of articulation cues for voiced and voiceless plosives and fricatives in syllable-initial position. 113-116 - Jingdong Chen, Kuldip K. Paliwal, Satoshi Nakamura:
A block cosine transform and its application in speech recognition. 117-120 - Jeih-Weih Hung, Hsin-Min Wang, Lin-Shan Lee:
Automatic metric-based speech segmentation for broadcast news via principal component analysis. 121-124 - Yuqing Gao, Yongxin Li, Michael Picheny:
Maximal rank likelihood as an optimization function for speech recognition. 125-128 - Yue Pan, Alex Waibel:
The effects of room acoustics on MFCC speech parameter. 129-132 - Mark Hasegawa-Johnson:
Time-frequency distribution of partial phonetic information measured using mutual information. 133-136
Recognition and Understanding of Spoken Language 3, 4
- Li Jiang, Xuedong Huang:
Subword-dependent speaker clustering for improved speech recognition. 137-140 - Chunhua Luo, Fang Zheng, Mingxing Xu:
An equivalent-class based MMI learning method for MGCPM. 141-144 - Alan Wrench, Korin Richmond:
Continuous speech recognition using articulatory data. 145-148 - Brian Kan-Wing Mak, Yik-Cheung Tam:
Asynchrony with trained transition probabilities improves performance in multi-band speech recognition. 149-152 - Sunil Sivadas, Pratibha Jain, Hynek Hermansky:
Discriminative MLPs in HMM-based recognition of speech in cellular telephony. 153-156 - Toshiyuki Hanazawa, Jun Ishii, Yohei Okato, Kunio Nakajima:
Acoustic modeling for spontaneous speech recognition using syllable dependent models. 157-160 - Hui Jiang, Li Deng:
A robust training strategy against extraneous acoustic variations for spontaneous speech recognition. 161-164 - Darryl W. Purnell, Elizabeth C. Botha:
Improved performance and generalization of minimum classification error training for continuous speech recognition. 165-168 - Ying Jia, Yonghong Yan, Baosheng Yuan:
Dynamic threshold setting via Bayesian information criterion (BIC) in HMM training. 169-171 - Thomas Hain, Philip C. Woodland:
Modelling sub-phone insertions and deletions in continuous speech recognition. 172-175 - Carrson C. Fung, Oscar C. Au, Wanggen Wan, Chi H. Yim, Cyan L. Keung:
Improved acoustics modeling for speech recognition using transformation techniques. 176-179 - Liang Gu, Jayanth Nayak, Kenneth Rose:
Discriminative training of tied-mixture HMM by deterministic annealing. 183-186 - Hong-Kwang Jeff Kuo, Chin-Hui Lee:
Discriminative training in natural language call routing. 187-190 - Kazuyo Tanaka, Hiroaki Kojima:
A speech recognition method with a language-independent intermediate phonetic code. 191-194 - Fabrice Lefèvre:
Confidence measures based on the k-nn probability estimator. 195-197 - Niloy Mukherjee, Nitendra Rajput, L. Venkata Subramaniam, Ashish Verma:
On deriving a phoneme model for a new language. 198-201 - Tomonobu Saito, Kiyoshi Hashimoto:
Estimation of semantic case of Japanese dialogue by use of distance derived from statistics of dependency. 202-205 - Stephen Cox, Srinandan Dasmahapatra:
A semantically-based confidence measure for speech recognition. 206-209 - Aravind Ganapathiraju, Joseph Picone:
Support vector machines for automatic data cleanup. 210-213 - Yong Gu, Trevor Thomas:
Competition-based score analysis for utterance verification in name recognition. 214-217 - Yaxin Zhang:
Utterance verification/rejection for speaker-dependent and speaker-independent speech recognition. 218-221
- Valery A. Petrushin:
Emotion recognition in speech signal: experimental study, development, and application. 222-225 - Ren-Yuan Lyu, Chi-yu Chen, Yuang-Chin Chiang, Min-shung Liang:
A bi-lingual Mandarin/Taiwanese (Min-nan), large vocabulary, continuous speech recognition system based on the Tong-yong phonetic alphabet (TYPA). 226-229 - Ossama Emam, Jorge Gonzalez, Carsten Günther, Eric Janke, Siegfried Kunzmann, Giulio Maltese, Claire Waast-Richard:
A data-driven methodology for the production of multilingual conversational systems. 230-233
- Tzur Vaich, Arnon Cohen:
Multi-path, context dependent SC-HMM architectures for improved connected word recognition. 234-237 - Yoram Meron, Keikichi Hirose:
Robust recognition using multiple utterances. 238-241 - Piero Cosi, John-Paul Hosom, Fabio Tesser:
High performance Italian continuous "digit" recognition. 242-245 - Dominique Fohr, Odile Mella, Christophe Antoine:
The automatic speech recognition engine ESPERE: experiments on telephone speech. 246-249 - Imre Kiss:
A comparison of distributed and network speech recognition for mobile communication systems. 250-253 - Joe Frankel, Korin Richmond, Simon King, Paul Taylor:
An automatic speech recognition system using neural networks and linear dynamic models to recover and model articulatory traces. 254-257 - Khaldoun Shobaki, John-Paul Hosom, Ronald A. Cole:
The OGI kids' speech corpus and recognizers. 258-261 - Jian Wu, Fang Zheng:
Reducing time-synchronous beam search effort using stage based look-ahead and language model rank based pruning. 262-265 - Grace Chung:
A three-stage solution for flexible vocabulary speech understanding. 266-269 - Jon Barker, Martin Cooke, Daniel P. W. Ellis:
Decoding speech in the presence of other sound sources. 270-273 - Shi-wook Lee, Keikichi Hirose, Nobuaki Minematsu:
Efficient search strategy in large vocabulary continuous speech recognition using prosodic boundary information. 274-277 - Ha-Jin Yu, Hoon Kim, Joon-Mo Hong, Min-Seong Kim, Jong-Seok Lee:
Large vocabulary Korean continuous speech recognition using a one-pass algorithm. 278-281 - Alexander Seward:
A tree-trellis n-best decoder for stochastic context-free grammars. 282-285 - Patrick Nguyen, Luca Rigazio, Jean-Claude Junqua:
EWAVES: an efficient decoding algorithm for lexical tree based speech recognition. 286-289 - Atsunori Ogawa, Yoshiaki Noda, Shoichi Matsunaga:
Novel two-pass search strategy using time-asynchronous shortest-first second-pass beam search. 290-293 - Yu-Chung Chan, Man-Hung Siu, Brian Kan-Wing Mak:
Pruning of state-tying tree using Bayesian information criterion with multiple mixtures. 294-297 - Yuan-Fu Liao, Nick J.-C. Wang, Max Huang, Hank Huang, Frank Seide:
Improvements of the Philips 2000 Taiwan Mandarin benchmark system. 298-301 - Christoph Neukirchen, Xavier L. Aubert, Hans Dolfing:
Extending the generation of word graphs for a cross-word m-gram decoder. 302-305 - Qingwei Zhao, Zhiwei Lin, Baosheng Yuan, Yonghong Yan:
Improvements in search algorithm for large vocabulary continuous speech recognition. 306-309 - Hua Yu, Takashi Tomokiyo, Zhirong Wang, Alex Waibel:
New developments in automatic meeting transcription. 310-313 - Jielin Pan, Baosheng Yuan, Yonghong Yan:
Effective vector quantization for a highly compact acoustic model for LVCSR. 318-321 - Hiroki Yamamoto, Toshiaki Fukada, Yasuhiro Komori:
Effective lexical tree search for large vocabulary continuous speech recognition. 322-325 - Chiori Hori, Sadaoki Furui:
Improvements in automatic speech summarization and evaluation methods. 326-329 - Shuangyu Chang, Lokendra Shastri, Steven Greenberg:
Automatic phonetic transcription of spontaneous speech (American English). 330-333 - Miroslav Novak, Michael Picheny:
Speed improvement of the tree-based time asynchronous search. 334-337 - Jing Huang, Brian Kingsbury, Lidia Mangu, Mukund Padmanabhan, George Saon, Geoffrey Zweig:
Recent improvements in speech recognition performance on large vocabulary conversational speech (voicemail and switchboard). 338-341 - Lei He, Ditang Fang, Wenhu Wu:
Speaker normalization training and adaptation for speech recognition. 342-345 - Laura Mayfield Tomokiyo:
Lexical and acoustic modeling of non-native speech in LVCSR. 346-349 - Baojie Li, Keikichi Hirose, Nobuaki Minematsu:
Modeling phone correlation for speaker adaptive speech recognition. 350-353 - Henrik Botterweck:
Very fast adaptation for large vocabulary continuous speech recognition using eigenvoices. 354-357 - Chengyi Zheng, Yonghong Yan:
Efficiently using speaker adaptation data. 358-361 - Thilo Pfau, Robert Faltlhauser, Günther Ruske:
A combination of speaker normalization and speech rate normalization for automatic speech recognition. 362-365 - Tai-Hwei Hwang, Kuo-Hwei Yuo, Hsiao-Chuan Wang:
Speech model compensation with direct adaptation of cepstral variance to noisy environment. 366-369 - Ji Wu, Zuoying Wang:
Gaussian similarity analysis and its application in speaker adaptation. 370-373 - Nobuyasu Itoh, Masafumi Nishimura, Shinsuke Mori:
A method for style adaptation to spontaneous speech by using a semi-linear interpolation technique. 374-377 - Petra Geutner, Luis Arévalo, Joerg Breuninger:
VODIS - voice-operated driver information systems: a usability study on advanced speech technologies for car environments. 378-382 - Wu Chou, Qiru Zhou, Hong-Kwang Jeff Kuo, Antoine Saad, David Attwater, Peter J. Durston, Mark Farrell, Frank Scahill:
Natural language call steering for service applications. 382-385 - Jörg Hunsinger, Manfred K. Lang:
A single-stage top-down probabilistic approach towards understanding spoken and handwritten mathematical formulas. 386-389 - Prabhu Raghavan, Sunil K. Gupta:
Low complexity connected digit recognition for mobile applications. 390-393 - Jan Nouza:
Telephone speech recognition from large lists of Czech words. 394-397 - Duanpei Wu, Xavier Menéndez-Pidal, Lex Olorenshaw, Ruxin Chen, Mick Tanaka, Mariscela Amador:
Speech and word detection algorithms for hands-free applications. 398-401 - Ashwin Rao, Bob Roth, Venkatesh Nagesha, Don McAllaster, Natalie Liberman, Larry Gillick:
Large vocabulary continuous speech recognition of read speech over cellular and landline networks. 402-405
Problems and Prospects of Trans-Lingual Communication (Special Session)
- Seiichi Yamamoto:
Toward speech communications beyond language barrier - research of spoken language translation technologies at ATR -. 406-411 - Hervé Blanchon, Christian Boitet:
Speech translation for French within the C-STAR II consortium and future perspectives. 412-417 - Chengqing Zong, Yumi Wakita, Bo Xu, Zhenbiao Chen, Kenji Matsui:
Japanese-to-Chinese spoken language translation based on the simple expression. 418-421 - Srinivas Bangalore, Giuseppe Riccardi:
Finite-state models for lexical reordering in spoken language translation. 422-425 - Ralf Engel:
CHUNKY: an example based machine translation system for spoken dialogs. 426-429 - Gianni Lazzari:
Spoken translation: challenges and opportunities. 430-435 - Christian Boitet, Jean-Philippe Guilbaud:
Analysis into a formal task-oriented pivot without clear abstract semantics is best handled as "usual" translation. 436-439 - Chengqing Zong, Taiyi Huang, Bo Xu:
An improved template-based approach to spoken language translation. 440-443 - Takao Watanabe, Akitoshi Okumura, Shinsuke Sakai, Kiyoshi Yamabana, Shinichi Doi, Ken Hanazawa:
An automatic interpretation system for travel conversation. 444-447 - Rainer Gruhn, Harald Singer, Hajime Tsukada, Masaki Naito, Atsushi Nishino, Atsushi Nakamura, Yoshinori Sagisaka, Satoshi Nakamura:
Cellular-phone based speech-to-speech translation system ATR-MATRIX. 448-451
Spoken Language Resources, Labeling, and Assessment
- Nicole Beringer, Tsuyoshi Ito, Marcia Neff:
Generation of pronunciation rule sets for automatic segmentation of American English and Japanese. 452-455 - K. Samudravijaya, P. V. S. Rao, S. S. Agrawal:
Hindi speech database. 456-459 - Hsiao-Chuan Wang, Frank Seide, Chiu-yu Tseng, Lin-Shan Lee:
MAT-2000 - design, collection, and validation of a Mandarin 2000-speaker telephone speech database. 460-463 - Kåre Sjölander, Jonas Beskow:
Wavesurfer - an open source speech tool. 464-467 - Nick Campbell, Toru Marumoto:
Automatic labelling of voice-quality in speech databases for synthesis. 468-471 - Joe Timoney, J. Brian Foley:
Speech quality evaluation based on AM-FM time-frequency representations. 472-475 - Tatsuya Kawahara, Akinobu Lee, Tetsunori Kobayashi, Kazuya Takeda, Nobuaki Minematsu, Shigeki Sagayama, Katsunobu Itou, Akinori Ito, Mikio Yamamoto, Atsushi Yamada, Takehito Utsuro, Kiyohiro Shikano:
Free software toolkit for Japanese large vocabulary continuous speech recognition. 476-479
Robust Modeling
- Qiang Huo, Bin Ma:
Robust speech recognition based on off-line elicitation of multiple priors and on-line adaptive prior fusion. 480-483 - William J. J. Roberts, Sadaoki Furui:
Robust speech recognition via modeling spectral coefficients with HMM's with complex Gaussian components. 484-487 - Mirjam Wester, Judith M. Kessens, Helmer Strik:
Pronunciation variation in ASR: which variation to model? 488-491 - Xiaolong Mou, Victor Zue:
The use of dynamic reliability scoring in speech recognition. 492-495 - Javier Macías Guarasa, Javier Ferreiros, Rubén San Segundo, Juan Manuel Montero, José Manuel Pardo:
Acoustical and lexical based confidence measures for a very large vocabulary telephone speech hypothesis-verification system. 496-499 - Silke Goronzy, Krzysztof Marasek, Ralf Kompe, Andreas Haag:
Phone-duration-based confidence measures for embedded applications. 500-503 - Aravind Ganapathiraju, Jonathan Hamaker, Joseph Picone:
Hybrid SVM/HMM architectures for speech recognition. 504-507
Adaptation and Acquisition in Spoken Language Processing 1, 2
- Koki Sasaki, Hui Jiang, Keikichi Hirose:
Rapid adaptation of n-gram language models using inter-word correlation for speech recognition. 508-511 - Gareth Moore, Steve J. Young:
Class-based language model adaptation using mixtures of word-class weights. 512-515 - Jiasong Sun, Xiaodong Cui, Zuoying Wang, Yang Liu:
A language model adaptation approach based on text classification. 516-519 - Grace Chung:
Automatically incorporating unknown words in JUPITER. 520-523 - Rathinavelu Chengalvarayan:
Look-ahead sequential feature vector normalization for noisy speech recognition. 524-527 - Naoto Iwahashi, Akihiko Kawasaki:
Speaker adaptation in noisy environments based on parameter estimation using uncertain data. 528-531 - Alex Acero, Steven Altschuler, Lani Wu:
Speech/noise separation using two microphones and a VQ model of speech signals. 532-535 - Michiel Bacchiani:
Using maximum likelihood linear regression for segment clustering and speaker identification. 536-539 - Tor André Myrvoll, Olivier Siohan, Chin-Hui Lee, Wu Chou:
Structural maximum a-posteriori linear regression for unsupervised speaker adaptation. 540-543 - Jen-Tzung Chien, Guo-Hong Liao:
Transformation-based Bayesian predictive classification for online environmental learning and robust speech recognition. 544-547 - Michael Pitz, Frank Wessel, Hermann Ney:
Improved MLLR speaker adaptation using confidence measures for conversational speech recognition. 548-551 - Rathinavelu Chengalvarayan:
Unified acoustic modeling for continuous speech recognition. 552-555 - Satya Dharanipragada, Mukund Padmanabhan:
A nonlinear unsupervised adaptation technique for speech recognition. 556-559 - Sam-Joo Doh, Richard M. Stern:
Using class weighting in inter-class MLLR. 560-563
Acoustics of Spoken Language (Poster)
- John-Paul Hosom, Ronald A. Cole:
Burst detection based on measurements of intensity discrimination. 564-567 - Javier Ferreiros López, Daniel P. W. Ellis:
Using acoustic condition clustering to improve acoustic change detection on broadcast news. 568-571 - Jon P. Nedel, Rita Singh, Richard M. Stern:
Phone transition acoustic modeling: application to speaker independent and spontaneous speech systems. 572-575 - Liqin Shen, Guokang Fu, Haixin Chai, Yong Qin:
The measurement of acoustic similarity and its applications. 576-579 - Sopae Yi, Hyung Soon Kim, One Good Lee:
Glottal parameters contributing to the perception of loud voices. 580-583 - Christoph Schillo, Gernot A. Fink, Franz Kummert:
Grapheme based speech recognition for large vocabularies. 584-587 - Jon P. Nedel, Rita Singh, Richard M. Stern:
Automatic subword unit refinement for spontaneous speech recognition via phone splitting. 588-591 - Takeshi Tarui:
Rhythm timing in Japanese English. 592-595 - Mamoru Iwaki:
A vocal tract area ratio estimation from spectral parameter extracted by STRAIGHT. 596-599 - Bhuvana Ramabhadran, Yuqing Gao:
Decision tree based rate of speech modeling for speech recognition. 600-603 - Mukund Padmanabhan:
Spectral peak tracking and its use in speech recognition. 604-607 - Yongxin Li, Yuqing Gao, Hakan Erdogan:
Weighted pairwise scatter to improve linear discriminant analysis. 608-611 - Jindrich Matousek, Josef Psutka:
ARTIC: a new Czech text-to-speech system using statistical approach to speech segment database construction. 612-615 - Wu Chou, Olivier Siohan, Tor André Myrvoll, Chin-Hui Lee:
Extended maximum a posteriori linear regression (EMAPLR) model adaptation for speech recognition. 616-619 - Ekkarit Maneenoi, Somchai Jitapunkul, Visarut Ahkuputra, Umavasee Thathong, Boonchai Thampanitchawong, Sudaporn Luksaneeyanawin:
Thai monophthong recognition using continuous density hidden Markov model and LPC cepstral coefficients. 620-623 - Chung-Hsien Wu, Yeou-Jiunn Chen, Cher-Yao Yang:
Error recovery and sentence verification using statistical partial pattern tree for conversational speech. 624-627 - Andrew Wilson Howitt:
Vowel landmark detection. 628-631 - Carsten Meyer, Georg Rose:
Rival training: efficient use of data in discriminative training. 632-635 - Marilyn Y. Chen:
Nasal detection module for a knowledge-based speech recognition system. 636-639 - Jun Liu, Xiaoyan Zhu, Bin Jia:
Semi-continuous segmental probability model for speech signals. 640-643 - Ea-Ee Jan, Jaime Botella Ordinas:
Cross-domain robust acoustic training. 644-647 - Fan Wang, Fang Zheng, Wenhu Wu:
A C/V segmentation method for Mandarin speech based on multiscale fractal dimension. 648-651 - Xiaoxia Chen, Aijun Li, Guohua Sun, Wu Hua, Zhigang Yu:
An application of SAMPA-C for standard Chinese. 652-655
Signal Analysis, Processing, and Feature Extraction
- Wenkai Lu, Xuegong Zhang, Yanda Li, Liqin Shen, Weibin Zhu:
Joint speech signal enhancement based on spectral subtraction and SVD filter. 656-659 - Sacha Krstulovic, Frédéric Bimbot:
Inverse lattice filtering of speech with adapted non-uniform delays. 660-663 - Hideki Kawahara, Yoshinori Atake, Parham Zolfaghari:
Accurate vocal event detection method based on a fixed-point analysis of mapping from time to weighted average group delay. 664-667 - Jun Huang, Mukund Padmanabhan:
Filterbank-based feature extraction for speech recognition and its application to voice mail transcription. 668-671 - Peter J. Murphy:
A cepstrum-based harmonics-to-noise ratio in voice signals. 672-675 - Xuejing Sun:
A pitch determination algorithm based on subharmonic-to-harmonic ratio. 676-679 - Jordi Solé i Casals, Enric Monte-Moreno, Christian Jutten, Anisse Taleb:
Source separation techniques applied to speech linear prediction. 680-683 - Masahide Sugiyama:
Model based voice decomposition method. 684-687 - Keiichi Funaki:
A time-varying complex speech analysis based on IV method. 688-691 - Parham Zolfaghari, Hideki Kawahara:
A sinusoidal model based on frequency-to-instantaneous frequency mapping. 692-695 - Omar Farooq, Sekharjit Datta:
Dynamic feature extraction by wavelet analysis. 696-699 - Montri Karnjanadecha, Stephen A. Zahorian:
An investigation of variable block length methods for calculation of spectral/temporal features for automatic speech recognition. 700-703 - Akira Sasou, Kazuyo Tanaka:
Glottal excitation modeling using HMM with application to robust analysis of speech signal. 704-707 - Laura Docío Fernández, Carmen García-Mateo:
Automatic segmentation of speech based on hidden Markov models and acoustic features. 708-711 - Akira Kurematsu, Youichi Akegami, Susanne Burger, Susanne Jekat, Brigitte Lause, Victoria MacLaren, Daniela Oppermann, Tanja Schultz:
VERBMOBIL dialogues: multifaced analysis. 712-715 - Jin-Jie Zhang, Zhigang Cao, Zhengxin Ma:
A computation-efficient parameter adaptation algorithm for the generalized spectral subtraction method. 716-719 - Masahiro Araki, Kiyoshi Ueda, Takuya Nishimoto, Yasuhisa Niimi:
A semantic tagging tool for spoken dialogue corpus. 720-723 - Aijun Li, Xiaoxia Chen, Guohua Sun, Wu Hua, Zhigang Yin, Yiqing Zu, Fang Zheng, Zhanjiang Song:
The phonetic labeling on read and spontaneous discourse corpora. 724-727 - Nicole Beringer, Florian Schiel:
The quality of multilingual automatic segmentation using German MAUS. 728-731 - Vlasta Radová, Josef Psutka:
UWB_S01 corpus - a Czech read-speech corpus. 732-735 - Giuseppe Di Fabbrizio, Shrikanth S. Narayanan:
Web-based monitoring, logging and reporting tools for multi-service multi-modal systems. 736-739 - Helmer Strik, Catia Cucchiarini, Judith M. Kessens:
Comparing the recognition performance of CSRs: in search of an adequate metric and statistical significance test. 740-743 - Alexander Raake:
Perceptual dimensions of speech sound quality in modern transmission systems. 744-747