EUROSPEECH 1997: Rhodes, Greece
- George Kokkinakis, Nikos Fakotakis, Evangelos Dermatas:
Fifth European Conference on Speech Communication and Technology, EUROSPEECH 1997, Rhodes, Greece, September 22-25, 1997. ISCA 1997
Keynotes
- Mario Rossi:
Is syntactic structure prosodically recoverable? - Victor W. Zue:
Conversational interfaces: advances and challenges. - Jan P. H. van Santen:
Prosodic modelling in text-to-speech synthesis. - Jean-Claude Junqua:
Impact of the unknown communication channel on automatic speech recognition: a review. KN-29 - Jerome R. Bellegarda:
Statistical techniques for robust ASR: review and perspectives. - Richard P. Lippmann, Beth A. Carlson:
Using missing feature theory to actively select features for robust speech recognition with interruptions, filtering and noise. KN-37
Acoustic Modelling
- Stéphane Dupont, Hervé Bourlard:
Using multiple time scales in a multi-stream speech recognition system. 3-6 - Yumi Wakita, Harald Singer, Yoshinori Sagisaka:
Speech recognition using HMM-state confusion characteristics. 7-10 - Cristina Chesta, Pietro Laface, Franco Ravera:
Bottom-up and top-down state clustering for robust acoustic modeling. 11-14 - Ralf Schlüter, Wolfgang Macherey, Stephan Kanthak, Hermann Ney, Lutz Welling:
Comparison of optimization methods for discriminative training criteria. 15-18 - Clark Z. Lee, Douglas D. O'Shaughnessy:
Clustering beyond phoneme contexts for speech recognition. 19-22 - Rathinavelu Chengalvarayan:
Influence of outliers in training the parametric trajectory models for speech recognition. 23-26 - Trym Holter, Torbjørn Svendsen:
Incorporating linguistic knowledge and automatic baseform generation in acoustic subword unit based speech recognition. 1159-1162 - Peter Beyerlein, Meinhard Ullrich, Patricia Wilcox:
Modelling and decoding of crossword context dependent phones in the Philips large vocabulary continuous speech recognition system. 1163-1166 - Philip Hanna, Ji Ming, Peter O'Boyle, Francis Jack Smith:
Modelling inter-frame dependence with preceding and succeeding frames. 1167-1170 - Rhys James Jones, Simon Downey, John S. D. Mason:
Continuous speech recognition using syllables. 1171-1174 - Daniel Willett, Gerhard Rigoll:
A new approach to generalized mixture tying for continuous HMM-based speech recognition. 1175-1178 - Klaus Beulen, Elmar Bransch, Hermann Ney:
State tying for context dependent phoneme models. 1179-1182 - Jacques Duchateau, Kris Demuynck, Dirk Van Compernolle:
A novel node splitting criterion in decision tree construction for semi-continuous HMMs. 1183-1186 - Mats Blomberg:
Creating unseen triphones by phone concatenation in the spectral, cepstral and formant domains. 1187-1190 - Thilo Pfau, Manfred Beham, Wolfgang Reichl, Günther Ruske:
Creating large subword units for speech recognition. 1191-1194 - Jacob Goldberger, David Burshtein, Horacio Franco:
Segmental modeling using a continuous mixture of non-parametric models. 1195-1198 - Jane W. Chang, James R. Glass:
Segmentation and modeling in segment-based recognition. 1199-1202 - Alfred Hauenstein:
Using syllables in a hybrid HMM-ANN recognition system. 1203-1206 - Ramalingam Hariharan, Juha Häkkinen, Kari Laurila, Janne Suontausta:
Noise robust segment-based word recognition using vector quantisation. 1207-1210 - Luis Javier Rodríguez, M. Inés Torres:
Viterbi based splitting of phoneme HMM's. 1211-1214 - José B. Mariño, Albino Nogueiras, Antonio Bonafonte:
The demiphone: an efficient subword unit for continuous speech recognition. 1215-1218 - Hiroaki Kojima, Kazuyo Tanaka:
Organizing phone models based on piecewise linear segment lattices of speech samples. 1219-1222 - Ivica Rogina:
Automatic architecture design by likelihood-based context clustering with crossvalidation. 1223-1226 - Sam T. Roweis, Abeer Alwan:
Towards articulatory speech recognition: learning smooth maps to recover articulator information. 1227-1230 - Anastasios Tsopanoglou, Nikos Fakotakis:
Selection of the most effective set of subword units for an HMM-based speech recognition system. 1231-1234 - Christophe Cerisara, Jean Paul Haton, Jean-François Mari, Dominique Fohr:
Multi-band continuous speech recognition. 1235-1238 - Nabil N. Bitar, Carol Y. Espy-Wilson:
The design of acoustic parameters for speaker-independent speech recognition. 1239-1242
Dynamic Articulatory Measurements
- Laurence Candille, Henri Meloni:
Adaptation of natural articulatory movements to the control of the command parameters of a production model. 27-30 - Maureen L. Stone, Andrew J. Lundberg, Edward P. Davis, Rao P. Gullapalli, Moriel NessAiver:
Three-dimensional coarticulatory strategies of tongue movement. 31-34 - Nathalie Parlangeau, Régine André-Obrecht:
From laryngographic and acoustic signals to voicing gestures. 35-38 - Erkki Vilkman, Raija Takalo, Taisto Maatta, Anne-Maria Laukkanen, Jaana Nummenranta, Tero Lipponen:
Ultrasonographic measurement of cricothyroid space in speech. 39-42 - Didier Demolin, Martine George, Véronique Lecuit, Thierry Metens, Alain Soquet, H. Raeymaekers:
Coarticulation and articulatory compensations studied by dynamic MRI. 43-46 - Pierre Badin, Enrico Baricchi, Anne Vilain:
Determining tongue articulation: from discrete fleshpoints to continuous shadow. 47-50
Language Identification
- Marc A. Zissman:
Predicting, diagnosing and improving automatic language identification performance. 51-54 - Cristobal Corredor-Ardoy, Jean-Luc Gauvain, Martine Adda-Decker, Lori Lamel:
Language identification with language-independent acoustic models. 55-58 - Eluned S. Parris, Harvey Lloyd-Thomas, Michael J. Carey, Jerry H. Wright:
Bayesian methods for language verification. 59-62 - HingKeung Kwan, Keikichi Hirose:
Use of recurrent network for unknown language rejection in language identification system. 63-66 - Ove Andersen, Paul Dalsgaard:
Language-identification based on cross-language acoustic models and optimised information combination. 67-70
Neural Networks for Speech and Language Processing
- Mazin Rahim, Yoshua Bengio, Yann LeCun:
Discriminative feature and model design for automatic speech recognition. 75-78 - Jörg Rottland, Christoph Neukirchen, Daniel Willett, Gerhard Rigoll:
Large vocabulary speech recognition with context dependent MMI-connectionist / HMM systems using the WSJ database. 79-82 - Thierry Moudenc, Guy Mercier:
Automatic selection of segmental acoustic parameters by means of neural-fuzzy networks for reordering the n-best HMM hypotheses. 83-86 - Mikko Kurimo:
Comparison results for segmental training algorithms for mixture density HMMs. 87-90 - M. Asunción Castaño, Francisco Casacuberta:
A connectionist approach to machine translation. 91-94 - Nicolas Pican, Jean-François Mari, Dominique Fohr:
Continuous speech recognition using a context sensitive ANN and HMM2s. 95-98
Training Techniques; Efficient Decoding in ASR
- Koichi Shinoda, Takao Watanabe:
Acoustic modeling based on the MDL principle for speech recognition. 99-102 - Piyush Modi, Mazin Rahim:
Discriminative utterance verification using multiple confidence measures. 103-106 - Enrico Bocchieri, Brian Mak:
Subspace distribution clustering for continuous observation density hidden Markov models. 107-110 - Harriet J. Nock, Mark J. F. Gales, Steve J. Young:
A comparative study of methods for phonetic decision-tree state clustering. 111-114 - Alfred Kaltenmeier, Jürgen Franke:
Comparing Gaussian and polynomial classification in SCHMM-based recognition systems. 115-118 - Alexandre Girardi, Harald Singer, Kiyohiro Shikano, Satoshi Nakamura:
Maximum likelihood successive state splitting algorithm for tied-mixture HMNET. 119-122 - Erik McDermott, Shigeru Katagiri:
String-level MCE for continuous phoneme recognition. 123-126 - Zeév Rivlin, Ananth Sankar, Harry Bratt:
HMM state clustering across allophone class boundaries. 127-130 - Mehryar Mohri, Michael Riley:
Weighted determinization and minimization for large vocabulary speech recognition. 131-134 - Steven J. Phillips, Anne Rogers:
Parallel speech recognition. 135-138 - Stefan Ortmanns, Thorsten Firzlaff, Hermann Ney:
Fast likelihood computation methods for continuous mixture densities in large vocabulary speech recognition. 139-142 - Kris Demuynck, Jacques Duchateau, Dirk Van Compernolle:
A static lexicon network representation for cross-word context dependent phones. 143-146 - Mukund Padmanabhan, Lalit R. Bahl, David Nahamoo, Pieter de Souza:
Decision-tree based quantization of the feature space of a speech recognizer. 147-150 - Mosur Ravishankar, Roberto Bisiani, Eric H. Thayer:
Sub-vector clustering to improve memory and speed performance of acoustic likelihood computation. 151-154 - Simon Hovell:
The incorporation of path merging in a dynamic network recogniser. 155-158 - Miroslav Novak:
Improvement on connected digits recognition using duration constraints in the asynchronous decoding scheme. 159-162 - Andreas Stolcke, Yochai Konig, Mitchel Weintraub:
Explicit word error minimization in n-best list rescoring. 163-166 - Long Nguyen, Richard M. Schwartz:
Efficient 2-pass n-best decoder. 167-170 - Tomohiro Iwasaki, Yoshiharu Abe:
A memory management method for a large word network. 171-174
Prosody
- Antonio Romano:
Persistence of prosodic features between dialectal and standard Italian utterances in six sub-varieties of a region of southern Italy (Salento): first assessments of the results of a recognition test and an instrumental analysis. 175-178 - Halewijn Vereecken, Annemie Vorstermans, Jean-Pierre Martens, Bert Van Coile:
Improving the phonetic annotation by means of prosodic phrasing. 179-182 - Cecilia Odé:
A descriptive study of prosodic phenomena in Mpur (West Papuan Phylum). 183-186 - Hansjörg Mixdorff, Hiroya Fujisaki:
Automated quantitative analysis of F0 contours of utterances from a German ToBI-labeled speech database. 187-190 - Stéphanie de Tournemire:
Identification and automatic generation of prosodic contours for a text-to-speech synthesis system in French. 191-194 - Jinfu Ni, Ren-Hua Wang, Keikichi Hirose:
Quantitative analysis and formulation of tone concatenation in Chinese F0 contours. 195-198 - Christel Brindöpke, Arno Pahde, Franz Kummert, Gerhard Sagerer:
An environment for the labelling and testing of melodic aspects of speech. 199-202 - David Casacuberta, Lourdes Aguilar, Rafael Marín:
PROPAUSE: a syntactico-prosodic system designed to assign pauses. 203-206 - Volker Warnke, Ralf Kompe, Heinrich Niemann, Elmar Nöth:
Integrated dialog act segmentation and classification using prosodic features and language models. 207-210 - Monique E. van Donzel, Florien J. Koopmans-van Beinum:
Evaluation of prosodic characteristics in retold stories in Dutch by means of semantic scales. 211-214 - Gösta Bruce, Marcus Filipsson, Johan Frid, Björn Granström, Kjell Gustafson, Merle Horne, David House:
Text-to-intonation in spontaneous Swedish. 215-218 - Yann Morlec, Gérard Bailly, Véronique Aubergé:
Synthesising attitudes with global rhythmic and intonation contours. 219-222 - Dafydd Gibbon, Claudia Sassen:
Prosody-particle pairs as discourse control signs. 223-226 - Anja Elsner:
Focus detection with additional information of phrase boundaries and sentence mode. 227-230 - Laura Bosch, Núria Sebastián-Gallés:
The role of prosody in infants' native-language discrimination abilities: the case of two phonologically close languages. 231-234 - Eugene H. Buder, Anders Eriksson:
Prosodic cycles and interpersonal synchrony in American English and Swedish. 235-238 - Eva Strangert:
Relating prosody to syntax: boundary signalling in Swedish. 239-242 - Mitsuru Nakai, Hiroshi Shimodaira:
On representation of fundamental frequency of speech for prosody analysis using reliability function. 243-246 - Seong-Hwan Kim, Jin-Young Kim:
Efficient method of establishing words tone dictionary for Korean TTS system. 247-250 - Mariapaola D'Imperio, David House:
Perception of questions and statements in Neapolitan Italian. 251-254
Keyword and Topic Spotting
- Qiguang Lin, David M. Lubensky, Michael Picheny, P. Srinivasa Rao:
Key-phrase spotting using an integrated language model of n-grams and finite-state grammar. 255-258 - Jochen Junkawitsch, Günther Ruske, Harald Höge:
Efficient methods for detecting keywords in continuous speech. 259-262 - Raymond Lau, Stephanie Seneff:
Providing sublexical constraints for word spotting within the ANGIE framework. 263-266 - Katarina Bartkova, Denis Jouvet:
Usefulness of phonetic parameters in a rejection procedure of an HMM-based speech recognition system. 267-270 - Yoichi Yamashita, Riichiro Mizoguchi:
Keyword spotting using F0 contour matching. 271-274 - Elmar Nöth, Stefan Harbeck, Heinrich Niemann, Volker Warnke:
A frame and segment based approach for topic spotting. 275-278
Robustness in Recognition and Signal Processing
- Kuldip K. Paliwal, Yoshinori Sagisaka:
Cyclic autocorrelation-based linear prediction analysis of speech. 279-282 - Ilija Zeljkovic, Shrikanth S. Narayanan:
Novel filler acoustic models for connected digit recognition. 283-286 - Makoto Shozakai, Satoshi Nakamura, Kiyohiro Shikano:
A non-iterative model-adaptive e-CMN/PMC approach for speech recognition in car environments. 287-290 - Ángel de la Torre, Antonio M. Peinado, Antonio J. Rubio, Pedro García-Teodoro:
Discriminative feature extraction for speech recognition in noise. 291-294 - Michael K. Brendborg, Børge Lindberg:
Noise robust recognition using feature selective modeling. 295-298 - Victor Abrash:
Mixture input transformations for adaptation of hybrid connectionist speech recognizers. 299-302 - Tai-Hwei Hwang, Lee-Min Lee, Hsiao-Chuan Wang:
Adaptation of time differentiated cepstrum for noisy speech recognition. 1075-1078 - Noboru Kanedera, Takayuki Arai, Hynek Hermansky, Misha Pavel:
On the importance of various modulation frequencies for speech recognition. 1079-1082 - Wei-Tyng Hong, Sin-Horng Chen:
A robust RNN-based pre-classification for noisy Mandarin speech recognition. 1083-1086 - Mazin Rahim:
A parallel environment model (PEM) for speech recognition and adaptation. 1087-1090 - Volker Schless, Fritz Class:
Adaptive model combination for robust speech recognition in car environments. 1091-1094 - Stefaan Van Gerven, Fei Xie:
A comparative study of speech detection methods. 1095-1098 - Nikos Doukas, Patrick A. Naylor, Tania Stathaki:
Voice activity detection using source separation techniques. 1099-1102 - Tomohiko Taniguchi, Shoji Kajita, Kazuya Takeda, Fumitada Itakura:
Voice activity detection using source separation techniques. 1103-1106 - Carlos Avendaño, Sangita Tibrewala, Hynek Hermansky:
Multiresolution channel normalization for ASR in reverberant environments. 1107-1110 - Rafael Martínez, Agustín Álvarez Marquina, Pedro Gómez Vilda, Mercedes Pérez, Victor Nieto Lluis, Victoria Rodellar:
A speech pre-processing technique for end-point detection in highly non-stationary environments. 1111-1114 - Laura Docío Fernández, Carmen García-Mateo:
Application of several channel and noise compensation techniques for robust speaker recognition. 1115-1118 - Hany Agaiby, Thomas J. Moir:
Knowing the wheat from the weeds in noisy speech. 1119-1122 - Do Yeong Kim, Nam Soo Kim, Chong Kwan Un:
Model-based approach for robust speech recognition in noisy environments with multiple noise sources. 1123-1126 - Y. C. Chu, Charlie Jie, Vincent Tung, Ben Lin, Richard Lee:
Normalization of speaker variability by spectrum warping for robust speech recognition. 1127-1130 - Stéphane H. Maes:
LPC poles tracker for music/speech/noise segmentation and music cancellation. 1131-1134 - Doh-Suk Kim, Jae-Hoon Jeong, Soo-Young Lee, Rhee Man Kil:
Comparative evaluations of several front-ends for robust speech recognition. 1135-1138 - Evandro B. Gouvêa, Richard M. Stern:
Speaker normalization through formant-based warping of the frequency scale. 1139-1142 - Martin Westphal:
The use of cepstral means in conversational speech recognition. 1143-1146 - Juan M. Huerta, Richard M. Stern:
Compensation for environmental and speaker variability by normalization of pole locations. 1147-1150 - Jean-Baptiste Puel, Régine André-Obrecht:
Cellular phone speech recognition: noise compensation vs. robust architectures. 1151-1154 - Tung-Hui Chiang:
Speech recognition in noise using on-line HMM adaptation. 1155-1158
Modelling of Prosody
- Christos Malliopoulos, George K. Mikros:
Metrical representations of demarcation and constituency in noun phrases. 303-306 - Hannes Pirker, Kai Alter, Erhard Rank, John Matiasek, Harald Trost, Gernot Kubin:
A system of stylized intonation contours in German. 307-310 - Keikichi Hirose, Koji Iwano:
A method of representing fundamental frequency contours of Japanese using statistical models of moraic transition. 311-314 - Evita F. Fotinea, Michael A. Vlahakis, George Carayannis:
Modeling arbitrarily long sentence-spanning F0 contours by parametric concatenation of word-spanning patterns. 315-318 - R. J. J. H. van Son, Jan P. H. van Santen:
Strong interaction between factors influencing consonant duration. 319-322 - Jerneja Gros, Nikola Pavesic, France Mihelic:
Speech timing in Slovenian TTS. 323-326
Microphone Arrays for Speech Enhancement
- Matthias Dörbecker:
Small microphone arrays with optimized directivity for speech enhancement. 327-330 - Masaaki Inoue, Satoshi Nakamura, Takeshi Yamada, Kiyohiro Shikano:
Microphone array design measures for hands-free speech recognition. 331-334 - Masato Akagi, Mitsunori Mizumachi:
Noise reduction by paired microphones. 335-338 - Djamila Mahmoudi:
A microphone array for speech enhancement using multiresolution wavelet transform. 339-342 - Yoshifumi Nagata, Hiroyuki Tsuboi:
A two-channel adaptive microphone array with target tracking. 343-346 - Diego Giuliani, Marco Matassoni, Maurizio Omologo, Piergiorgio Svaizer:
Use of different microphone array configurations for hands-free speech recognition in noisy and reverberant environment. 347-350
Multilingual Recognition
- Chao Wang, James R. Glass, Helen M. Meng, Joseph Polifroni, Stephanie Seneff, Victor W. Zue:
YINHE: a Mandarin Chinese version of the GALAXY system. 351-354 - Fuliang Weng, Harry Bratt, Leonardo Neumeyer, Andreas Stolcke:
A study of multilingual speech recognition. 359-362 - Jayadev Billa, Kristine W. Ma, John W. McDonough, George Zavaliagkos, David R. Miller, Kenneth N. Ross, Amro El-Jaroudi:
Multilingual speech recognition: the 1996 byblos callhome system. 363-366 - Tanja Schultz, Detlef Koll, Alex Waibel:
Japanese LVCSR on the spontaneous scheduling task with JANUS-3. 367-370 - Tanja Schultz, Alex Waibel:
Fast bootstrapping of LVCSR systems with multilingual phoneme sets. 371-374
Language Specific Speech Analysis
- Bernd Pompino-Marschall, Christine Mooshammer:
Factors of variation in the production of the German dorsal fricative. 375-378 - Kimberly Thomas:
EPG and aerodynamic evidence for the coproduction and coarticulation of clicks in Isizulu. 379-382 - Anja Geumann:
Formant trajectory dynamics in Swabian diphthongs. 383-386 - Sidney A. J. Wood:
The gestural organization of vowels and consonants: a cinefluorographic study of articulator gestures in Greenlandic. 387-388 - Victoria B. Anderson:
The perception of coronals in Western Arrernte. 389-392 - Carol Y. Espy-Wilson, Shrikanth S. Narayanan, Suzanne Boyce, Abeer Alwan:
Acoustic modelling of American English /r/. 393-396
Feature Estimation, Pitch, and Prosody
- Anya Varnich Hansen:
Acoustic parameters optimised for recognition of phonetic features. 397-400 - Andrew K. Halberstadt, James R. Glass:
Heterogeneous acoustic measurements for phonetic classification. 401-404 - Ben P. Milner:
Cepstral-time matrices and LDA for improved connected digit and sub-word recognition accuracy. 405-408 - Sarel van Vuuren, Hynek Hermansky:
Data-driven design of RASTA-like filters. 409-412 - Simon Nicholson, Ben P. Milner, Stephen J. Cox:
Evaluating feature set performance using the f-ratio and j-measures. 413-416 - Javier Hernando, Climent Nadeu:
Robust speech parameters located in the frequency domain. 417-420 - François Gaillard, Frédéric Berthommier, Gang Feng, Jean-Luc Schwartz:
A modified zero-crossing method for pitch detection in presence of interfering sources. 445-448 - Jacques Simonin, Chafic Mokbel:
Using simulated annealing expectation maximization algorithm for hidden Markov model parameters estimation. 449-452 - Gunnar Fant, Stellan Hertegard, Anita Kruckenberg, Johan Liljencrants:
Covariation of subglottal pressure, F0 and glottal parameters. 453-456 - Anastasios Delopoulos, Maria Rangoussi:
The fractal behaviour of unvoiced plosives: a means for classification. 457-460 - Sumio Ohno, Hiroya Fujisaki, Hideyuki Taguchi:
A method for analysis of the local speech rate using an inventory of reference units. 461-464 - Hiroya Fujisaki, Sumio Ohno, Takashi Yagi:
Analysis and modeling of fundamental frequency contours of Greek utterances. 465-468 - Fernando Martinez, Daniel Tapias, Jorge Alvarez, Paloma Leon:
Characteristics of slow, average and fast speech and their effects in large vocabulary continuous speech recognition. 469-472 - Sungbok Lee, Alexandros Potamianos, Shrikanth S. Narayanan:
Analysis of children's speech: duration, pitch and formants. 473-476 - Hartmut Traunmüller, Anders Eriksson:
A method of measuring formant frequencies at high fundamental frequencies. 477-480 - Tom Brøndsted, Jens Printz Madsen:
Analysis of speaking rate variations in stress-timed languages. 481-484 - Paul Micallef, Ted Chilton:
Automatic identification of phoneme boundaries using a mixed parameter model. 485-488 - Serguei Koval, Veronika Bekasova, Mikhail Khitrov, Andrey N. Raev:
Pitch detection reliability assessment for forensic applications. 489-492 - Zhihong Hu, Etienne Barnard:
Efficient estimation of perceptual features for speech recognition. 493-496 - Narendranath Malayath, Hynek Hermansky, Alexander Kain:
Towards decomposing the sources of variability in speech. 497-500 - Rathinavelu Chengalvarayan:
Use of vector-valued dynamic weighting coefficients for speech recognition: maximum likelihood approach. 501-504 - Steve W. Beet, Ladan Baghai-Ravary:
Automatic segmentation: data-driven units of speech. 505-508 - Dejan Bajic:
On robust time-varying AR speech analysis based on t-distribution. 509-512 - Dimitris Tambakas, Iliana Tzima, Nikos Fakotakis, George Kokkinakis:
A simple phoneme energy model for the Greek language and its application to speech recognition. 513-516 - James E. H. Noad, Sandra P. Whiteside, Phil D. Green:
A macroscopic analysis of an emotional speech corpus. 517-520 - Hiroshi Shimodaira, Mitsuru Nakai, Akihiro Kumata:
Restoration of pitch pattern of speech based on a pitch generation model. 521-524 - A. V. Agranovski, O. Y. Berg, D. A. Lednov:
The research of correlation between pitch and skin galvanic reaction at change of human emotional state. 525-528 - Claude Montacié, Marie-José Caraty, Fabrice Lefèvre:
K-NN versus Gaussian in HMM-based recognition system. 529-532 - Boris Doval, Christophe d'Alessandro, Benoit Diard:
Spectral methods for voice source parameters estimation. 533-536
Speech Coding
- Olivier van der Vrecken, Nicolas Pierret, Thierry Dutoit, Vincent Pagel, Fabrice Malfrère:
A simple and efficient algorithm for the compression of MBROLA segment databases. 421-424 - Parham Zolfaghari, Tony Robinson:
A segmental formant vocoder based on linearly varying mixture of Gaussians. 425-428 - Samir Chennoukh, Daniel J. Sinder, Gaël Richard, James L. Flanagan:
Voice mimic system using an articulatory codebook for estimation of vocal tract shape. 429-432 - Damith J. Mudugamuwa, Alan B. Bradley:
Adaptive transform coding for linear predictive residual. 433-436 - Akira Takahashi, Nobuhiko Kitawaki, Paolino Usai, David Atkinson:
Performance evaluation of objective quality measures for coded speech. 437-440 - Mohamed Ismail, Keith Ponting:
Between recognition and synthesis - 300 bits/second speech coding. 441-444 - Stephane Villette, Milos Stefanovic, Ian A. Atkinson, Ahmet M. Kondoz:
High quality split-band LPC vocoder and its fixed point real time implementation. 1243-1246 - Wen-Whei Chang, Hwai-Tsu Chang, Wan-Yu Meng:
Missing packet recovery techniques for DM coded speech. 1247-1250 - Hai Le Vu, László Lois:
Spectral sensitivity of LSP parameters and their transformed coefficients. 1251-1254 - V. Ramasubramanian, Kuldip K. Paliwal:
Reducing the complexity of the LPC vector quantizer using the k-d tree search algorithm. 1255-1258 - Aweke N. Lemma, W. Bastiaan Kleijn, Ed F. Deprettere:
Quantization using wavelet based temporal decomposition of the LSF. 1259-1262 - Costas S. Xydeas, Gokhan H. Ilk:
A novel 1.7/2.4 kb/s DCT based prototype interpolation speech coding system. 1263-1266 - Yong-Soo Choi, Hong-Goo Kang, Sang-Wook Park, Jae-Ha Yoo, Dae Hee Youn:
Improved regular pulse VSELP coding of speech at low bit-rates. 1267-1270 - Yong Duk Cho, Hong Kook Kim, Moo Young Kim, Sang Ryong Kim:
Joint estimation of pitch, band magnitudes, and V/UV decisions for MBE vocoder. 1271-1274 - Balázs Kövesi, Samir Saoudi, Jean-Marc Boucher, Gábor Horváth:
A new distance measure in LPC coding: application for real time situations. 1275-1278 - Peter Veprek, Alan B. Bradley:
Consideration of processing strategies for very-low-rate compression of wideband speech signals with known text transcription. 1279-1282 - Norbert Görtz:
Zero-redundancy error protection for CELP speech codecs. 1283-1286 - Ridha Matmti, Milan Jelinek, Jean-Pierre Adoul:
Low bit rate speech coding using an improved HSX model. 1287-1290 - Carlos M. Ribeiro, Isabel Trancoso:
Phonetic vocoding with speaker adaptation. 1291-1294 - Geneviève Baudoin, Jan Cernocký, Gérard Chollet:
Quantization of spectral sequences using variable length spectral segments for speech coding at very low bit rate. 1295-1298 - Shahrokh Ghaemmaghami, Mohamed A. Deriche, Boualem Boashash:
On modeling event functions in temporal decomposition based speech coding. 1299-1302 - Soledad Torres, Francisco Javier Casajús-Quirós:
Phase quantization by pitch-cycle waveform coding in low bit rate sinusoidal coders. 1303-1306 - Antonis Botinis, Marios Fourakis, John W. Hawks:
A perceptual study of the Greek vowel space using synthetic stimuli. 1307-1310 - Woo-Jin Han, Sung-Joo Kim, Yung-Hwan Oh:
Mixed multi-band excitation coder using frequency domain mixture function (FDMF) for a low-bit rate speech coding. 1311-1314 - Tim Fingscheidt, Olaf Scheufen:
Robust GSM speech decoding using the channel decoder's soft output. 1315-1318 - Carl W. Seymour, Tony A. Robinson:
A low-bit-rate speech coder using adaptive line spectral frequency prediction. 1319-1322
Speech Synthesis Techniques
- Wen Ding, Nick Campbell:
Optimising unit selection with voice source and formants in the CHATR speech synthesis system. 537-540 - Masanobu Abe, Hideyuki Mizuno, Satoshi Takahashi, Shin'ya Nakajima:
A new framework to provide high-controllability speech signal and the development of a workbench for it. 541-544 - Eduardo Rodríguez Banga, Carmen García-Mateo, Xavier Fernández Salgado:
Shape-invariant prosodic modification algorithm for concatenative text-to-speech synthesis. 545-548 - Shaw-Hwa Hwang, Sin-Horng Chen, Saga Chang:
An RNN-based spectral information generation for Mandarin text-to-speech. 549-552 - Jan P. H. van Santen, Adam L. Buchsbaum:
Methods for optimal text selection. 553-556 - Francisco M. Gimenez de los Galanes, David Talkin:
High resolution prosody modification for speech synthesis. 557-560 - Orhan Karaali, Gerald Corrigan, Ira A. Gerson, Noel Massey:
Text-to-speech conversion with neural networks: a recurrent TDNN approach. 561-564 - Jesper Högberg:
Data driven formant synthesis. 565-568 - Simon King, Thomas Portele, Florian Höfer:
Speech synthesis using non-uniform units in the Verbmobil project. 569-572 - Isabel Trancoso, Céu Viana:
On the pronunciation mode of acronyms in several European languages. 573-576 - Toni C. M. Rietveld, Joop Kerkhoff, M. J. W. M. Emons, E. J. Meijer, Angelien Sanderman, Agaath M. C. Sluijter:
Evaluation of speech synthesis systems for Dutch in tele-communication applications in GSM and PSTN networks. 577-580 - Bianca Angelini, Claudia Barolo, Daniele Falavigna, Maurizio Omologo, Stefano Sandri:
Automatic diphone extraction for an Italian text-to-speech synthesis system. 581-584 - Eric Keller:
Simplification of TTS architecture vs. operational quality. 585-588 - Georg Fries, Antje Wirth:
Felix - a TTS system with improved pre-processing and source signal generation. 589-592 - Mike Edgington:
Investigating the limitations of concatenative synthesis. 593-596 - Luis Miguel Teixeira de Jesus, Gavin C. Cawley:
Speech coding and synthesis using parametric curves. 597-600 - Alan W. Black, Paul Taylor:
Automatically clustering similar units for unit selection in speech synthesis. 601-604 - Li Jiang, Hsiao-Wuen Hon, Xuedong Huang:
Improvements on a trainable letter-to-sound converter. 605-608 - Myungjin Bae, Kyuhong Kim, Woncheol Lee:
On a cepstral pitch alteration technique for prosody control in the speech synthesis system with high quality. 609-612 - Yannis Stylianou, Thierry Dutoit, Juergen Schroeter:
Diphone concatenation using a harmonic plus noise model of speech. 613-616
Technology for S&L Acquisition, Speech Processing Tools
- Gérard Sabah:
The "sketchboard": a dynamic interpretative memory and its use for spoken language understanding. 617-620 - Qiru Zhou, Chin-Hui Lee, Wu Chou, Andrew N. Pargellis:
Speech technology integration and research platform: a system study. 621-624 - Dieter Geller, Markus Lieb, Wolfgang Budde, Oliver Muelhens, Manfred Zinke:
Speech recognition on SPHERIC - an IC for command and control applications. 625-628 - Michael K. McCandless, James R. Glass:
MUSE: a scripting language for the development of interactive speech analysis and recognition tools. 629-632 - Silke M. Witt, Steve J. Young:
Language learning based on non-native speech recognition. 633-636 - Ute Kilian, Klaus Bader:
Task modelling by sentence templates. 637-640 - Shigeyoshi Kitazawa, Hideya Ichikawa, Satoshi Kobayashi, Yukihiro Nishinuma:
Extraction and representation of rhythmic components of spontaneous speech. 641-644 - Yoon Kim, Horacio Franco, Leonardo Neumeyer:
Automatic pronunciation scoring of specific phone segments for language instruction. 645-648 - Orith Ronen, Leonardo Neumeyer, Horacio Franco:
Automatic detection of mispronunciation for language instruction. 649-652 - Agustín Álvarez Marquina, Rafael Martínez, Victor Nieto Lluis, Victoria Rodellar, Pedro Gómez:
Continuous formant-tracking applied to visual representations of the speech and speech recognition. 653-656 - Goh Kawai, Keikichi Hirose:
A CALL system using speech recognition to train the pronunciation of Japanese long vowels, the mora nasal and mora obstruents. 657-660 - Jan Nouza, Miroslav Holada, Daniel Hajek:
An educational and experimental workbench for visual processing of speech data. 661-664 - Yong-Soo Choi, Hong-Goo Kang, Sung-Youn Kim, Young-Cheol Park, Dae Hee Youn:
A 3 channel digital CVSD bit-rate conversion system using a general purpose DSP. 665-668 - Rodolfo Delmonte, Mirela Petrea, Ciprian Bacalu:
SLIM prosodic module for learning activities in a foreign language. 669-672 - Bernhard Kaspar, Karlheinz Schuhmacher, Stefan Feldes:
Barge-in revised. 673-676 - Mohammad Akbar:
Waveedit, an interactive speech processing environment for microsoft windows platform. 677-680 - Farzad Ehsani, Jared Bernstein, Amir Najmi, Ognjen Todic:
Subarashii: Japanese interactive spoken language education. 681-684 - David Goddeau, William Goldenthal, Chris Weikart:
Deploying speech applications over the web. 685-688 - Johan Schalkwyk, Jacques de Villiers, Sarel van Vuuren, Pieter J. E. Vermeulen:
CSLUsh: an extendible research environment. 689-692 - Tibor Ferenczi, Géza Németh, Gábor Olaszy, Zoltan Gaspar:
A flexible client-server model for multilingual CTS/TTS development. 693-696 - Unto K. Laine:
Critically sampled PR filterbanks of nonuniform resolution based on block recursive FAMlet transform. 697-700 - Nobuaki Minematsu, Nariaki Ohashi, Seiichi Nakagawa:
Automatic detection of accent in English words spoken by Japanese students. 701-704 - Yasuhiro Taniguchi, Allan A. Reyes, Hideyuki Suzuki, Seiichi Nakagawa:
An English conversation and pronunciation CAI system using speech recognition technology. 705-708 - Stephen Sutton, Edward C. Kaiser, A. Cronk, Ronald A. Cole:
Bringing spoken language systems to the classroom. 709-712 - Catia Cucchiarini, Lou Boves:
Automatic assessment of foreign speakers' pronunciation of Dutch. 713-716 - John F. Holzrichter, Gregory C. Burnett:
Use of low power EM radar sensors for speech articulator measurements. 717-720 - Julien Epps, Annette Dowd, John Smith, Joe Wolfe:
Real time measurements of the vocal tract resonances during speech. 721-724
Phonetics and Phonology
- Eleonora Cavalcante Albano, Patrícia Aparecida Aquino:
Linguistic criteria for building and recording units for concatenative speech synthesis in Brazilian Portuguese. 725-728 - Knut Kvale, Arne Kjell Foldvik:
"four-and-twenty, twenty-four". what's in a number? 729-732 - João Antônio de Moraes:
Vowel nasalization in Brazilian Portuguese: an articulatory investigation. 733-736 - Elena Steriopolo:
Rhythmic organization peculiarities of the spoken text. 737-738 - Bernhard Rueber:
Obtaining confidence measures from sentence probabilities. 739-742 - Yiqing Zu:
Sentence design for speech synthesis and speech recognition database by phonetic rules. 743-746 - Christoph Draxler, Susanne Burger:
Identification of regional variants of High German from digit sequences in German telephone speech. 747-750 - Darya Kavitskaya:
Aerodynamic constraints on the production of palatalized trills: the case of the Slavic trilled [r]. 751-754 - Cheol-jae Seong, Sanghun Kim:
An experimental phonetic study of the interrelationship between prosodic phrase and syntactic structure. 755-758 - Sebastian J. G. G. Heid:
Individual differences between vowel systems of German speakers. 759-762 - Anton Batliner, Andreas Kießling, Ralf Kompe, Heinrich Niemann, Elmar Nöth:
Tempo and its change in spontaneous speech. 763-766 - Bojan Petek, Rastislav Sustarsic:
A corpus-based approach to diphthong analysis of standard Slovenian. 767-770 - Lourdes Aguilar, Julia A. Gimenez, Maria Machuca, Rafael Marín, Montse Riera:
Catalan vowel duration. 771-774 - Maria Rosaria Caputo:
The intonation of vocatives in spoken Neapolitan Italian. 775-778 - Emanuela Magno Caldognetto, Claudio Zmarich, Franco Ferrero:
A comparative acoustic study of spontaneous and read Italian speech. 779-782 - Mario Refice, Michelina Savino, Martine Grice:
A contribution to the estimation of naturalness in the intonation of Italian spontaneous speech. 783-786 - Sylvia Moosmüller:
Diphthongs and the process of monophthongization in Austrian German: a first approach. 787-790 - Steve Hoskins:
The prosody of broad and narrow focus in English: two experiments. 791-794 - Alice Turk, Laurence White:
The domain of accentual lengthening in Scottish English. 795-798 - Mariette Bessac, Geneviève Caelen-Haumont:
Spontaneous dialogue: some results about the F0 predictions of a pragmatic model of information processing. 799-802 - Didier Demolin, Bernard Teston:
Phonetic characteristics of double articulations in some Mangbutu-efe languages. 803-806 - Inmaculada Hernáez, Iñaki Gaminde, Borja Etxebarria, Pilartxo Etxebarria:
Intonation modeling for the southern dialects of the Basque language. 807-809 - Peter O'Boyle, Ji Ming, Marie Owens, Francis Jack Smith:
From phone identification to phone clustering using mutual information. 2391-2394 - Ahmed-Réda Berrah, Rafael Laboissière:
Phonetic code emergence in a society of speech robots: explaining vowel systems and the MUAF principle. 2395-2398 - Inger Moen, Hanne Gram Simonsen:
Effects of voicing on /t, d/ tongue/palate contact in English and Norwegian. 2399-2402 - Peter Ladefoged, Gunnar Fant:
Fieldwork techniques for relating formant frequency, amplitude and bandwidth. 2403-2406 - Xue Wang, Louis C. W. Pols:
Word juncture modelling based on the TIMIT database. 2407-2410 - Motoko Ueyama:
The phonology and phonetics of second language intonation: the case of "Japanese English". 2411-2414
Confidence Measures in ASR
- Pablo Fetter, Udo Haiber, Peter Regel-Brietzmann:
A low-cost phonetic transcription method. 811-814 - Lin Lawrence Chase:
Word and acoustic confidence annotation for large vocabulary speech recognition. 815-818 - Zachary Bergen, Wayne H. Ward:
A senone based confidence measure for speech recognition. 819-822 - Erica G. Bernstein, Ward R. Evans:
OOV utterance detection based on the recognizer response function. 823-826 - Thomas Kemp, Thomas Schaaf:
Estimating confidence using word lattices. 827-830 - Man-Hung Siu, Herbert Gish, Fred Richardson:
Improved estimation, evaluation and applications of confidence measures for speech recognition. 831-834
Speaker and Language Identification
- Salleh Hussain, Fergus R. McInnes, Mervyn A. Jack:
Improved speaker verification system with limited training data on telephone quality speech. 835-838 - Qi Li, Biing-Hwang Juang, Qiru Zhou, Chin-Hui Lee:
Verbal information verification. 839-842 - Sridevi V. Sarma, Victor W. Zue:
A segment-based speaker verification system using SUMMIT. 843-846 - Michael Sokolov:
Speaker verification on the world wide web. 847-850 - Johan Lindberg, Håkan Melin:
Text-prompted versus sound-prompted passwords in speaker verification systems. 851-854 - Michael Schmidt, John Golden, Herbert Gish:
GMM sample statistic log-likelihoods for text-independent speaker recognition. 855-858
Perception of Prosody
- Toni C. M. Rietveld, Carlos Gussenhoven:
The influence of phrase boundaries on perceived prominence in two-peak intonation contours. 859-862 - Johanneke Caspers:
Testing the meaning of four Dutch pitch accent types. 863-866 - Joachim Mersdorf, Thomas Domhover:
A perceptual study for modelling speaker-dependent intonation in TTS and dialog systems. 867-870 - Véronique Aubergé, Tuulikki Grepillat, Albert Rilliard:
Can we perceive attitudes before the end of sentences? the gating paradigm for prosodic contours. 871-874 - Mattias Heldner, Eva Strangert:
To what extent is perceived focus determined by F0-cues? 875-878 - David House, Dik J. Hermes, Frédéric Beaugendre:
Temporal-alignment categories of accent-lending rises and falls. 879-882
Applications of Speech Technology
- Raymond Lau, Giovanni Flammia, Christine Pao, Victor W. Zue:
Webgalaxy - integrating spoken language and hypertext navigation. 883-886 - Michael J. Carey, Eluned S. Parris, Graham Tattersall:
Pitch estimation of singing for re-synthesis and musical transcription. 887-890 - Christian Martyn Jones, Satnam Singh Dlay:
Automated lip synchronisation for human-computer interaction and special effect animation. 891-894 - Charles T. Hemphill, Yeshwant K. Muthusamy:
Developing web-based speech applications. 895-898 - Werner Verhelst:
Automatic post-synchronization of speech utterances. 899-902 - Jordi Robert-Ribes, Rami G. Mukhtar:
Automatic generation of hyperlinks between audio and transcript. 903-906 - Sebastian Möller, Rainer Schönweiler:
Analysis of infant cries for the early detection of hearing impairment. 1759-1762 - Athanassios Hatzis, Phil D. Green, S. J. Howard:
Optical logo-therapy (OLT): a computer-based real time visual feedback application for speech training. 1763-1766 - Sung-Chien Lin, Lee-Feng Chien, Ming-Chiuan Chen, Lin-Shan Lee, Keh-Jiann Chen:
Intelligent retrieval of very large Chinese dictionaries with speech queries. 1767-1770 - Fulvio Leonardi, Giorgio Micca, Sheyla Militello, Mario Nigra:
Preliminary results of a multilingual interactive voice activated telephone service for people-on-the-move. 1771-1774 - Jean-Christophe Dubois, Yolande Anglade, Dominique Fohr:
Assessment of an operational dialogue system used by a blind telephone switchboard operator. 1775-1778 - Antonio J. Rubio, Pedro García-Teodoro, Ángel de la Torre, José C. Segura, Jesús Esteban Díaz Verdejo, Maria C. Benitez, Victoria E. Sánchez, Antonio M. Peinado, Juan M. López-Soler, José L. Pérez-Córdoba:
STACC: an automatic service for information access using continuous speech recognition through telephone line. 1779-1782 - Ramón López-Cózar, Pedro García-Teodoro, Jesús Esteban Díaz Verdejo, Antonio J. Rubio:
A voice activated dialogue system for fast-food restaurant applications. 1783-1786 - Paul W. Shields, Douglas R. Campbell:
Multi-microphone sub-band adaptive signal processing for improvement of hearing aid performance. 1787-1790 - Hans Georg Piroth, Thomas Arnhold:
Tactile transmission of intonation and stress. 1791-1794 - Kerttu Huttunen, Pentti Körkkö, Martti Sorri:
Hearing impairment simulation: an interactive multimedia programme on the internet for students of speech therapy. 1795-1798 - Sorin Ciocea, Jean Schoentgen, Lise Crevier-Buchman:
Analysis of dysarthric speech by means of formant-to-area mapping. 1799-1802 - Boris Lobanov, Simon V. Brickle, Andrey V. Kubashin, Tatiana V. Levkovskaja:
An intelligent telephone answering system using speech recognition. 1803-1806 - Ulla Ackermann, Bianca Angelini, Fabio Brugnara, Marcello Federico, Diego Giuliani, Roberto Gretter, Heinrich Niemann:
Speedata: a prototype for multilingual spoken data-entry. 1807-1810 - Matti Karjalainen, Péter Boda, Panu Somervuo, Toomas Altosaar:
Applications for the hearing-impaired: evaluation of Finnish phoneme recognition methods. 1811-1814 - Nina Alarotu, Mietta Lennes, Toomas Altosaar, Anja Malm, Matti Karjalainen:
Applications for the hearing-impaired: comprehension of Finnish text with phoneme errors. 1815-1818 - Ute Ehrlich, Gerhard Hanrieder, Ludwig Hitzenberger, Paul Heisterkamp, Klaus Mecklenburg, Peter Regel-Brietzmann:
Access - automated call center through speech understanding system. 1819-1822 - E. Richard Anthony, Charles Bowen, Margot T. Peet, Susan G. Tammaro:
Integrating a radio model with a spoken language interface for military simulations. 1823-1826 - Daniele Falavigna, Roberto Gretter:
On field experiments of continuous digit recognition over the telephone network. 1827-1830 - Xavier Menéndez-Pidal, James B. Polikoff, H. Timothy Bunnell:
An HMM-based phoneme recognizer applied to assessment of dysarthric speech. 1831-1834 - Celinda de la Torre, Gonzalo Alonso:
Multiapplication platform based on technology for mobile telephone network services. 1835-1838 - Els den Os, Lou Boves, David A. James, Richard Winski, Kurt Fridh:
Field test of a calling card service based on speaker verification and automatic speech recognition. 1839-1842 - Luc E. Julia, Adam Cheyer:
Speech: a privileged modality. 1843-1846
Spontaneous Speech Recognition
- Jean-Luc Gauvain, Lori Lamel, Gilles Adda, Martine Adda-Decker:
Transcription of broadcast news. 907-910 - Fil Alleva, Xuedong Huang, Mei-Yuh Hwang, Li Jiang:
Can continuous speech recognizers handle isolated speech? 911-914 - Tatsuo Matsuoka, Yuichi Taguchi, Katsutoshi Ohtsuki, Sadaoki Furui, Katsuhiko Shirai:
Toward automatic transcription of Japanese broadcast news. 915-918 - Mauro Cettolo, Anna Corazza:
Automatic detection of semantic boundaries. 919-922 - Etienne Bauche, Bojana Gajic, Yasuhiro Minami, Tatsuo Matsuoka, Sadaoki Furui:
Connected digit recognition in spontaneous speech. 923-926 - Francis Kubala, Hubert Jin, Spyros Matsoukas, Long Nguyen, Richard M. Schwartz, John Makhoul:
Advances in transcription of broadcast news. 927-930
Language Specific Segmental Features
- Tina Cambier-Langeveld, Marina Nespor, Vincent J. van Heuven:
The domain of final lengthening in production and perception in Dutch. 931-934 - Christine Meunier:
Voicing assimilation as a cue for cluster identification. 935-938 - Saskia te Riele, Manon Loef, Olga van Herwijnen:
On the perceptual relevance of degemination in Dutch. 939-942 - Cécile Fougeron, Donca Steriade:
Does deletion of French SCHWA lead to neutralization of lexical distinctions? 943-946 - Marielle Bruyninckx, Bernard Harmegnies:
An approach of the Catalan palatals discrimination based on durational patterns of spectral evolution. 947-950 - Jerneja Gros, Nikola Pavesic, France Mihelic:
Syllable and segment duration at different speaking rates in the Slovenian language. 951-954
Speaker Recognition
- Wei-Ying Li, Douglas D. O'Shaughnessy:
Hybrid networks based on RBFN and GMM for speaker recognition. 955-958 - Jialong He, Li Liu, Günther Palm:
A discriminative training algorithm for Gaussian mixture speaker models. 959-962 - Douglas A. Reynolds:
Comparison of background normalization methods for text-independent speaker verification. 963-966 - Owen Kimball, Michael Schmidt, Herbert Gish, Jason Waterman:
Speaker verification with limited enrollment data. 967-970 - Frédéric Bimbot, Hans-Peter Hutter, Cédric Jaboulet, Johan Koolwaaij, Johan Lindberg, Jean-Benoît Pierrot:
Speaker verification in the telephone network: research activities in the cave project. 971-974 - Mark Kuitert, Lou Boves:
Speaker verification with GSM coded telephone speech. 975-978 - Aaron E. Rosenberg, Sarangarajan Parthasarathy:
Speaker identification with user-selected password phrases. 1371-1374 - Jesper Ø. Olsen:
Speaker verification based on phonetic decision making. 1375-1378 - Aladdin M. Ariyaeeinia, P. Sivakumaran:
Analysis and comparison of score normalisation methods for text-dependent speaker verification. 1379-1382 - Frédéric Jauquet, Patrick Verlinde, Claude Vloeberghs:
Automatic speaker recognition on a vocoder link. 1383-1386 - Frédéric Bimbot, Dominique Genoud:
Likelihood ratio adjustment for the compensation of model mismatch in speaker verification. 1387-1390 - M. Kemal Sönmez, Larry P. Heck, Mitchel Weintraub, Elizabeth Shriberg:
A lognormal tied mixture model of pitch for prosody based speaker recognition. 1391-1394
Speech Synthesis: Linguistic Analysis
- Nick Campbell, Tony Hebert, Ezra Black:
Parsers, prominence, and pauses. 979-982 - Frédéric Béchet, Marc El-Bèze:
Automatic assignment of part-of-speech to out-of-vocabulary words for text-to-speech processing. 983-986 - Barbara Gili Fivela, Silvia Quazza:
Text-to-prosody parsing in an Italian speech synthesizer: recent improvements. 987-990 - Brigitte Krenn:
Tagging syllables. 991-994 - Alan W. Black, Paul Taylor:
Assigning phrase breaks from part-of-speech sequences. 995-998 - Christina Widera, Thomas Portele, Maria Wolters:
Prediction of word prominence. 999-1002
Speech Analysis and Modelling
- Hisao Kuwabara:
Acoustic and perceptual properties of phonemes in continuous speech as a function of speaking rate. 1003-1006 - Shrikanth S. Narayanan, Abeer Alwan, Yong Song:
New results in vowel production: MRI, EPG, and acoustic data. 1007-1010 - Takayuki Arai, Steven Greenberg:
The temporal properties of spoken Japanese are similar to those of English. 1011-1014 - Anna Esposito:
The amplitudes of the peaks in the spectrum: data from /a/ context. 1015-1018 - Natalija Bolfan-Stosic, Mladen Hedjever:
Acoustical characteristics of speech and voice in speech pathology. 1019-1022 - Andreas Kipp, Maria-Barbara Wesenick, Florian Schiel:
Pronunciation modeling applied to automatic segmentation of spontaneous speech. 1023-1026 - Simon Downey, Richard Wiseman:
Dynamic and static improvements to lexical baseforms. 1027-1030 - Andreas Hauenstein:
Signal driven generation of word baseforms from few examples. 1031-1034 - Elizabeth C. Botha, Louis C. W. Pols:
Modeling the acoustic differences between L1 and L2 speech: the short vowels of Afrikaans and South African English. 1035-1038 - Béatrice Vaxelaire, Rudolph Sock:
Laryngeal movements and speech rate: an x-ray investigation. 1039-1042 - Anders Eriksson, Pär Wretling:
How flexible is the human voice? - a case study of mimicry. 1043-1046 - Helmer Strik:
The effect of low-pass filtering on estimated voice source parameters. 1047-1050 - Susan M. Fosnot:
Vowel development of /i/ and /u/ in 15-36 month old children at risk and not at risk to stutter. 1051-1054 - Alan Wrench, Alan D. McIntosh, William J. Hardcastle:
Optopalatograph: development of a device for measuring tongue movement in 3D. 1055-1058 - Juana M. Gutiérrez-Arriola, Francisco M. Gimenez de los Galanes, Mohammad Hasan Savoji, José Manuel Pardo:
Speech synthesis and prosody modification using segmentation and modelling of the excitation signal. 1059-1062 - Christophe Savariaux, Louis-Jean Boë, Pascal Perrier:
How can the control of the vocal tract limit the speaker's capability to produce the ultimate perceptive objectives of speech? 1063-1066 - Goran S. Jovanovic:
A step toward general model for symbolic description of the speech signal. 1067-1070 - Kiyoshi Furukawa, Masayuki Nakazawa, Takashi Endo, Ryuichi Oka:
Referring in long term speech by using orientation patterns obtained from vector field of spectrum pattern. 1071-1074
Dialogue Systems: Design and Applications
- J. Barnett, Stephen W. Anderson, J. Broglio, Mona Singh, Randy Hudson, S. W. Kuo:
Experiments in spoken queries for document retrieval. 1323-1326 - Frank Seide, Andreas Kellner:
Towards an automated directory information system. 1327-1330 - Lars Bo Larsen:
A strategy for mixed-initiative dialogue control. 1331-1334 - Jim Hugunin, Victor W. Zue:
On the design of effective speech-based interfaces for desktop applications. 1335-1338 - Matthias Denecke, Alex Waibel:
Dialogue strategies guiding users to their communicative goals. 1339-1342 - Sunil Issar:
A speech interface for forms on WWW. 1343-1346 - Giovanni Flammia, Victor W. Zue:
Learning the structure of mixed initiative dialogues using a corpus of annotated conversations. 1871-1874 - Roberto Pieraccini, Esther Levin, Wieland Eckert:
AMICA: the AT&T mixed initiative conversational architecture. 1875-1878 - Alicia Abella, Allen L. Gorin:
Generating semantically consistent inputs to a dialog manager. 1879-1882 - Esther Levin, Roberto Pieraccini:
A stochastic model of computer-human interaction for learning dialogue strategies. 1883-1886 - Manuela Boros, Maria Aretoulaki, Florian Gallwitz, Elmar Nöth, Heinrich Niemann:
Semantic processing of out-of-vocabulary words in a spoken dialogue system. 1887-1890 - Elisabeth Maier:
Clarification dialogues in VERBMOBIL. 1891-1894
Speech Production Modelling
- Levent M. Arslan, David Talkin:
Voice conversion by codebook mapping of line spectral frequencies and excitation spectrum. 1347-1350 - Chafic Mokbel, Guillaume Gravier, Gérard Chollet:
Optimal state dependent spectral representation for HMM modeling: a new theoretical framework. 1351-1354 - Alexandros Potamianos, Petros Maragos:
Speech analysis and synthesis using an AM-FM modulation model. 1355-1358 - Khaled Mawass, Pierre Badin, Gérard Bailly:
Synthesis of fricative consonants by audiovisual-to-articulatory inversion. 1359-1362 - Tom Claes, Ioannis Dologlou, Louis ten Bosch, Dirk Van Compernolle:
New transformations of cepstral parameters for automatic vocal tract length normalization in speech recognition. 1363-1366 - Simon Dobrisek, France Mihelic, Nikola Pavesic:
A multiresolutionally oriented approach for determination of cepstral features in speech recognition. 1367-1370
Speech Enhancement and Noise Mitigation
- Tim Haulick, Klaus Linhard, Peter Schrogmeier:
Residual noise suppression using psychoacoustic criteria. 1395-1398 - B. Yegnanarayana, Carlos Avendaño, Hynek Hermansky, P. Satyanarayana Murthy:
Processing linear prediction residual for speech enhancement. 1399-1402 - Stefan Gustafsson, Rainer Martin:
Combined acoustic echo control and noise reduction for mobile communications. 1403-1406 - Ki Yong Lee, JaeYeol Rheem:
A nonstationary autoregressive HMM and its application to speech enhancement. 1407-1410 - Néstor Becerra Yoma, Fergus R. McInnes, Mervyn A. Jack:
Spectral subtraction and mean normalization in the context of weighted matching algorithms. 1411-1414 - Dionysis E. Tsoukalas, John Mourjopoulos, George Kokkinakis:
Improving the intelligibility of noisy speech using an audible noise suppression technique. 1415-1418 - Laurent Girin, Gang Feng, Jean-Luc Schwartz:
Noisy speech enhancement by fusion of auditory and visual information: a study of vowel transitions. 2555-2558 - Andreas Engelsberg, Thomas Gülzow:
Spectral subtraction using a non-critically decimated discrete wavelet transform. 2559-2562 - Jen-Tzung Chien, Hsiao-Chuan Wang, Chin-Hui Lee:
Bayesian affine transformation of HMM parameters for instantaneous and supervised adaptation in telephone speech recognition. 2563-2566 - Craig Lawrence, Mazin G. Rahim:
Integrated bias removal techniques for robust speech recognition. 2567-2570 - Detlev Langmann, Alexander Fischer, Friedhelm Wuppermann, Reinhold Haeb-Umbach, Thomas Eisele:
Acoustic front ends for speaker-independent digit recognition in car environments. 2571-2574 - Lionel Delphin-Poulat, Chafic Mokbel:
Signal bias removal using the multi-path stochastic equalization technique. 2575-2578 - Andrej Miksic, Bogomir Horvat:
Subband echo cancellation in automatic speech dialog systems. 2579-2582 - Hesham Tolba, Douglas D. O'Shaughnessy:
Speech enhancement via energy separation. 2583-2586 - Masashi Unoki, Masato Akagi:
A method of signal extraction from noisy signal. 2587-2590 - Jiri Sika, Vratislav Davidek:
Multi-channel noise reduction using wavelet filter bank. 2591-2594 - Imad Abdallah, Silvio Montrésor, Marc Baudry:
Speech signal detection in noisy environment using a local entropic criterion. 2595-2598 - Pedro J. Moreno, Brian S. Eberman:
A new algorithm for robust speech recognition: the delta vector Taylor series approach. 2599-2602 - David R. Cole, Miles Moody, Sridha Sridharan:
Robust enhancement of reverberant speech using iterative noise removal. 2603-2606 - D. J. Jones, Scott D. Watson, Kenneth G. Evans, Barry M. G. Cheetham, R. A. Reeve:
A network speech echo canceller with comfort noise. 2607-2610 - Amir Hussain, Douglas R. Campbell, Thomas J. Moir:
A new metric for selecting sub-band processing in adaptive speech enhancement systems. 2611-2614 - Hidefumi Kobatake, Hideta Suzuki:
Estimation of LPC cepstrum vector of speech contaminated by additive noise and its application to speech enhancement. 2615-2618 - Sangita Tibrewala, Hynek Hermansky:
Multi-band and adaptation approaches to robust speech recognition. 2619-2622 - Enrique Masgrau, Eduardo Lleida, Luis Vicente:
Non-quadratic criterion algorithms for speech enhancement. 2623-2626
Spoken Language Understanding
- Jeremy H. Wright, Allen L. Gorin, Giuseppe Riccardi:
Automatic acquisition of salient grammar fragments for call-type classification. 1419-1422 - Wolfgang Minker:
Stochastically-based natural language understanding across tasks and languages. 1423-1426 - Michael Riley, Fernando Pereira, Mehryar Mohri:
Transducer composition for context-dependent network expansion. 1427-1430 - Christian Lieske, Johan Bos, Martin C. Emele, Björn Gambäck, C. J. Rupp:
Giving prosody a meaning. 1431-1434 - Kishore Papineni, Salim Roukos, Todd Ward:
Feature-based language understanding. 1435-1438 - Juan-Carlos Amengual, José-Miguel Benedí, Klaus Beulen, Francisco Casacuberta, M. Asunción Castaño, Antonio Castellanos, Víctor M. Jiménez, David Llorens, Andrés Marzal, Hermann Ney, Federico Prat, Enrique Vidal, Juan Miguel Vilar:
Speech translation based on automatically trainable finite-state models. 1439-1442
Language Model Adaptation
- Yoshihiko Gotoh, Steve Renals:
Document space models using latent semantic analysis. 1443-1446 - Sven C. Martin, Jörg Liermann, Hermann Ney:
Adaptive topic-dependent language modelling using word-based varigrams. 1447-1450 - Jerome R. Bellegarda:
A latent semantic analysis framework for large-span language modeling. 1451-1454 - Richard M. Schwartz, Toru Imai, Francis Kubala, Long Nguyen, John Makhoul:
A maximum likelihood model for topic classification of broadcast news. 1455-1458 - Cosmin Popovici, Paolo Baggia:
Language modelling for task-oriented domains. 1459-1462 - Sung-Chien Lin, Chi-Lung Tsai, Lee-Feng Chien, Keh-Jiann Chen, Lin-Shan Lee:
Chinese language model adaptation based on document classification and multiple domain-specific language models. 1463-1466
Prosody and Speech Recognition/Understanding
- Philippe Langlais:
Estimating prosodic weights in a syntactic-rhythmical prediction system. 1467-1470 - Kazuhiko Ozeki, Kazuyuki Kousaka, Yujie Zhang:
Syntactic information contained in prosodic features of Japanese utterances. 1471-1474 - Grace Chung, Stephanie Seneff:
Hierarchical duration modelling for speech recognition using the ANGIE framework. 1475-1478 - Volker Strom, Anja Elsner, Wolfgang Hess, Walter Kasper, Alexandra Klein, Hans-Ulrich Krieger, Jörg Spilker, Hans Weber, Günther Görz:
On the use of prosody in a speech-to-speech translator. 1479-1482 - Vincent J. van Heuven, Judith Haan, Jos J. A. Pacilly:
Automatic recognition of sentence type from prosody in Dutch. 1483-1486 - Paul Munteanu, Bertrand Caillaud, Jean-François Serignat, Geneviève Caelen-Haumont:
Automatic word demarcation based on prosody. 1487-1490
Wideband Speech Coding
- Akitoshi Kataoka, Sachiko Kurihara, Shigeaki Sasaki, Shinji Hayashi:
A 16-kbit/s wideband speech codec scalable with G.729. 1491-1494 - M. Lynch, Eliathamby Ambikairajah, Andrew Davis:
Comparison of auditory masking models for speech coding. 1495-1498 - A. Amodio, Gang Feng:
Wideband speech coding based on the MBE structure. 1499-1502 - Marcos Perreau Guimaraes, Nicolas Moreau, Madeleine Bonnet:
Perceptual filter comparisons for wideband and FM bandwidth audio coders. 1503-1506 - Cheung-Fat Chan, Man-Tak Chu:
Wideband coding of speech using neural network gain adaptation. 1507-1510 - Josep M. Salavedra:
Wideband-speech APVQ coding from 16 to 32 kbps. 1511-1514
Speech Recognition in Adverse Environments CSR and Error Analysis
- Wei-Wen Hung, Hsiao-Chuan Wang:
A comparative analysis of blind channel equalization methods for telephone speech recognition. 1515-1518 - Wei-Wen Hung, Hsiao-Chuan Wang:
HMM retraining based on state duration alignment for noisy speech recognition. 1519-1522 - Yasuhiro Komori, Tetsuo Kosaka, Hiroki Yamamoto, Masayuki Yamada:
Fast parallel model combination noise adaptation processing. 1523-1526 - Takashi Endo, Shigeki Nagaya, Masayuki Nakazawa, Kiyoshi Furukawa, Ryuichi Oka:
Speech recognition module for CSCW using a microphone array. 1527-1530 - Jiqing Han, Munsung Han, Gyu-Bong Park, Jeongue Park, Wen Gao:
Relative mel-frequency cepstral coefficients compensation for robust telephone speech recognition. 1531-1534 - Seiichi Yamamoto, Masaki Naito, Shingo Kuroiwa:
Robust speech detection method for speech recognition system for telecommunication networks and its field trial. 1535-1538 - Laurent Mauuary, Lamia Karray:
The tuning of speech detection in the context of a global evaluation of a voice response system. 1539-1542 - C. Julian Chen, Ramesh A. Gopinath, Michael D. Monkowski, Michael A. Picheny, Katherine Shen:
New methods in continuous Mandarin speech recognition. 1543-1546 - Michelle S. Spina, Victor W. Zue:
Automatic transcription of general audio data: effect of environment segmentation on phonetic recognition. 1547-1550 - Alfred Ying Pang Ng, Lai-Wan Chan, P. C. Ching:
Automatic recognition of continuous Cantonese speech with very large vocabulary. 1551-1554 - Yifan Gong:
Source normalization training for HMM applied to noisy telephone speech recognition. 1555-1558 - Joao P. Neto, Ciro Martins, Luís B. Almeida:
The development of a speaker independent continuous speech recognizer for Portuguese. 1559-1562 - Lin Lawrence Chase:
Blame assignment for errors made by large vocabulary speech recognizers. 1563-1566 - Atsushi Nakamura:
Predicting speech recognition performance. 1567-1570 - Scott D. Watson, Barry M. G. Cheetham, P. A. Barrett, W. T. K. Wong, A. V. Lewis:
A voice activity detector for the ITU-T 8 kbit/s speech coding standard G.729. 1571-1574 - Yeshwant K. Muthusamy, John J. Godfrey:
Vocabulary-independent recognition of American Spanish phrases and digit strings. 1575-1578 - Michael Meyer, Hermann Hild:
Recognition of spoken and spelled proper names. 1579-1582 - Takao Kobayashi, Takashi Masuko, Keiichi Tokuda:
HMM compensation for noisy speech recognition based on cepstral parameter generation. 1583-1586 - George Nokas, Evangelos Dermatas, George Kokkinakis:
On the robustness of the critical-band adaptive filtering method for multi-source noisy speech recognition. 1587-1590 - Cuntai Guan, Shu-hung Leung, Wing Hong Lau:
A space transformation approach for robust speech recognition in noisy environments. 1591-1594 - Tzur Vaich, Arnon Cohen:
Robust isolated word recognition using WSP-PMC combination. 1595-1598
Multimodal Speech Processing, Emerging Techniques and Applications
- Spyros Raptis, George Carayannis:
Fuzzy logic for rule-based formant speech synthesis. 1599-1602 - Pierre Jourlin, Juergen Luettin, Dominique Genoud, Hubert Wassner:
Integrating acoustic and labial information for speaker identification and verification. 1603-1606 - Kenney Ng, Victor W. Zue:
Subword unit representations for spoken document retrieval. 1607-1610 - Pascal Teissier, Jean-Luc Schwartz, Anne Guérin-Dugué:
Non-linear representations, sensor reliability estimation and context-dependent fusion in the audiovisual recognition of speech in noise. 1611-1614 - Philippe Renevey, Andrzej Drygajlo:
Securized flexible vocabulary voice messaging system on unix workstation with ISDN connection. 1615-1618 - Houda Mokbel, Denis Jouvet:
Automatic derivation of multiple variants of phonetic transcriptions from acoustic signals. 1619-1622 - Satoshi Nakamura, Ron Nagai, Kiyohiro Shikano:
Improved bimodal speech recognition using tied-mixture HMMs and 5000 word audio-visual synchronous database. 1623-1626 - Philippe Depambour, Régine André-Obrecht, Bernard Delyon:
On the use of phone duration and segmental processing to label speech signal. 1627-1630 - Martin Paping, Thomas Fahnle:
Automatic detection of disturbing robot voice- and ping pong-effects in GSM transmitted speech. 1631-1634 - Joseph Di Martino:
Speech synthesis using phase vocoder techniques. 1635-1638 - Ramesh R. Sarukkai, Craig Hunter:
Integration of eye fixation information with speech recognition systems. 1639-1642 - Yoshihisa Nakatoh, Mineo Tsushima, Takeshi Norimatsu:
Generation of broadband speech from narrowband speech using piecewise linear mapping. 1643-1646 - Ian E. C. Rogers:
An assessment of the benefits active noise reduction systems provide to speech intelligibility in aircraft noise environments. 1647-1650 - Jonas Beskow, Kjell Elenius, Scott McGlashan:
OLGA - a dialogue system with an animated talking agent. 1651-1654 - Sandrine Robbe, Noelle Carbonell, Claude Valot:
Towards usable multimodal command languages: definition and ergonomic assessment of constraints on users' spontaneous speech and gestures. 1655-1658 - Bernhard Suhm, Alex Waibel:
Exploiting repair context in interactive error recovery. 1659-1662 - Lionel Revéret, Frederique Garcia, Christian Benoît, Eric Vatikiotis-Bateson:
An hybrid image processing approach to liptracking independent of head orientation. 1663-1666 - Bertrand Le Goff:
Automatic modeling of coarticulation in text-to-visual speech synthesis. 1667-1670 - Ali Adjoudani, Thierry Guiard-Marigny, Bertrand Le Goff, Lionel Revéret, Christian Benoît:
A multimedia platform for audio-visual speech processing. 1671-1674 - Hiroya Fujisaki, Hiroyuki Kameda, Sumio Ohno, Takuya Ito, Ken Tajima, Kenji Abe:
An intelligent system for information retrieval over the internet through spoken dialogue. 1675-1678 - Yasemin Yardimci, A. Enis Çetin, Rashid Ansari:
Data hiding in speech using phase coding. 1679-1682 - Denis Burnham, John Fowler, Michelle Nicol:
CAVE: an on-line procedure for creating and running auditory-visual speech perception experiments-hardware, software, and advantages. 1683-1686
Databases, Tools and Evaluations
- Florian Schiel, Christoph Draxler, Hans G. Tillmann:
The Bavarian Archive for Speech Signals: resources for the speech community. 1687-1690 - Christoph Draxler:
WWWTranscribe - a modular transcription system based on the world wide web. 1691-1694 - Inger S. Engberg, Anya Varnich Hansen, Ove Andersen, Paul Dalsgaard:
Design, recording and verification of a Danish emotional speech database. 1695-1698 - Maxine Eskénazi, Christopher Hogan, Jeffrey Allen, Robert E. Frederking:
Issues in database creation: recording new populations, faster and better labelling. 1699-1702 - Stefan Feldes, Bernhard Kaspar, Denis Jouvet:
Design and analysis of a German telephone speech database for phoneme based training. 1703-1706 - João Paulo Neto, Ciro Martins, Hugo Meinedo, Luís B. Almeida:
The design of a large vocabulary speech corpus for Portuguese. 1707-1710 - Lennart Nord, Britta Hammarberg, Elisabet Lundstrom:
Continued investigations of laryngectomee speech in noise - measurements and intelligibility tests. 1711-1714 - Léon J. M. Rothkrantz, W. A. Th. Manintveld, M. M. M. Rats, Robert J. van Vark, J. P. M. de Vreught, Henk Koppelaar:
An appreciation study of an ASR inquiry system. 1715-1718 - Kamel Bensaber, Paul Munteanu, Jean-François Serignat, Pascal Perrier:
Object-oriented modeling of articulatory data for speech research information systems. 1719-1722 - Woosung Kim, Myoung-Wan Koo:
A Korean speech corpus for train ticket reservation aid system based on speech recognition. 1723-1726 - Dawn Dutton, Candace A. Kamm, Susan J. Boyce:
Recall memory for earcons. 1727-1730 - Odile Mella, Dominique Fohr:
Semi-automatic phonetic labelling of large corpora. 1731-1734 - Stefan Grocholewski:
CORPORA - speech database for Polish diphones. 1735-1738 - Christel Müller, Thomas Ziem:
Multilingual speech interfaces (MSI) and dialogue design environments for computer telephony services. 1739-1742 - John H. L. Hansen, Sahar E. Bou-Ghazale:
Getting started with SUSAS: a speech under simulated and actual stress database. 1743-1746 - Richard Sproat, Paul Taylor, Michael Tanenblatt, Amy Isard:
A markup language for text-to-speech synthesis. 1747-1750 - Shuichi Itahashi, Naoko Ueda, Mikio Yamamoto:
Several measures for selecting suitable speech corpora. 1751-1754 - Irene Chatzi, Nikos Fakotakis, George Kokkinakis:
Greek speech database for creation of voice driven teleservices. 1755-1758
Speaker Adaptation I
- Qiang Huo, Chin-Hui Lee:
Combined on-line model adaptation and Bayesian predictive classification for robust speech recognition. 1847-1850 - Xavier L. Aubert, Eric Thelen:
Speaker adaptive training applied to continuous mixture density modeling. 1851-1854 - Irina Illina, Yifan Gong:
Speaker normalization training for mixture stochastic trajectory model. 1855-1858 - Vassilios Digalakis:
On-line adaptation of hidden Markov models using incremental estimation algorithms. 1859-1862 - Ashvin Kannan, Mari Ostendorf:
Modeling dependency in adaptation of acoustic models using multiscale tree processes. 1863-1866 - Larry P. Heck, Ananth Sankar:
Acoustic clustering and adaptation for robust speech recognition. 1867-1870
Assessment Methods
- Alvin F. Martin, George R. Doddington, Terri Kamm, Mark Ordowski, Mark A. Przybocki:
The DET curve in assessment of detection task performance. 1895-1898 - Harald Klaus, Ekkehard Diedrich, Astrid Dehnel, Jens Berger:
Speech quality evaluation of hands-free terminals. 1899-1902 - David S. Pallett, Jonathan G. Fiscus, William M. Fisher, John S. Garofolo:
Use of broadcast news materials for speech recognition benchmark tests. 1903-1906 - Norman M. Fraser:
Spoken dialogue system evaluation: a first framework for reporting results. 1907-1910 - Niels Ole Bernsen, Hans Dybkjær, Laila Dybkjær, Vytautas Zinkevicius:
Generality and transferability: two issues in putting a dialogue evaluation tool into practical use. 1911-1914 - David A. van Leeuwen, Herman J. M. Steeneken:
Within-speaker variability of the word error rate for a continuous speech recognition system. 1915-1918
Education for Language and Speech Communication
- Mark A. Huckvale, Christian Benoît, Chris Bowerman, Anders Eriksson, Mike Rosner, Mark Tatham, Briony Williams:
Opportunities for computer-aided instruction in phonetics and speech communication provided by the internet. 1919-1922 - Gerrit Bloothooft:
The landscape of future education in speech communication sciences. 1923-1926 - Kåre Sjölander, Joakim Gustafson:
An integrated system for teaching spoken dialogue systems technology. 1927-1930 - Janet Beck, Bernard Camilleri, Hilde Chantrain, Anu Klippi, Marianne Leterme, Matti Lehtihalmes, Peter Schneider, Wilhelm Vieregge, Eva Wigforss:
Communication science within education for logopedics/speech and language therapy in Europe: the state of the art. 1931-1934 - Phil D. Green, Carlos Espain:
Education in spoken language engineering in Europe. 1935-1938 - Valérie Hazan, Wim A. van Dommelen:
A survey of phonetics education in Europe. 1939-1942
Hybrid Systems for ASR
- Xin Tu, Yonghong Yan, Ronald A. Cole:
Matching training and testing criteria in hybrid speech recognition systems. 1943-1946 - Stéphane Dupont, Christophe Ris, Olivier Deroo, Vincent Fontaine, Jean-Marc Boite, L. Zanoni:
Context independent and context dependent hybrid HMM/ANN systems for vocabulary independent tasks. 1947-1950 - Jean Hennebert, Christophe Ris, Hervé Bourlard, Steve Renals, Nelson Morgan:
Estimation of global posteriors and forward-backward training of hybrid HMM/ANN systems. 1951-1954 - Gethin Williams, Steve Renals:
Confidence measures for hybrid HMM/ANN speech recognition. 1955-1958 - Gary D. Cook, Steve R. Waterhouse, Anthony J. Robinson:
Ensemble methods for connectionist acoustic modelling. 1959-1962 - Jürgen Fritsch, Michael Finke:
Improving performance on switchboard by combining hybrid HME/HMM and mixture of Gaussians acoustic models. 1963-1966
Topic and Dialogue Dependent Language Modelling
- Petra Witschel, Harald Höge:
Experiments in adaptation of language models for commercial applications. 1967-1970 - Reinhard Kneser, Jochen Peters, Dietrich Klakow:
Language model adaptation using dynamic marginals. 1971-1974 - Rukmini Iyer, Mari Ostendorf:
Transforming out-of-domain estimates to improve in-domain language models. 1975-1978 - P. Srinivasa Rao, Satya Dharanipragada, Salim Roukos:
MDI adaptation of language models across corpora. 1979-1982 - Klaus Ries:
A class based approach to domain adaptation and constraint integration for empirical m-gram models. 1983-1986 - Kristie Seymore, Ronald Rosenfeld:
Using story topics for language model adaptation. 1987-1990
Lipreading
- Juergen Luettin:
Towards speaker independent continuous speechreading. 1991-1994 - William Goldenthal, Keith Waters, Jean-Manuel Van Thong, Oren Glickman:
Driving synthetic mouth gestures: phonetic recognition for faceme! 1995-1998 - Alexandrina Rogozan, Paul Deléglise:
Continuous visual speech recognition using geometric lip-shape models and neural networks. 1999-2002 - Jonas Beskow, Martin Dahlquist, Björn Granström, Magnus Lundeberg, Karl-Erik Spens, Tobias Öhman:
The teleface project multi-modal speech-communication for the hearing impaired. 2003-2006 - Rainer Stiefelhagen, Uwe Meier, Jie Yang:
Real-time lip-tracking for lipreading. 2007-2010 - Lionel Revéret:
From raw images of the lips to articulatory parameters: a viseme-based prediction. 2011-2014
Articulatory Modelling
- Bruno Mathieu, Yves Laprie:
Adaptation of Maeda's model for acoustic to articulatory inversion. 2015-2018 - Yohan Payan, Pascal Perrier:
Why should speech control studies based on kinematics be considered with caution? insights from a 2d biomechanical model of the tongue. 2019-2022 - Vittorio Sanguineti, Rafael Laboissière, David J. Ostry:
An integrated model of the biomechanics and neural control of the tongue, jaw, hyoid and larynx system. 2023-2026 - Moshrefi Mohammad, E. Moore, John N. Carter, Christine H. Shadle, Steve R. Gunn:
Using MRI to image the moving vocal tract during speech. 2027-2030 - Eric Vatikiotis-Bateson, Hani Yehia:
Unified physiological model of audible-visible speech production. 2031-2034 - Hélène Loevenbruck, Pascal Perrier:
Motor control information recovering from the dynamics with the EP hypothesis. 2035-2038
Front-Ends and Adaptation to Acoustics, Speaker Adaptation
- Yasuhiro Komori, Tetsuo Kosaka, Masayuki Yamada, Hiroki Yamamoto:
Speaker adaptation for context-dependent HMM using spatial relation of both phoneme context hierarchy and speakers. 2039-2042 - Masayuki Yamada, Yasuhiro Komori, Tetsuo Kosaka, Hiroki Yamamoto:
Fast algorithm for speech recognition using speaker cluster HMM. 2043-2046 - Timothy J. Hazen, James R. Glass:
A comparison of novel techniques for instantaneous speaker adaptation. 2047-2050 - Yoshikazu Yamaguchi, Satoshi Takahashi, Shigeki Sagayama:
Fast adaptation of acoustic models to environmental noise using Jacobian adaptation algorithm. 2051-2054 - Ilija Zeljkovic, Shrikanth S. Narayanan, Alexandros Potamianos:
Unsupervised HMM adaptation based on speech-silence discrimination. 2055-2058 - Mohamed Afify, Yifan Gong, Jean Paul Haton:
Correlation based predictive adaptation of hidden Markov models. 2059-2062 - Vassilios Diakoloukas, Vassilios Digalakis:
Adaptation of hidden Markov models using multiple stochastic transformations. 2063-2066 - Mark J. F. Gales:
Transformation smoothing for speaker and environmental adaptation. 2067-2070 - Vincent Fontaine, Christophe Ris, Jean-Marc Boite:
Nonlinear discriminant analysis for improved speech recognition. 2071-2074 - Jürgen Tchorz, Klaus Kasper, Herbert Reininger, Birger Kollmeier:
On the interplay between auditory-based features and locally recurrent neural networks for robust speech recognition in noise. 2075-2078 - Nelson Morgan, Eric Fosler-Lussier, Nikki Mirghafori:
Speech recognition using on-line estimation of speaking rate. 2079-2082 - John N. Holmes, Wendy J. Holmes, Philip N. Garner:
Using formant frequencies in speech recognition. 2083-2086 - Puming Zhan, Martin Westphal, Michael Finke, Alex Waibel:
Speaker normalization and speaker adaptation - a combination for conversational speech recognition. 2087-2090 - Yuqing Gao, Mukund Padmanabhan, Michael Picheny:
Speaker adaptation based on pre-clustering training speakers. 2091-2094 - Mike Lincoln, Stephen Cox, Simon Ringland:
A fast method of speaker normalisation using formant estimation. 2095-2098 - Lutz Welling, N. Haberland, Hermann Ney:
Acoustic front-end optimization for large vocabulary speech recognition. 2099-2102 - Beth T. Logan, Anthony J. Robinson:
Improving autoregressive hidden Markov model recognition accuracy using a non-linear frequency scale with application to speech enhancement. 2103-2106 - Tsuneo Nitta, Akinori Kawamura:
Designing a reduced feature-vector set for speech recognition by using KL/GPD competitive training. 2107-2110 - Scott Shaobing Chen, Peter DeSouza:
Speaker adaptation by correlation (ABC). 2111-2114
Speech Perception
- William A. Ainsworth, Georg F. Meyer:
Preliminary experiments on the perception of double semivowels. 2115-2118 - Niels O. Schiller:
Does syllable frequency affect production time in a delayed naming task? 2119-2122 - Andrew C. Morris, Gerrit Bloothooft, William J. Barry, Bistra Andreeva, Jacques C. Koreman:
Human and machine identification of consonantal place of articulation from vocalic transition segments. 2123-2126 - Jon Barker, Martin Cooke:
Modelling the recognition of spectrally reduced speech. 2127-2130 - Christophe Pallier, Anne Cutler, Núria Sebastián-Gallés:
Prosodic structure and phonetic processing: a cross-linguistic study. 2131-2134 - R. J. J. H. van Son, Louis C. W. Pols:
The correlation between consonant identification and the amount of acoustic consonant reduction. 2135-2138 - Anne Bonneau:
Relevant spectral information for the identification of vowel features from bursts. 2139-2142 - Aijun Li:
Perceptual study of intersyllabic formant transitions in synthesized V1-V2 in standard Chinese. 2143-2146 - Oleg P. Skljarov:
Role of perception of rhythmically organized speech in consolidation process of long-term memory traces (LTM-traces) and in speech production controlling. 2147-2150 - Arie H. van der Lugt:
Sequential probabilities as a cue for segmentation. 2151-2154 - Susan Jansens, Gerrit Bloothooft, Guus de Krom:
Perception and acoustics of emotions in singing. 2155-2158 - Christophe Pallier:
Phonemes and syllables in speech perception: size of attentional focus in French. 2159-2162 - Shinichi Tokuma:
Quality of a vowel with formant undershoot: a preliminary perceptual study. 2163-2166 - Mariëtte Koster, Anne Cutler:
Segmental and suprasegmental contributions to spoken-word recognition in Dutch. 2167-2170 - Dawn M. Behne, Peter E. Czigler, Kirk P. H. Sullivan:
Perception of vowel duration and spectral characteristics in Swedish. 2171-2174 - Adrian Neagu, Gérard Bailly:
Relative contributions of noise burst and vocalic transitions to the perceptual identification of stop consonants. 2175-2178 - Satoshi Kitagawa, Makoto Hashimoto, Norio Higuchi:
Effect of speaker familiarity and background noise on acoustic features used in speaker identification. 2179-2182 - Michel Pitermann:
Dynamic versus static specification for the perceptual identity of a coarticulated vowel. 2183-2186 - Madelaine Plauché, Cristina Delogu, John J. Ohala:
Asymmetries in consonant confusion. 2187-2190 - Nicolas Dumay, Monique Radeau:
Rime and syllabic effects in phonological priming between French spoken words. 2191-2194 - Weizhong Zhu, Hideki Kasuya:
Roles of static and dynamic features of formant trajectories in the perception of talker individuality. 2195-2198
Dialogue Systems: Linguistic Structures, Modelling and Evaluation
- Chih-mei Lin, Shrikanth S. Narayanan, E. Russell Ritenour:
Database management and analysis for spoken dialog systems: methodology and tools. 2199-2202 - Candace A. Kamm, Shrikanth S. Narayanan, Dawn Dutton, E. Russell Ritenour:
Evaluating spoken dialog systems for telecommunication services. 2203-2206 - Xavier Pouteau, Emiel Krahmer, Jan Landsbergen:
Robust spoken dialogue management for driver information systems. 2207-2210 - Yue-Shi Lee, Hsin-Hsi Chen:
Using acoustic and prosodic cues to correct Chinese speech repairs. 2211-2214 - Nils Dahlbäck, Arne Jönsson:
Integrating domain specific focusing in dialogue models. 2215-2218 - Marilyn A. Walker, Donald Hindle, Jeanne C. Fromer, Giuseppe Di Fabbrizio, Craig Mestel:
Evaluating competing agent strategies for a voice email agent. 2219-2222 - Donna K. Byron, Peter A. Heeman:
Discourse marker use in task-oriented spoken dialog. 2223-2226 - Victor W. Zue, Stephanie Seneff, James R. Glass, I. Lee Hetherington, Edward Hurley, Helen M. Meng, Christine Pao, Joseph Polifroni, Rafael Schloming, Philipp Schmid:
From interface to content: translingual access and delivery of on-line information. 2227-2230 - Jan Alexandersson, Norbert Reithinger:
Learning dialogue structures from a corpus. 2231-2234 - Norbert Reithinger, Martin Klesen:
Dialogue act classification using language models. 2235-2238 - Didier Pernel:
User's multiple goals in spoken dialogue. 2239-2242 - Noriko Suzuki, Seiji Inokuchi, Kazuo Ishii, Michio Okada:
Chatting with interactive agent. 2243-2246 - Gavin E. Churcher, Eric Atwell, Clive Souter:
Generic template for the evaluation of dialogue management systems. 2247-2250 - Yasuhisa Niimi, Takuya Nishimoto, Yutaka Kobayashi:
Analysis of interactive strategy to recover from misrecognition of utterances including multiple information items. 2251-2254 - François-Arnould Mathieu, Bertrand Gaiffe, Jean-Marie Pierrel:
A referential approach to reduce perplexity in the vocal command system comppa. 2255-2258 - Aristomenis Thanopoulos, Nikos Fakotakis, George Kokkinakis:
Linguistic processor for a spoken dialogue system based on island parsing techniques. 2259-2262 - Brian Mellor, Chris Baber:
Modelling of speech-based user interfaces. 2263-2266 - Beth Ann Hockey, Deborah Rossen-Knill, Beverly Spejewski, Matthew Stone, Stephen Isard:
Can you predict responses to yes/no questions? yes, no, and stuff. 2267-2270 - Jens-Uwe Möller:
Dia-moLE: an unsupervised learning approach to adaptive dialogue models for spoken dialogue systems. 2271-2274 - Joakim Gustafson, Anette Larsson, Rolf Carlson, K. Hellman:
How do system questions influence lexical choices in user answers? 2275-2278
Speaker Recognition and Language Identification
- Kuo-Hwei Yuo, Hsiao-Chuan Wang:
Gaussian mixture models with common principal axes and their application in text-independent speaker identification. 2279-2282 - Dominik R. Dersch, Robin W. King:
Speaker models designed from complete data sets: a new approach to text-independent speaker verification. 2283-2286 - Rivarol Vergin, Douglas D. O'Shaughnessy:
A double Gaussian mixture modeling approach to speaker recognition. 2287-2290 - Mohamed Afify, Yifan Gong, Jean Paul Haton:
An acoustic subword unit approach to non-linguistic speech feature identification. 2291-2294 - Chakib Tadj, Pierre Dumouchel, Yu Fang:
N-best GMM's for speaker identification. 2295-2298 - Guillaume Gravier, Chafic Mokbel, Gérard Chollet:
Model dependent spectral representations for speaker recognition. 2299-2302 - Roland Auckenthaler, John S. Mason:
Equalizing sub-band error rates in speaker recognition. 2303-2306 - Stefan Slomka, Sridha Sridharan:
Automatic gender identification under adverse conditions. 2307-2310 - Yizhar Lavner, Isak Gath, Judith Rosenhouse:
Acoustic features and perceptive processes in the identification of familiar voices. 2311-2314 - Leandro Rodríguez Liñares, Carmen García-Mateo:
On the use of acoustic segmentation in speaker identification. 2315-2318 - Herman J. M. Steeneken, David A. van Leeuwen:
Speaker recognition by humans and machines. 2319-2322 - Karsten Kumpf, Robin W. King:
Foreign speaker accent classification using phoneme-dependent accent discrimination models and comparisons with human perception benchmarks. 2323-2326 - Li Liu, Jialong He, Günther Palm:
A comparison of human and machine in speaker recognition. 2327-2330 - Simo M. A. Goddijn, Guus de Krom:
Evaluation of second language learners' pronunciation using hidden Markov models. 2331-2334 - Brian S. Eberman, Pedro J. Moreno:
Delta vector Taylor series environment compensation for speaker recognition. 2335-2338 - Jonathan Hume:
Wavelet-like regression features in the cepstral domain for speaker recognition. 2339-2342 - Rathinavelu Chengalvarayan:
Minimum classification error linear regression (MCELR) for speaker adaptation using HMM with trend functions. 2343-2346 - Nikos Fakotakis, Kallirroi Georgila, Anastasios Tsopanoglou:
A continuous HMM text-independent speaker recognition system based on vowel spotting. 2347-2350 - Johan Koolwaaij, Lou Boves:
On the independence of digits in connected digit strings. 2351-2354 - Johan Koolwaaij, Lou Boves:
A new procedure for classifying speakers in speaker verification systems. 2355-2358 - Claude Montacié, Marie-José Caraty:
Sound channel video indexing. 2359-2362 - Javier Hernando, Climent Nadeu:
CDHMM speaker recognition by means of frequency filtering of filter-bank energies. 2363-2366
Style and Accent Recognition
- Jason J. Humphries, Philip C. Woodland:
Using accent-specific pronunciation modelling for improved large vocabulary continuous speech recognition. 2367-2370 - Alexandros Potamianos, Shrikanth S. Narayanan, Sungbok Lee:
Automatic speech recognition for children. 2371-2374 - Carlos Teixeira, Isabel Trancoso, António Joaquim Serralheiro:
Recognition of non-native accents. 2375-2378 - Michael Finke, Alex Waibel:
Speaking mode dependent pronunciation modeling in large vocabulary conversational speech recognition. 2379-2382 - Elizabeth Shriberg, Rebecca A. Bates, Andreas Stolcke:
A prosody only decision-tree model for disfluency detection. 2383-2386 - Sahar E. Bou-Ghazale, John H. L. Hansen:
A novel training approach for improving speech recognition under adverse stressful conditions. 2387-2390
Towards Robust ASR for Car and Telephone Applications
- Luciano Fissore, Giorgio Micca, Claudio Vair:
Methods for microphone equalization in speech recognition. 2415-2418 - Satoshi Nakamura, Kiyohiro Shikano:
Room acoustics and reverberation: impact on hands-free recognition. 2419-2422 - Gérard Faucon, Régine Le Bouquin-Jeannès:
Echo and noise reduction for hands-free terminals - state of the art -. 2423-2426 - Reinhold Haeb-Umbach:
Robust speech recognition for wireless networks and mobile telephony. 2427-2430 - Dirk Van Compernolle:
Speech recognition in the car from phone dialing to car navigation. 2431-2434
Language-Specific Systems
- Briony Williams, Stephen Isard:
A keyvowel approach to the synthesis of regional accents of English. 2435-2438 - Attila Ferencz, Radu Arsinte, István Nagy, Teodora Ratiu, Maria Ferencz, Gavril Toderean, Diana Zaiu, Tunde-Csilla Kovács, Lajos Simon:
Experimental implementation of pitch-synchronous synthesis methods for the ROMVOX text-to-speech system. 2439-2442 - Bernd Möbius, Richard Sproat, Jan P. H. van Santen, Joseph P. Olive:
The Bell Labs German text-to-speech system: an overview. 2443-2446 - Susan Fitt:
The generation of regional pronunciations of English for speech synthesis. 2447-2450 - Elena Pavlova, Yuri Pavlov, Richard Sproat, Chilin Shih, Jan P. H. van Santen:
Bell Laboratories Russian text-to-speech system. 2451-2454 - Antonio Bonafonte, Ignasi Esquerra, Albert Febrer, Francesc Vallverdú:
A bilingual text-to-speech system in Spanish and Catalan. 2455-2458
Pronunciation Models
- Nick Cremelie, Jean-Pierre Martens:
Automatic rule-based generation of word pronunciation networks. 2459-2462 - Jose Maria Elvira, Juan Carlos Torrecilla, Javier Caminero:
Creating user defined new vocabularies for voice dialing. 2463-2466 - Mosur Ravishankar, Maxine Eskénazi:
Automatic generation of context-dependent pronunciations. 2467-2470 - Toshiaki Fukada, Yoshinori Sagisaka:
Automatic generation of a pronunciation dictionary based on a pronunciation network. 2471-2474 - Uwe Jost, Henrik Heine, Gunnar Evermann:
What is wrong with the lexicon - an attempt to model pronunciations probabilistically. 2475-2478 - Kevin L. Markey, Wayne H. Ward:
Lexical tuning based on triphone confidence estimation. 2479-2482
Auditory Modelling and Psychoacoustics
- Frédéric Berthommier, Georg F. Meyer:
Improving of amplitude modulation maps for F0-dependent segregation of harmonic sounds. 2483-2486 - Reinier Kortekaas, Armin Kohlrausch:
Psychophysical evaluation of PSOLA: natural versus synthetic speech. 2487-2490 - Valentina V. Lublinskaja, Inna V. Koroleva, A. N. Kornev, Elena V. Iagounova:
Perception of noised words by normal children and children with speech and language impairments. 2491-2494 - Georg F. Meyer, William A. Ainsworth:
Modelling the perception of simultaneous semi-vowels. 2495-2498 - Fernando Perdigão, Luís Sá:
Properties of auditory model representations. 2499-2502 - Eduardo Sá Marta, Luís Vieira de Sá:
Impact of "ascending sequence" AI (auditory primary cortex) cells on stop consonant perception. 2503-2506
Voice Conversion and Data Driven F0-Models
- Jan P. H. van Santen:
Combinatorial issues in text-to-speech synthesis. 2507-2510 - Olivier Boëffard, Françoise Emerard:
Application-dependent prosodic models for text-to-speech synthesis and automatic design of learning database corpus using genetic algorithm. 2511-2514 - Eduardo López Gonzalo, Jose M. Rodriguez-Garcia, Luis A. Hernández Gómez, Juan Manuel Villar-Navarro:
Automatic corpus-based training of rules for prosodic generation in text-to-speech. 2515-2518 - Eun-Kyoung Kim, Sangho Lee, Yung-Hwan Oh:
Hidden Markov model based voice conversion using dynamic characteristics of speaker. 2519-2522 - Takayoshi Yoshimura, Takashi Masuko, Keiichi Tokuda, Takao Kobayashi, Tadashi Kitamura:
Speaker interpolation in HMM-based speech synthesis system. 2523-2526 - Vassilios Darsinos, Dimitrios Galanis, George Kokkinakis:
Designing a speaker adaptable formant-based text-to-speech system. 2527-2530
Vocal Tract Analysis
- Petros Maragos, Alexandros Potamianos:
On using fractal features of speech sounds in automatic speech recognition. 2531-2534 - Hywel B. Richards, John S. Bridle, Melvyn J. Hunt, John S. Mason:
Dynamic constraint weighting in the context of articulatory parameter estimation. 2535-2538 - Minkyu Lee, Donald G. Childers:
Estimation of vocal tract front cavity resonance in unvoiced fricative speech. 2539-2542 - António J. S. Teixeira, Francisco A. C. Vaz, José Carlos Príncipe:
A software tool to study Portuguese vowels. 2543-2546 - Jean Schoentgen, Sorin Ciocea:
Post-synchronization via formant-to-area mapping of asynchronously recorded speech signals and area functions. 2547-2550 - Zhenli Yu, P. C. Ching:
Geometrically and acoustically optimized codebook for unique mapping from formants to vocal-tract shape. 2551-2554
F0 and Duration Modelling, Spoken Language Processing
- Marcel Riedi:
Modeling segmental duration with multivariate adaptive regression splines. 2627-2630 - Fabrice Malfrère, Thierry Dutoit:
High-quality speech synthesis for phonetic speech segmentation. 2631-2634 - Nick Campbell, Yoshiharu Itoh, Wen Ding, Norio Higuchi:
Factors affecting perceived quality and intelligibility in the CHATR concatenative speech synthesiser. 2635-2638 - Christoph Neukirchen, Daniel Willett, Gerhard Rigoll:
Reduced lexicon trees for decoding in a MMI-connectionist/HMM speech recognition system. 2639-2642 - Jean Véronis, Philippe Di Cristo, Fabienne Courtois, Benoit Lagrue:
A stochastic model of intonation for French text-to-speech synthesis. 2643-2646 - Angelien A. Sanderman, René Collier:
Phonetic rules for a phonetic-to-speech system. 2647-2650 - Jan P. H. van Santen, Chilin Shih, Bernd Möbius, Evelyne Tzoukermann, Michael Tanenblatt:
Multi-lingual duration modeling. 2651-2654 - Plínio A. Barbosa:
A model of segment (and pause) duration generation for Brazilian Portuguese text-to-speech synthesis. 2655-2658 - Ariane Halber, David Roussel:
Parsing strategy for spoken language interfaces with a lexicalized tree grammar. 2659-2662 - Jan W. Amtrup, Henrik Heine, Uwe Jost:
What's in a word graph - evaluation and enhancement of word lattices. 2663-2666 - Christoph Tillmann, Stephan Vogel, Hermann Ney, Alex Zubiaga, Hassan Sawaf:
Accelerated DP based search for statistical translation. 2667-2670 - Ken Fujisawa, Toshio Hirai, Norio Higuchi:
Use of pitch pattern improvement in the CHATR speech synthesis system. 2671-2674 - Gerald Corrigan, Noel Massey, Orhan Karaali:
Generating segment durations in a text-to-speech system: a hybrid rule-based/neural network approach. 2675-2678 - Yasushi Ishikawa, Takashi Ebihara:
On the global F0 shape model using a transition network for Japanese text-to-speech systems. 2679-2682 - José Colás, Juan Manuel Montero, Javier Ferreiros, José Manuel Pardo:
An alternative and flexible approach in robust information retrieval systems. 2683-2686 - Keiko Horiguchi, Alexander Franz:
A probabilistic approach to analogical speech translation. 2687-2690 - Marie-José Caraty, Claude Montacié, Fabrice Lefèvre:
Dynamic lexicon for a very large vocabulary vocal dictation. 2691-2694
Language Modelling
- Encarna Segarra, Lluís F. Hurtado:
Construction of language models using the morphic generator grammatical inference (MGGI) methodology. 2695-2698 - Shuwu Zhang, Taiyi Huang:
An integrated language modeling with n-gram model and WA model for speech recognition. 2699-2702 - Ye-Yi Wang, Alex Waibel:
Statistical analysis of dialogue structure. 2703-2706 - Philip Clarkson, Ronald Rosenfeld:
Statistical language modeling using the CMU-cambridge toolkit. 2707-2710 - Gilles Adda, Martine Adda-Decker, Jean-Luc Gauvain, Lori Lamel:
Text normalization and speech recognition in French. 2711-2714 - Géraldine Damnati, Jacques Simonin:
A novel tree-based clustering algorithm for statistical language modeling. 2715-2718 - Shoichi Matsunaga, Shigeki Sagayama:
Variable-length language modeling integrating global constraints. 2719-2722 - Kamel Smaïli, Imed Zitouni, François Charpillet, Jean Paul Haton:
An hybrid language model for a continuous dictation prototype. 2723-2726 - Guy Perennou, L. Pousse:
Dealing with pronunciation variants at the language model level for the continuous automatic speech recognition of French. 2727-2730 - Ernst Günter Schukat-Talamazzini, Florian Gallwitz, Stefan Harbeck, Volker Warnke:
Rational interpolation of maximum likelihood predictors in stochastic language modeling. 2731-2734 - Akinori Ito, Hideyuki Saitoh, Masaharu Katoh, Masaki Kohda:
N-gram language model adaptation using small corpus for spoken dialog recognition. 2735-2738 - Man-Hung Siu, Mari Ostendorf:
Variable n-gram language modeling and extensions for conversational speech. 2739-2742 - Petra Geutner:
Fuzzy class rescoring: a part-of-speech language model. 2743-2746 - Akito Nagai, Yasushi Ishikawa:
Speech understanding based on integrating concepts by conceptual dependency. 2747-2750 - Fabio Brugnara, Marcello Federico:
Dynamic language models for interactive speech applications. 2751-2754 - George Demetriou, Eric Atwell, Clive Souter:
Large-scale lexical semantics for speech recognition support. 2755-2758 - Hajime Tsukada, Hirofumi Yamamoto, Yoshinori Sagisaka:
Integration of grammar and statistical language constraints for partial word-sequence recognition. 2759-2762 - Paul Taylor, Simon King, Stephen Isard, Helen Wright, Jacqueline C. Kowtko:
Using intonation to constrain language models in speech recognition. 2763-2766 - Peter A. Heeman, James F. Allen:
Incorporating POS tagging into language modeling. 2767-2770 - C. Uhrik, W. Ward:
Confidence metrics based on n-gram language model backoff behaviors. 2771-2774 - Ciprian Chelba, David Engle, Frederick Jelinek, Victor Jimenez, Sanjeev Khudanpur, Lidia Mangu, Harry Printz, Eric Ristad, Ronald Rosenfeld, Andreas Stolcke, Dekai Wu:
Structure and performance of a dependency language model. 2775-2778 - Andreas Stolcke:
Modeling linguistic segment and turn boundaries for n-best rescoring of spontaneous speech. 2779-2782 - P. E. Kenne, Mary O'Kane:
Hybrid language models: is simpler better? 2783-2786 - Thorsten Brants:
Internal and external tagsets in part-of-speech tagging. 2787-2790
Auditory Modelling and Psychoacoustics, Neural Networks for Speech Processing and Recognition
- Laurent Varin, Frédéric Berthommier:
A probabilistic model of double-vowel segregation. 2791-2794 - Habibzadeh V. Houshang, Shigeyoshi Kitazawa:
Stimulus signal estimation from auditory-neural transduction inverse processing. 2795-2798 - Chakib Tadj, Pierre Dumouchel, Franck Poirier:
FDVQ based keyword spotter which incorporates a semi-supervised learning for primary processing. 2799-2802 - V. Lublinskaja, Christian Sappok:
The initial time span of auditory processing used for speaker attribution of the speech signal. 2803-2806 - Nikko Ström:
Sparse connection and pruning in large dynamic artificial neural networks. 2807-2810 - Roxana Teodorescu, Dirk Van Compernolle, Ioannis Dologlou:
A modular initialization scheme for better speech recognition performance using hybrid systems of MLPs/HMMs. 2811-2814 - Tatiana V. Chernigovskaya:
Lateralization for auditory perception of foreign words. 2815-2818 - Yuri A. Kosarev, Pavel Jarov, Alexander Osipov:
The structural weighted sets method for continuous speech and text recognition. 2819-2822 - Christian John Sumner, Duncan F. Gillies:
Lateral inhibitory networks for auditory processing. 2823-2826 - Henning Reetz:
Missing fundamentals: a problem of auditory or mental processing? 2827-2830 - Felix Freitag, Enric Monte, Josep M. Salavedra:
Predictive neural networks applied to phoneme recognition. 2831-2834 - Suhardi, Klaus Fellbaum:
Empirical comparison of two multilayer perceptron-based keyword speech recognition algorithms. 2835-2838 - Toshiaki Fukada, Sophie Aveline, Mike Schuster, Yoshinori Sagisaka:
Segment boundary estimation using recurrent neural networks. 2839-2842 - Mike Schuster:
Incorporation of HMM output constraints in hybrid NN/HMM systems during training. 2843-2846 - Ludmila Babkina, Sergey Koval, Alexander Molchanov:
Principles of the hearing periphery functioning in new methods of pitch detection and speech enhancement. 2847-2850 - Christine Meunier, Alain Content, Ulrich H. Frauenfelder, Ruth Kearns:
The locus of the syllable effect: prelexical or lexical? 2851-2854 - Robin J. Lickley, Ellen Gurman Bard:
On not remembering disfluencies. 2855-2858 - T. Andringa:
Using an auditory model and leaky autocorrelators to tune in to speech. 2859-2862