ICSLP 1998: Sydney, Australia
- The 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November - 4th December 1998. ISCA 1998
- Graeme M. Clark:
Cochlear implants in the second and third millennia. - Stephanie Seneff:
The use of linguistic hierarchies in speech understanding.
Text-To-Speech Synthesis 1-6
- Paul C. Bagshaw:
Unsupervised training of phone duration and energy models for text-to-speech synthesis. - Jerome R. Bellegarda, Kim E. A. Silverman:
Improved duration modeling of English phonemes using a root sinusoidal transformation. - Chilin Shih, Wentao Gu, Jan P. H. van Santen:
Efficient adaptation of TTS duration model to new speakers. - Takayoshi Yoshimura, Keiichi Tokuda, Takashi Masuko, Takao Kobayashi, Tadashi Kitamura:
Duration modeling for HMM-based speech synthesis. - Cameron S. Fordyce, Mari Ostendorf:
Prosody prediction for speech synthesis using transformational rule-based learning. - Susan Fitt, Stephen Isard:
Representing the environments for phonological processes in an accent-independent lexicon for synthesis of English. - Daniel Faulkner, Charles Bryant:
Efficient lexical retrieval for English text-to-speech synthesis. - Robert E. Donovan, Ellen Eide:
The IBM trainable speech synthesis system. - Sarah Hawkins, Jill House, Mark A. Huckvale, John Local, Richard Ogden:
Prosynth: an integrated prosodic approach to device-independent, natural-sounding speech synthesis. - Jialu Zhang, Shiwei Dong, Ge Yu:
Total quality evaluation of speech synthesis systems. - Gerit P. Sonntag, Thomas Portele:
Comparative evaluation of synthetic prosody with the PURR method. - Richard Sproat, Andrew J. Hunt, Mari Ostendorf, Paul Taylor, Alan W. Black, Kevin A. Lenzo, Mike Edgington:
SABLE: a standard for TTS markup. - H. Timothy Bunnell, Steven R. Hoskins, Debra Yarrington:
Prosodic vs. segmental contributions to naturalness in a diphone synthesizer. - Alex Acero:
A mixed-excitation frequency domain model for time-scale pitch-scale modification of speech. - Masami Akamine, Takehiko Kagoshima:
Analytic generation of synthesis units by closed loop training for totally speaker driven text to speech system (TOS drive TTS). - Martti Vainio, Toomas Altosaar:
Modeling the microprosody of pitch and loudness for speech synthesis with neural networks. - David T. Chappell, John H. L. Hansen:
Spectral smoothing for concatenative speech synthesis. - Aimin Chen, Saeed Vaseghi, Charles Ho:
MIMIC : a voice-adaptive phonetic-tree speech synthesiser. - Je Hun Jeon, Sunhwa Cha, Minhwa Chung, Jun Park, Kyuwoong Hwang:
Automatic generation of Korean pronunciation variants by multistage applications of phonological rules. - Stephen Cox, Richard Brady, Peter Jackson:
Techniques for accurate automatic annotation of speech waveforms. - Andrew Cronk, Michael W. Macon:
Optimized stopping criteria for tree-based unit selection in concatenative synthesis. - Stéphanie de Tournemire:
Automatic transcription of intonation using an identified prosodic alphabet. - Ignasi Esquerra, Albert Febrer, Climent Nadeu:
Frequency analysis of phonetic units for concatenative synthesis in Catalan. - Alex Chengyu Fang, Jill House, Mark A. Huckvale:
Investigating the syntactic characteristics of English tone units. - Antonio Bonafonte, Ignasi Esquerra, Albert Febrer, José A. R. Fonollosa, Francesc Vallverdú:
The UPC text-to-speech system for Spanish and Catalan. - Attila Ferencz, István Nagy, Tunde-Csilla Kovács, Maria Ferencz, Teodora Ratiu:
The new version of the ROMVOX text-to-speech synthesis system based on a hybrid time domain-LPC synthesis technique. - Takehiko Kagoshima, Masahiro Morita, Shigenobu Seto, Masami Akamine:
An F0 contour control model for totally speaker driven text to speech system. - Keikichi Hirose, Hiromichi Kawanami:
On the relationship of speech rates with prosodic units in dialogue speech. - Esther Klabbers, Raymond N. J. Veldhuis:
On the reduction of concatenation artefacts in diphone synthesis. - Chih-Chung Kuo, Kun-Yuan Ma:
Error analysis and confidence measure of Chinese word segmentation. - Jungchul Lee, Donggyu Kang, Sanghoon Kim, Koengmo Sung:
Energy contour generation for a sentence using a neural network learning method. - Yong-Ju Lee, Sook-Hyang Lee, Jong-Jin Kim, Hyun-Ju Ko, Young-Il Kim, Sanghun Kim, Jung-Cheol Lee:
A computational algorithm for F0 contour generation in Korean developed with prosodically labeled databases using K-ToBI system. - Kevin A. Lenzo, Christopher Hogan, Jeffrey Allen:
Rapid-deployment text-to-speech in the DIPLOMAT system. - Robert H. Mannell:
Formant diphone parameter extraction utilising a labelled single-speaker database. - Osamu Mizuno, Shin'ya Nakajima:
A new synthetic speech/sound control language. - Ryo Mochizuki, Yasuhiko Arai, Takashi Honda:
A study on the natural-sounding Japanese phonetic word synthesis by using the VCV-balanced word database that consists of the words uttered forcibly in two types of pitch accent. - Vincent Pagel, Kevin A. Lenzo, Alan W. Black:
Letter to sound rules for accented lexicon compression. - Ze'ev Roth, Judith Rosenhouse:
A name announcement algorithm with memory size and computational power constraints. - Frédérique Sannier, Rabia Belrhali, Véronique Aubergé:
How a French TTS system can describe loanwords. - Tomaz Sef, Ales Dobnikar, Matjaz Gams:
Improvements in Slovene text-to-speech synthesis. - Shigenobu Seto, Masahiro Morita, Takehiko Kagoshima, Masami Akamine:
Automatic rule generation for linguistic features analysis using inductive learning technique: linguistic features analysis in TOS drive TTS system. - Yoshinori Shiga, Hiroshi Matsuura, Tsuneo Nitta:
Segmental duration control based on an articulatory model. - Evelyne Tzoukermann:
Text analysis for the Bell Labs French text-to-speech system. - Jennifer J. Venditti, Jan P. H. van Santen:
Modeling vowel duration for Japanese text-to-speech synthesis. - Ren-Hua Wang, Qingfeng Liu, Yongsheng Teng, Deyu Xia:
Towards a Chinese text-to-speech system with higher naturalness. - Andrew P. Breen, Peter Jackson:
A phonologically motivated method of selecting non-uniform units. - Steve Pearson, Nick Kibre, Nancy Niedzielski:
A synthesis method based on concatenation of demisyllables and a residual excited vocal tract model. - Ann K. Syrdal, Alistair Conkie, Yannis Stylianou:
Exploration of acoustic correlates in speaker selection for concatenative synthesis. - Johan Wouters, Michael W. Macon:
A perceptual evaluation of distance measures for concatenative speech synthesis. - Mike Plumpe, Alex Acero, Hsiao-Wuen Hon, Xuedong Huang:
HMM-based smoothing for concatenative speech synthesis. - Martin Holzapfel, Nick Campbell:
A nonlinear unit selection strategy for concatenative speech synthesis based on syllable level features. - Robert Eklund, Anders Lindström:
How to handle "foreign" sounds in Swedish text-to-speech conversion: approaching the 'xenophone' problem. - Nick Campbell:
Multi-lingual concatenative speech synthesis. - Takashi Saito:
On the use of F0 features in automatic segmentation for speech synthesis. - Atsuhiro Sakurai, Takashi Natsume, Keikichi Hirose:
A linguistic and prosodic database for data-driven Japanese TTS synthesis. - Alexander Kain, Michael W. Macon:
Text-to-speech voice adaptation from sparse training data. - Gregor Möhler:
Describing intonation with a parametric model.
Spoken Language Models and Dialog 1-5
- Joakim Gustafson, Patrik Elmberg, Rolf Carlson, Arne Jönsson:
An educational dialogue system with a user controllable dialogue manager. - Klaus Failenschmid, J. H. Simon Thornton:
End-user driven dialogue system design: the REWARD experience. - Yi-Chung Lin, Tung-Hui Chiang, Huei-Ming Wang, Chung-Ming Peng, Chao-Huang Chang:
The design of a multi-domain Mandarin Chinese spoken dialogue system. - Kallirroi Georgila, Anastasios Tsopanoglou, Nikos Fakotakis, George Kokkinakis:
An integrated dialogue system for the automation of call centre services. - Kuansan Wang:
An event driven model for dialogue systems. - Cosmin Popovici, Paolo Baggia, Pietro Laface, Loreta Moisa:
Automatic classification of dialogue contexts for dialogue predictions. - Ganesh N. Ramaswamy, Jan Kleindienst:
Automatic identification of command boundaries in a conversational natural language user interface. - Massimo Poesio, Andrei Mikheev:
The predictive power of game structure in dialogue act recognition: experimental results using maximum entropy estimation. - Paul C. Constantinides, Scott Hansma, Chris Tchou, Alexander I. Rudnicky:
A schema based approach to dialog control. - Gregory Aist:
Expanding a time-sensitive conversational architecture for turn-taking to handle content-driven interruption. - Marc Swerts, Hanae Koiso, Atsushi Shimojima, Yasuhiro Katagiri:
On different functions of repetitive utterances. - Hiroaki Noguchi, Yasuharu Den:
Prosody-based detection of the context of backchannel responses. - Lena Strömbäck, Arne Jönsson:
Robust interpretation for spoken dialogue systems. - Yohei Okato, Keiji Kato, Mikio Yamamoto, Shuichi Itahashi:
System-user interaction and response strategy in spoken dialogue system. - Noriko Suzuki, Kazuo Ishii, Michio Okada:
Organizing self-motivated dialogue with autonomous creatures. - Gerhard Hanrieder, Paul Heisterkamp, Thomas Brey:
Fly with the EAGLES: evaluation of the "ACCeSS" spoken language dialogue system. - Maria Aretoulaki, Stefan Harbeck, Florian Gallwitz, Elmar Nöth, Heinrich Niemann, Jozef Ivanecký, Ivo Ipsic, Nikola Pavesic, Václav Matousek:
SQEL: a multilingual and multifunctional dialogue system. - Stefan Kaspar, Achim G. Hoffmann:
Semi-automated incremental prototyping of spoken dialog systems. - Peter A. Heeman, Michael Johnston, Justin Denney, Edward C. Kaiser:
Beyond structured dialogues: factoring out grounding. - Masahiro Araki, Shuji Doshita:
A robust dialogue model for spoken dialogue processing. - Tom Brøndsted, Bo Nygaard Bai, Jesper Østergaard Olsen:
The REWARD service creation environment: an overview. - Matthew Bull, Matthew P. Aylett:
An analysis of the timing of turn-taking in a corpus of goal-oriented dialogue. - Sarah Davies, Massimo Poesio:
The provision of corrective feedback in a spoken dialogue CALL system. - Laurence Devillers, Hélène Bonneau-Maynard:
Evaluation of dialog strategies for a tourist information retrieval system. - Sadaoki Furui, Koh'ichiro Yamaguchi:
Designing a multimodal dialogue system for information retrieval. - Dinghua Guan, Min Chu, Quan Zhang, Jian Liu, Xiangdong Zhang:
The research project of man-computer dialogue system in Chinese. - Kate S. Hone, David Golightly:
Interfaces for speech recognition systems: the impact of vocabulary constraints and syntax on performance. - Tatsuya Iwase, Nigel Ward:
Pacing spoken directions to suit the listener. - Annika Flycht-Eriksson, Arne Jönsson:
A spoken dialogue system utilizing spatial information. - Candace A. Kamm, Diane J. Litman, Marilyn A. Walker:
From novice to expert: the effect of tutorials on user expertise with spoken dialogue systems. - Takeshi Kawabata:
Emergent computational dialogue management architecture for task-oriented spoken dialogue systems. - Tadahiko Kumamoto, Akira Ito:
An analysis of dialogues with our dialogue system through a WWW page. - Michael F. McTear:
Modelling spoken dialogues with state transition diagrams: experiences with the CSLU toolkit. - Michio Okada, Noriko Suzuki, Jacques M. B. Terken:
Situated dialogue coordination for spoken dialogue systems. - Xavier Pouteau, Luis Arévalo:
Robust spoken dialogue systems for consumer products: a concrete application. - Daniel Willett, Arno Romer, Jörg Rottland, Gerhard Rigoll:
A German dialogue system for scheduling dates and meetings by naturally spoken continuous speech. - Chung-Hsien Wu, Gwo-Lang Yan, Chien-Liang Lin:
Spoken dialogue system using corpus-based hidden Markov model. - Peter J. Wyard, Gavin E. Churcher:
A realistic Wizard of Oz simulation of a multimodal spoken language system. - Yen-Ju Yang, Lin-Shan Lee:
A syllable-based Chinese spoken dialogue system for telephone directory services primarily trained with a corpus. - Hiroyuki Yano, Akira Ito:
How disagreement expressions are used in cooperative tasks.
Prosody and Emotion 1-6
- Phil Rose:
Tones of a tridialectal: acoustic and perceptual data on ten linguistic tonetic contrasts between Lao, Nyo and Standard Thai. - Napier Guy Ian Thompson:
Tone sandhi between complex tones in a seven-tone southern Thai dialect. - Alexander Robertson Coupe:
The acoustic and perceptual features of tone in the Tibeto-Burman language Ao Naga. - Phil Rose:
The differential status of semivowels in the acoustic phonetic realisation of tone. - Kai Alter, Karsten Steinhauer, Angela D. Friederici:
De-accentuation: linguistic environments and prosodic realizations. - N. Amir, S. Ron:
Towards an automatic classification of emotions in speech. - Marc Schröder, Véronique Aubergé, Marie-Agnès Cathiard:
Can we hear smile? - Matthew P. Aylett, Matthew Bull:
The automatic marking of prominence in spontaneous speech using duration and part of speech information. - JongDeuk Kim, SeongJoon Baek, Myung-Jin Bae:
On a pitch alteration technique in excited cepstral spectrum for high quality TTS. - Jan Buckow, Anton Batliner, Richard Huber, Elmar Nöth, Volker Warnke, Heinrich Niemann:
Dovetailing of acoustics and prosody in spontaneous speech recognition. - Janet E. Cahn:
A computational memory and processing model for prosody. - Belinda Collins:
Convergence of fundamental frequencies in conversation: if it happens, does it matter? - Hiroya Fujisaki, Sumio Ohno, Takashi Yagi, Takeshi Ono:
Analysis and interpretation of fundamental frequency contours of British English in terms of a command-response model. - Frode Holm, Kazue Hata:
Common patterns in word level prosody. - Yasuo Horiuchi, Akira Ichikawa:
Prosodic structure in Japanese spontaneous speech. - Shunichi Ishihara:
An acoustic-phonetic description of word tone in Kagoshima Japanese. - Koji Iwano, Keikichi Hirose:
Representing prosodic words using statistical models of moraic transition of fundamental frequency contours of Japanese. - Tae-Yeoub Jang, Minsuck Song, Kiyeong Lee:
Disambiguation of Korean utterances using automatic intonation recognition. - Oliver Jokisch, Diane Hirschfeld, Matthias Eichner, Rüdiger Hoffmann:
Multi-level rhythm control for speech synthesis using hybrid data driven and rule-based approaches. - Jiangping Kong:
EGG model of ditoneme in Mandarin. - Geetha Krishnan, Wayne H. Ward:
Temporal organization of speech for normal and fast rates. - Haruo Kubozono:
A syllable-based generalization of Japanese accentuation. - Hyuck-Joon Lee:
Non-adjacent segmental effects in tonal realization of accentual phrase in Seoul Korean. - Eduardo López, Javier Caminero, Ismael Cortázar, Luis A. Hernández Gómez:
Improvement on connected numbers recognition using prosodic information. - Kazuaki Maeda, Jennifer J. Venditti:
Phonetic investigation of boundary pitch movements in Japanese. - Kikuo Maekawa:
Phonetic and phonological characteristics of paralinguistic information in spoken Japanese. - Arman Maghbouleh:
ToBI accent type recognition. - Hansjörg Mixdorff, Hiroya Fujisaki:
The influence of syllable structure on the timing of intonational events in German. - Osamu Mizuno, Shin'ya Nakajima:
New prosodic control rules for expressive synthetic speech. - Mitsuru Nakai, Hiroshi Shimodaira:
The use of F0 reliability function for prosodic command analysis on F0 contour generation model. - Sumio Ohno, Hiroya Fujisaki, Hideyuki Taguchi:
Analysis of effects of lexical accent, syntax, and global speech rate upon the local speech rate. - Sumio Ohno, Hiroya Fujisaki, Yoshikazu Hara:
On the effects of speech rate upon parameters of the command-response model for the fundamental frequency contours of speech. - Thomas Portele, Barbara Heuft:
The maximum-based description of F0 contours and its application to English. - Thomas Portele:
Perceived prominence and acoustic parameters in American English. - Erhard Rank, Hannes Pirker:
Generating emotional speech with a concatenative synthesizer. - Albert Rilliard, Véronique Aubergé:
A perceptive measure of pure prosody linguistic functions with reiterant sentences. - Kazuhito Koike, Hirotaka Suzuki, Hiroaki Saito:
Prosodic parameters in emotional speech. - Barbertje M. Streefkerk, Louis C. W. Pols, Louis ten Bosch:
Automatic detection of prominence (as defined by listeners' judgements) in read aloud Dutch sentences. - Masafumi Tamoto, Takeshi Kawabata:
A schema for illocutionary act identification with prosodic feature. - Wataru Tsukahara:
An algorithm for choosing Japanese acknowledgments using prosodic cues and context. - Chao Wang, Stephanie Seneff:
A study of tones and tempo in continuous Mandarin digit strings and their application in telephone quality speech recognition. - Sandra P. Whiteside:
Simulated emotions: an acoustic study of voice and perturbation measures. - Jinsong Zhang, Keikichi Hirose:
A robust tone recognition method of Chinese based on sub-syllabic F0 contours. - Xiaonong Sean Zhu:
The microprosodics of tone sandhi in Shanghai disyllabic compounds. - Natalija Bolfan-Stosic, Tatjana Prizl:
Jitter and shimmer differences between pathological voices of school children. - Xiaonong Sean Zhu:
What spreads, and how? Tonal rightward spreading on Shanghai disyllabic compounds. - Sean Zhu, Phil Rose:
Tonal complexity as a dialectal feature: 25 different citation tones from four Zhejiang Wu dialects. - Juan Manuel Montero, Juana M. Gutiérrez-Arriola, Sira E. Palazuelos, Emilia Enríquez, Santiago Aguilera, José Manuel Pardo:
Emotional speech synthesis: from speech database to TTS. - Cécile Pereira, Catherine I. Watson:
Some acoustic characteristics of emotion. - Marc Swerts:
Intonative structure as a determinant of word order variation in Dutch verbal endgroups. - Johanneke Caspers:
Experiments on the meaning of two pitch accent types: the 'pointed hat' versus the accent-lending fall in Dutch. - Sun-Ah Jun, Hyuck-Joon Lee:
Phonetic and phonological markers of contrastive focus in Korean. - Emiel Krahmer, Marc Swerts:
Reconciling two competing views on contrastiveness. - Paul Taylor:
The Tilt intonation model. - Hiroya Fujisaki, Sumio Ohno, Seiji Yamada:
Analysis of occurrence of pauses and their durations in Japanese text reading. - Estelle Campione, Jean Véronis:
A statistical study of pitch target points in five languages. - Fabrice Malfrère, Thierry Dutoit, Piet Mertens:
Fully automatic prosody generator for text-to-speech. - Halewijn Vereecken, Jean-Pierre Martens, Cynthia Grover, Justin Fackrell, Bert Van Coile:
Automatic prosodic labeling of 6 languages. - Helen Wright:
Automatic utterance type detection using suprasegmental features. - Ee Ling Low, Esther Grabe:
A contrastive study of lexical stress placement in Singapore English and British English. - Florian Gallwitz, Anton Batliner, Jan Buckow, Richard Huber, Heinrich Niemann, Elmar Nöth:
Integrated recognition of words and phrase boundaries. - Amalia Arvaniti:
Phrase accents revisited: comparative evidence from standard and Cypriot Greek. - Grzegorz Dogil, Gregor Möhler:
Phonetic invariance and phonological stability: Lithuanian pitch accents. - Christel Brindöpke, Gernot A. Fink, Franz Kummert, Gerhard Sagerer:
A HMM-based recognition system for perceptive relevant pitch movements of spontaneous German speech. - Jean Véronis, Estelle Campione:
Towards a reversible symbolic coding of intonation.
Hidden Markov Model Techniques 1-3
- Xiaoqiang Luo, Frederick Jelinek:
Nonreciprocal data sharing in estimating HMM parameters. - Jeff A. Bilmes:
Data-driven extensions to HMM statistical dependencies. - Jiping Sun, Li Deng:
Use of high-level linguistic constraints for constructing feature-based phonological model in speech recognition. - Steven C. Lee, James R. Glass:
Real-time probabilistic segmentation for segment-based speech recognition. - Guillaume Gravier, Marc Sigelle, Gérard Chollet:
Toward Markov random field modeling of speech. - Rukmini Iyer, Herbert Gish, Man-Hung Siu, George Zavaliagkos, Spyros Matsoukas:
Hidden Markov models for trajectory modeling. - Katsura Aizawa, Chieko Furuichi:
A statistical phonemic segment model for speech recognition based on automatic phonemic segmentation. - Kris Demuynck, Jacques Duchateau, Dirk Van Compernolle, Patrick Wambacq:
Improved feature decorrelation for HMM-based speech recognition. - Johan A. du Preez, David M. Weber:
Efficient high-order hidden Markov modelling. - Ellen Eide, Lalit R. Bahl:
A time-synchronous, tree-based search strategy in the acoustic fast match of an asynchronous speech recognition system. - Jürgen Fritsch, Michael Finke, Alex Waibel:
Effective structural adaptation of LVCSR systems to unseen domains using hierarchical connectionist acoustic models. - Aravind Ganapathiraju, Jonathan Hamaker, Joseph Picone:
Support vector machines for speech recognition. - Malan B. Gandhi:
Natural number recognition using discriminatively trained inter-word context dependent hidden Markov models. - Jonathan Hamaker, Aravind Ganapathiraju, Joseph Picone:
Information theoretic approaches to model selection. - Kengo Hanai, Kazumasa Yamamoto, Nobuaki Minematsu, Seiichi Nakagawa:
Continuous speech recognition using segmental unit input HMMs with a mixture of probability density functions and context dependency. - Jacques Simonin, Lionel Delphin-Poulat, Géraldine Damnati:
Gaussian density tree structure in a multi-Gaussian HMM-based speech recognition system. - Hiroaki Kojima, Kazuyo Tanaka:
Generalized phone modeling based on piecewise linear segment lattice. - Ryosuke Koshiba, Mitsuyoshi Tachimori, Hiroshi Kanazawa:
A flexible method of creating HMM using block-diagonalization of covariance matrices. - Cristina Chesta, Pietro Laface, Franco Ravera:
HMM topology selection for accurate acoustic and duration modeling. - Tan Lee, Rolf Carlson, Björn Granström:
Context-dependent duration modelling for continuous speech recognition. - Brian Mak, Enrico Bocchieri:
Training of context-dependent subspace distribution clustering hidden Markov model. - Cesar Martín del Alamo, Luis Villarrubia, Francisco Javier González, Luis A. Hernández Gómez:
Unsupervised training of HMMs with variable number of mixture components per state. - Máté Szarvas, Shoichi Matsunaga:
Acoustic observation context modeling in segment based speech recognition. - Ji Ming, Philip Hanna, Darryl Stewart, Saeed Vaseghi, Francis Jack Smith:
Capturing discriminative information using multiple modeling techniques. - Laurence Molloy, Stephen Isard:
Suprasegmental duration modelling with elastic constraints in automatic speech recognition. - Albino Nogueiras Rodríguez, José B. Mariño, Enric Monte:
An adaptive gradient-search based algorithm for discriminative training of HMM's. - Albino Nogueiras Rodríguez, José B. Mariño:
Task adaptation of sub-lexical unit models using the minimum confusibility criterion on task independent databases. - Gordon Ramsay:
Stochastic calculus, non-linear filtering, and the internal model principle: implications for articulatory speech recognition. - Christian Wellekens, Jussi Kangasharju, Cedric Milesi:
The use of meta-HMM in multistream HMM training for automatic speech recognition. - Christian Wellekens:
Enhanced ASR by acoustic feature filtering. - Christoph Neukirchen, Daniel Willett, Gerhard Rigoll:
Soft state-tying for HMM-based speech recognition. - Silke M. Witt, Steve J. Young:
Estimation of models for non-native speech in computer-assisted language learning based on linear model combination. - Tae-Young Yang, Ji-Sung Kim, Chungyong Lee, Dae Hee Youn, Il-Whan Cha:
Duration modeling using cumulative duration probability and speaking rate compensation. - Geoffrey Zweig, Stuart Russell:
Probabilistic modeling with Bayesian networks for automatic speech recognition.
Speaker and Language Recognition 1-4
- Perasiriyan Sivakumaran, Aladdin M. Ariyaeeinia, Jill A. Hewitt:
Sub-band based speaker verification using dynamic recombination weights. - Michael Barlow, Michael Wagner:
Measuring the dynamic encoding of speaker identity and dialect in prosodic parameters. - Nicole Beringer, Florian Schiel, Peter Regel-Brietzmann:
German regional variants - a problem for automatic speech recognition? - Kay Berkling, Marc A. Zissman, Julie Vonwiller, Christopher Cleirigh:
Improving accent identification through knowledge of English syllable structure. - Zinny S. Bond, Donald Fucci, Verna Stockmal, Douglas McColl:
Multi-dimensional scaling of listener responses to complex auditory stimuli. - Verna Stockmal, Danny R. Moates, Zinny S. Bond:
Same talker, different language. - Susanne Burger, Daniela Oppermann:
The impact of regional variety upon specific word categories in spontaneous German. - Dominique Genoud, Gérard Chollet:
Speech pre-processing against intentional imposture in speaker recognition. - Mike Lincoln, Stephen Cox, Simon Ringland:
A comparison of two unsupervised approaches to accent identification. - Dominik R. Dersch, Christopher Cleirigh, Julie Vonwiller:
The influence of accents in Australian English vowels and their relation to articulatory tract parameters. - Johan A. du Preez, David M. Weber:
Automatic language recognition using high-order HMMs. - Marcos Faúndez-Zanuy, Daniel Rodriguez-Porcheron:
Speaker recognition using residual signal of linear and nonlinear prediction models. - Yong Gu, Trevor Thomas:
An implementation and evaluation of an on-line speaker verification system for field trials. - Javier Hernando, Climent Nadeu:
Speaker verification on the POLYCOST database using frequency filtered spectral energies. - Qin Jin, Luo Si, Qixiu Hu:
A high-performance text-independent speaker identification system based on BCDM. - Hiroshi Kido, Hideki Kasuya:
Representation of voice quality features associated with talker individuality. - Ji-Hwan Kim, Gil-Jin Jang, Seong-Jin Yun, Yung-Hwan Oh:
Candidate selection based on significance testing and its use in normalisation and scoring. - Yuko Kinoshita:
Japanese forensic phonetics: non-contemporaneous within-speaker variation in natural and read-out speech. - Filipp Korkmazskiy, Biing-Hwang Juang:
Statistical modeling of pronunciation and production variations for speech recognition. - Arne Kjell Foldvik, Knut Kvale:
Dialect maps and dialect research; useful tools for automatic speech recognition? - Youn-Jeong Kyung, Hwang-Soo Lee:
Text independent speaker recognition using micro-prosody. - Yoik Cheng, Hong C. Leung:
Speaker verification using fundamental frequency. - Weijie Liu, Toshihiro Isobe, Naoki Mukawa:
On optimum normalization method used for speaker verification. - Harvey Lloyd-Thomas, Eluned S. Parris, Jeremy H. Wright:
Recurrent substrings and data fusion for language recognition. - Konstantin P. Markov, Seiichi Nakagawa:
Text-independent speaker recognition using multiple information sources. - Konstantin P. Markov, Seiichi Nakagawa:
Discriminative training of GMM using a modified EM algorithm for speaker recognition. - Driss Matrouf, Martine Adda-Decker, Lori Lamel, Jean-Luc Gauvain:
Language identification incorporating lexical information. - Enric Monte, Ramon Arqué, Xavier Miró:
A VQ based speaker recognition system based in histogram distances. Text independent and for noisy environments. - Asunción Moreno, José B. Mariño:
Spanish dialects: phonetic transcription. - Mieko Muramatsu:
Acoustic analysis of Japanese English prosody: comparison between Fukushima dialect speakers and Tokyo dialect speakers in declarative sentences and yes-no questions. - Hideki Noda, Katsuya Harada, Eiji Kawaguchi, Hidefumi Sawai:
A context-dependent approach for speaker verification using sequential decision. - Javier Ortega-Garcia, Santiago Cruz-Llanas, Joaquin Gonzalez-Rodriguez:
Quantitative influence of speech variability factors for automatic speaker verification in forensic tasks. - Thilo Pfau, Günther Ruske:
Creating hidden Markov models for fast speech. - Tuan D. Pham, Michael Wagner:
Speaker identification using relaxation labeling. - Leandro Rodríguez Liñares, Carmen García-Mateo:
A novel technique for the combination of utterance and speaker verification systems in a text-dependent speaker verification task. - Phil Rose:
A forensic phonetic investigation into non-contemporaneous variation in the f-pattern of similar-sounding speakers. - Astrid Schmidt-Nielsen, Thomas H. Crystal:
Human vs. machine speaker identification with telephone speech. - Stefan Slomka, Sridha Sridharan, Vinod Chandran:
A comparison of fusion techniques in mel-cepstral based speaker identification. - Hagen Soltau, Alex Waibel:
On the influence of hyperarticulated speech on recognition performance. - Nuala C. Ward, Dominik R. Dersch:
Text-independent speaker identification and verification using the TIMIT database. - Lisa Yanguas, Gerald C. O'Leary, Marc A. Zissman:
Incorporating linguistic knowledge into automatic dialect identification of Spanish. - Yiying Zhang, Xiaoyan Zhu:
A novel text-independent speaker verification method using the global speaker model. - Aaron E. Rosenberg, Ivan Magrin-Chagnolleau, Sarangarajan Parthasarathy, Qian Huang:
Speaker detection in broadcast speech databases. - Eluned S. Parris, Michael J. Carey:
Multilateral techniques for speaker recognition. - Masafumi Nishida, Yasuo Ariki:
Real time speaker indexing based on subspace method - application to TV news articles and debate. - George R. Doddington, Walter Liggett, Alvin F. Martin, Mark A. Przybocki, Douglas A. Reynolds:
SHEEP, GOATS, LAMBS and WOLVES: a statistical analysis of speaker performance in the NIST 1998 speaker recognition evaluation. - Andrés Corrada-Emmanuel, Michael Newman, Barbara Peskin, Larry Gillick, Robert Roth:
Progress in speaker recognition at dragon systems. - Tomas Nordström, Håkan Melin, Johan Lindberg:
A comparative study of speaker verification systems using the POLYCOST database. - Tomoko Matsui, Kiyoaki Aikawa:
Robust speaker verification insensitive to session-dependent utterance variation and handset-dependent distortion. - Håkan Melin, Johan Koolwaaij, Johan Lindberg, Frédéric Bimbot:
A comparative evaluation of variance flooring techniques in HMM-based speaker verification. - Dijana Petrovska-Delacrétaz, Jan Cernocký, Jean Hennebert, Gérard Chollet:
Text-independent speaker verification using automatically labelled acoustic segments. - Qi Li:
A fast decoding algorithm based on sequential detection of the changes in distribution. - Jesper Østergaard Olsen:
Speaker verification with ensemble classifiers based on linear speech transforms. - Jesper Østergaard Olsen:
Speaker recognition based on discriminative projection models. - James Moody, Stefan Slomka, Jason W. Pelecanos, Sridha Sridharan:
On the convergence of Gaussian mixture models: improvements through vector quantization. - M. Kemal Sönmez, Elizabeth Shriberg, Larry P. Heck, Mitchel Weintraub:
Modeling dynamic prosodic variation for speaker verification. - Douglas A. Reynolds, Elliot Singer, Beth A. Carlson, Gerald C. O'Leary, Jack McLaughlin, Marc A. Zissman:
Blind clustering of speech utterances based on speaker and language characteristics. - Diamantino Caseiro, Isabel Trancoso:
Spoken language identification using the SpeechDat corpus. - Jerome Braun, Haim Levkowitz:
Automatic language identification with perceptually guided training and recurrent neural networks. - Sarel van Vuuren, Hynek Hermansky:
On the importance of components of the modulation spectrum for speaker verification.
Multimodal Spoken Language Processing 1-3
- Andrew P. Breen, O. Gloaguen, P. Stern:
A fast method of producing talking head mouth shapes from real speech. - Philip R. Cohen, Michael Johnston, David McGee, Sharon L. Oviatt, Josh Clow, Ira A. Smith:
The efficiency of multimodal interaction: a case study. - László Czap:
Audio and audio-visual perception of consonants disturbed by white noise and 'cocktail party'. - Simon Downey, Andrew P. Breen, Maria Fernández, Edward Kaneen:
Overview of the Maya spoken language system. - Mauro Cettolo, Daniele Falavigna:
Automatic recognition of spontaneous speech dialogues. - Georg Fries, Stefan Feldes, Alfred Corbet:
Using an animated talking character in a web-based city guide demonstrator. - Rika Kanzaki, Takashi Kato:
Influence of facial views on the McGurk effect in auditory noise. - Tom Brøndsted, Lars Bo Larsen, Michael Manthey, Paul McKevitt, Thomas B. Moeslund, Kristian G. Olesen:
The intellimedia workbench - a generic environment for multimodal systems. - Josh Clow, Sharon L. Oviatt:
STAMP: a suite of tools for analyzing multimodal system processing. - Sumi Shigeno:
Cultural similarities and differences in the recognition of audio-visual speech stimuli. - Toshiyuki Takezawa, Tsuyoshi Morimoto:
A multimodal-input multimedia-output guidance system: MMGS. - Oscar Vanegas, Akiji Tanaka, Keiichi Tokuda, Tadashi Kitamura:
HMM-based visual speech recognition using intensity and location normalization. - Yanjun Xu, Limin Du, Guoqiang Li, Ziqiang Hou:
A hierarchy probability-based visual features extraction method for speechreading. - Jörn Ostermann, Mark C. Beutnagel, Ariel Fischer, Yao Wang:
Integration of talking heads and text-to-speech synthesizers for visual TTS. - Levent M. Arslan, David Talkin:
Speech driven 3-d face point trajectory synthesis algorithm. - Eli Yamamoto, Satoshi Nakamura, Kiyohiro Shikano:
Speech-to-lip movement synthesis based on the EM algorithm using audio-visual HMMs. - Deb Roy, Alex Pentland:
Learning words from natural audio-visual input. - Stéphane Dupont, Juergen Luettin:
Using the multi-stream approach for continuous audio-visual speech recognition: experiments on the M2VTS database. - Sharon L. Oviatt, Karen Kuhn:
Referential features and linguistic indirection in multimodal language. - Michael Johnston:
Multimodal language processing. - Jun-ichi Hirasawa, Noboru Miyazaki, Mikio Nakano, Takeshi Kawabata:
Implementation of coordinative nodding behavior on spoken dialogue systems. - Masao Yokoyama, Kazumi Aoyama, Hideaki Kikuchi, Katsuhiko Shirai:
Use of non-verbal information in communication between human and robot. - Steve Whittaker, John Choi, Julia Hirschberg, Christine H. Nakatani:
What you see is (almost) what you hear: design principles for user interfaces for accessing speech archives.
Isolated Word Recognition
- Daniel Azzopardi, Shahram Semnani, Ben Milner, Richard Wiseman:
Improving accuracy of telephony-based, speaker-independent speech recognition. - Aruna Bayya:
Rejection in speech recognition systems with limited training. - Ruxin Chen, Miyuki Tanaka, Duanpei Wu, Lex Olorenshaw, Mariscela Amador:
A four layer sharing HMM system for very large vocabulary isolated word recognition. - Rathinavelu Chengalvarayan:
A comparative study of hybrid modelling techniques for improved telephone speech recognition. - Jae-Seung Choi, Jong-Seok Lee, Hee-Youn Lee:
Smoothing and tying for Korean flexible vocabulary isolated word recognition. - Javier Ferreiros, Javier Macías Guarasa, Ascensión Gallardo-Antolín, José Colás, Ricardo de Córdoba, José Manuel Pardo, Luis Villarrubia Grande:
Recent work on a preselection module for a flexible large vocabulary speech recognition system in telephone environment. - Masakatsu Hoshimi, Maki Yamada, Katsuyuki Niyada, Shozo Makino:
A study of noise robustness for speaker independent speech recognition method using phoneme similarity vector. - Fran H. L. Jian:
Classification of Taiwanese tones based on pitch and energy movements. - Finn Tore Johansen:
Phoneme-based recognition for the Norwegian SpeechDat(II) database. - Montri Karnjanadecha, Stephen A. Zahorian:
Robust feature extraction for alphabet recognition. - Hisashi Kawai, Norio Higuchi:
Recognition of connected digit speech in Japanese collected over the telephone network. - Takuya Koizumi, Shuji Taniguchi, Kazuhiro Kohtoh:
Improving the speaker-dependency of subword-unit-based isolated word recognition. - Tomohiro Konuma, Tetsu Suzuki, Maki Yamada, Yoshio Ono, Masakatsu Hoshimi, Katsuyuki Niyada:
Speaker independent speech recognition method using constrained time alignment near phoneme discriminative frame. - Ki Yong Lee, Joohun Lee:
A nonstationary autoregressive HMM with gain adaptation for speech recognition. - Ren-Yuan Lyu, Yuang-jin Chiang, Wen-ping Hsieh:
A large-vocabulary Taiwanese (Min-Nan) multi-syllabic word recognition system based upon right-context-dependent phones with state clustering by acoustic decision tree. - Kazuyo Tanaka, Hiroaki Kojima:
Speech recognition based on the distance calculation between intermediate phonetic code sequences in symbolic domain. - York Chung-Ho Yang, June-Jei Kuo:
High accuracy Chinese speech recognition approach with Chinese input technology for telecommunication use.
Robust Speech Processing in Adverse Environments 1-5
- William J. J. Roberts, Yariv Ephraim:
Robust speech recognition using HMM's with Toeplitz state covariance matrices. - David P. Thambiratnam, Sridha Sridharan:
Modeling of output probability distribution to improve small vocabulary speech recognition in adverse environments. - Philippe Morin, Ted H. Applebaum, Robert Boman, Yi Zhao, Jean-Claude Junqua:
Robust and compact multilingual word recognizers using features extracted from a phoneme similarity front-end. - Takeshi Yamada, Satoshi Nakamura, Kiyohiro Shikano:
An effect of adaptive beamforming on hands-free speech recognition based on 3-D Viterbi search. - Joaquin Gonzalez-Rodriguez, Santiago Cruz-Llanas, Javier Ortega-Garcia:
Coherence-based subband decomposition for robust speech and speaker recognition in noisy and reverberant rooms. - Hui Jiang, Keikichi Hirose, Qiang Huo:
A minimax search algorithm for CDHMM based robust continuous speech recognition. - Su-Lin Wu, Brian Kingsbury, Nelson Morgan, Steven Greenberg:
Performance improvements through combining phone- and syllable-scale information in automatic speech recognition. - Arun C. Surendran, Chin-Hui Lee:
Predictive adaptation and compensation for robust speech recognition. - Jean-Claude Junqua, Steven Fincke, Kenneth L. Field:
Influence of the speaking style and the noise spectral tilt on the Lombard reflex and automatic speech recognition. - Stefano Crafa, Luciano Fissore, Claudio Vair:
Data-driven PMC and Bayesian learning integration for fast model adaptation in noisy conditions. - Martin Hunke, Meeran Hyun, Steve Love, Thomas Holton:
Improving the noise and spectral robustness of an isolated-word recognizer using an auditory-model front end. - Owen P. Kenny, Douglas J. Nelson:
A model for speech reverberation and intelligibility restoring filters. - Guojun Zhou, John H. L. Hansen, James F. Kaiser:
Linear and nonlinear speech feature analysis for stress classification. - Sahar E. Bou-Ghazale, John H. L. Hansen:
Speech feature modeling for robust stressed speech recognition. - Katrin Kirchhoff:
Combining articulatory and acoustic information for speech recognition in noisy and reverberant environments. - Tim Wark, Sridha Sridharan:
Improving speaker identification performance in reverberant conditions using lip information. - Masato Akagi, Mamoru Iwaki, Noriyoshi Sakaguchi:
Spectral sequence compensation based on continuity of spectral sequence. - Aruna Bayya, B. Yegnanarayana:
Robust features for speech recognition systems. - Frédéric Berthommier, Hervé Glotin, Emmanuel Tessier, Hervé Bourlard:
Interfacing of CASA and partial recognition based on a multistream technique. - Sen-Chia Chang, Shih-Chieh Chien, Chih-Chung Kuo:
An RNN-based compensation method for Mandarin telephone speech recognition. - Stephen M. Chu, Yunxin Zhao:
Robust speech recognition using discriminative stream weighting and parameter interpolation. - Johan de Veth, Bert Cranen, Lou Boves:
Acoustic backing-off in the local distance computation for robust automatic speech recognition. - Laura Docío Fernández, Carmen García-Mateo:
Noise model selection for robust speech recognition. - Simon Doclo, Ioannis Dologlou, Marc Moonen:
A novel iterative signal enhancement algorithm for noise reduction in speech. - Stéphane Dupont:
Missing data reconstruction for robust automatic speech recognition in the framework of hybrid HMM/ANN systems. - Ascensión Gallardo-Antolín, Fernando Díaz-de-María, Francisco J. Valverde-Albacete:
Recognition from GSM digital speech. - Petra Geutner, Matthias Denecke, Uwe Meier, Martin Westphal, Alex Waibel:
Conversational speech systems for on-board car navigation and assistance. - Laurent Girin, Laurent Varin, Gang Feng, Jean-Luc Schwartz:
A signal processing system for having the sound "pop-out" in noise thanks to the image of the speaker's lips: new advances using multi-layer perceptrons. - Ruhi Sarikaya, John H. L. Hansen:
Robust speech activity detection in the presence of noise. - Michel Héon, Hesham Tolba, Douglas D. O'Shaughnessy:
Robust automatic speech recognition by the application of a temporal-correlation-based recurrent multilayer neural network to the mel-based cepstral coefficients. - Juan M. Huerta, Richard M. Stern:
Speech recognition from GSM codec parameters. - Jeih-Weih Hung, Jia-Lin Shen, Lin-Shan Lee:
Improved parallel model combination based on better domain transformation for speech recognition under noisy environments. - Lamia Karray, Jean Monné:
Robust speech/non-speech detection in adverse conditions based on noise and speech statistics. - Myung Gyu Song, Hoi In Jung, Kab-Jong Shim, Hyung Soon Kim:
Speech recognition in car noise environments using multiple models according to noise masking levels. - Klaus Linhard, Tim Haulick:
Spectral noise subtraction with recursive gain curves. - Shengxi Pan, Jia Liu, Jintao Jiang, Zuoying Wang, Dajin Lu:
A novel robust speech recognition algorithm based on multi-models and integrated decision method. - Dusan Macho, Climent Nadeu:
On the interaction between time and frequency filtering of speech parameters for robust speech recognition. - Bhiksha Raj, Rita Singh, Richard M. Stern:
Inference of missing spectrographic features for robust speech recognition. - Volker Schless, Fritz Class:
SNR-dependent flooring and noise overestimation for joint application of spectral subtraction and model combination. - Jia-Lin Shen, Jeih-Weih Hung, Lin-Shan Lee:
Improved robust speech recognition considering signal correlation approximated by Taylor series. - Won-Ho Shin, Weon-Goo Kim, Chungyong Lee, Il-Whan Cha:
Speech recognition in noisy environment using weighted projection-based likelihood measure. - Tetsuya Takiguchi, Satoshi Nakamura, Kiyohiro Shikano, Masatoshi Morishima, Toshihiro Isobe:
Evaluation of model adaptation by HMM decomposition on telephone speech recognition. - Hesham Tolba, Douglas D. O'Shaughnessy:
Comparative experiments to evaluate a voiced-unvoiced-based pre-processing approach to robust automatic speech recognition in low-SNR environments. - Masashi Unoki, Masato Akagi:
Signal extraction from noisy signal based on auditory scene analysis. - Tsuyoshi Usagawa, Kenji Sakai, Masanao Ebata:
Frequency domain binaural model as the front end of speech recognition system. - An-Tzyh Yu, Hsiao-Chuan Wang:
A study on the recognition of low bit-rate encoded speech. - Tai-Hwei Hwang, Hsiao-Chuan Wang:
Weighted parallel model combination for noisy speech recognition. - Daniel Woo:
Favourable and unfavourable short duration segments of speech in noise. - Piero Cosi, Stefano Pasquin, Enrico Zovato:
Auditory modeling techniques for robust pitch extraction and noise reduction. - Eliathamby Ambikairajah, Graham Tattersall, Andrew Davis:
Wavelet transform-based speech enhancement. - Beth Logan, Tony Robinson:
A practical perceptual frequency autoregressive HMM enhancement system. - John H. L. Hansen, Bryan L. Pellom:
An effective quality evaluation protocol for speech enhancement algorithms. - Jin-Nam Park, Tsuyoshi Usagawa, Masanao Ebata:
An adaptive beamforming microphone array system using a blind deconvolution. - Latchman Singh, Sridha Sridharan:
Speech enhancement using critical band spectral subtraction.
Articulatory Modelling 1-2
- Pierre Badin, Gérard Bailly, Monica Raybaudi, Christoph Segebarth:
A three-dimensional linear articulatory model based on MRI data. - Pascal Perrier, Yohan Payan, Joseph S. Perkell, Frédéric Jolly, Majid Zandipour, Melanie Matthies:
On loops and articulatory biomechanics. - Didier Demolin, Véronique Lecuit, Thierry Metens, Bruno Nazarian, Alain Soquet:
Magnetic resonance measurements of the velum port opening. - Masafumi Matsumura, Takuya Niikawa, Takao Tanabe, Takashi Tachimura, Takeshi Wada:
Cantilever-type force-sensor-mounted palatal plate for measuring palatolingual contact stress and pattern during speech phonation. - Tokihiko Kaburagi, Masaaki Honda:
Determination of the vocal tract spectrum from the articulatory movements based on the search of an articulatory-acoustic database. - Kiyoshi Honda, Mark Tiede:
An MRI study on the relationship between oral cavity shape and larynx position. - Frantz Clermont, Parham Mokhtari:
Acoustic-articulatory evaluation of the upper vowel-formant region and its presumed speaker-specific potency. - Philip Hoole, Christian Kroos:
Control of larynx height in vowel production. - Paavo Alku, Juha Vintturi, Erkki Vilkman:
Analyzing the effect of secondary excitations of the vocal tract on vocal intensity in different loudness conditions. - Gordon Ramsay:
An analysis of modal coupling effects during the glottal cycle: formant synthesizers from time-domain finite-difference simulations. - John H. Esling:
Laryngoscopic analysis of pharyngeal articulations and larynx-height voice quality settings. - Hiroki Matsuzaki, Kunitoshi Motoki, Nobuhiro Miki:
Effects of shapes of radiational aperture on radiation characteristics. - Jonathan Harrington, Mary E. Beckman, Janet Fletcher, Sallyanne Palethorpe:
An electropalatographic, kinematic, and acoustic analysis of supralaryngeal correlates of word-level prominence contrasts in English. - Marija Tabain:
Consistencies and inconsistencies between EPG and locus equation data on coarticulation. - Gérard Bailly, Pierre Badin, Anne Vilain:
Synergy between jaw and lips/tongue movements : consequences in articulatory modelling. - Philip Hoole:
Modelling tongue configuration in German vowel production. - Alan Wrench, Alan D. McIntosh, Colin Watson, William J. Hardcastle:
Optopalatograph: real-time feedback of tongue movement in 3D. - Yohann Meynadier, Michel Pitermann, Alain Marchal:
Effects of contrastive focal accent on linguopalatal articulation and coarticulation in the French [kskl] cluster.
Talking to Infants, Pets and Lovers
- Christine Kitamura, Denis Burnham:
Acoustic and affective qualities of IDS in English. - Sudaporn Luksaneeyanawin, Chayada Thanavisuth, Suthasinee Sittigasorn, Onwadee Rukkarangsarit:
Pragmatic characteristics of infant directed speech. - Denis Burnham, Elizabeth Francis, Ute Vollmer-Conna, Christine Kitamura, Vicky Averkiou, Amanda Olley, Mary Nguyen, Cal Paterson:
Are you my little pussy-cat? acoustic, phonetic and affective qualities of infant- and pet-directed speech. - Denis Burnham:
Special speech registers: talking to Australian and Thai infants, and to pets.
Speech Coding 1-3
- Takashi Masuko, Keiichi Tokuda, Takao Kobayashi:
A very low bit rate speech coder using HMM with speaker adaptation. - Erik Ekudden, Roar Hagen, Björn Johansson, Shinji Hayashi, Akitoshi Kataoka, Sachiko Kurihara:
ITU-T G.729 extension at 6.4 kbps. - Damith J. Mudugamuwa, Alan B. Bradley:
Adaptive transformation for segmented parametric speech coding. - Julien Epps, W. Harvey Holmes:
Speech enhancement using STC-based bandwidth extension. - Weihua Zhang, W. Harvey Holmes:
Performance and optimization of the SEEVOC algorithm. - Wendy J. Holmes:
Towards a unified model for low bit-rate speech coding using a recognition-synthesis approach. - Jan Skoglund, W. Bastiaan Kleijn:
On the significance of temporal masking in speech coding. - W. Bastiaan Kleijn, Huimin Yang, Ed F. Deprettere:
Waveform interpolation coding with pitch-spaced subbands. - Nicola R. Chong, Ian S. Burnett, Joe F. Chicharo:
An improved decomposition method for WI using IIR wavelet filter banks. - Paavo Alku, Susanna Varho:
A new linear predictive method for compression of speech signals. - Shahrokh Ghaemmaghami, Mohamed A. Deriche, Sridha Sridharan:
Hierarchical temporal decomposition: a novel approach to efficient compression of spectral characteristics of speech. - Susan L. Hura:
Speech intelligibility testing for new technologies. - Sung-Joo Kim, Sangho Lee, Woo-Jin Han, Yung-Hwan Oh:
Efficient quantization of LSF parameters based on temporal decomposition. - Minoru Kohata:
A sinusoidal harmonic vocoder at 1.2 kbps using auditory perceptual characteristics. - Kazuhito Koishida, Gou Hirabayashi, Keiichi Tokuda, Takao Kobayashi:
A 16 kbit/s wideband CELP coder using MEL-generalized cepstral analysis and its subjective evaluation. - Derek J. Molyneux, C. I. Parris, Xiaoqin Sun, Barry M. G. Cheetham:
Comparison of spectral estimation techniques for low bit-rate speech coding. - Yoshihisa Nakatoh, Takeshi Norimatsu, Ah Heng Low, Hiroshi Matsumoto:
Low bit rate coding for speech and audio using mel linear predictive coding (MLPC) analysis. - Jeng-Shyang Pan, Chin-Shiuh Shieh, Shu-Chuan Chu:
Comparison study on VQ codevector index assignment. - John J. Parry, Ian S. Burnett, Joe F. Chicharo:
Using linguistic knowledge to improve the design of low-bit rate LSF quantisation. - Davor Petrinovic:
Transform coding of LSF parameters using wavelets. - Fabrice Plante, Barry M. G. Cheetham, David F. Marston, P. A. Barrett:
Source controlled variable bit-rate speech coder based on waveform interpolation. - Carlos M. Ribeiro, Isabel Trancoso:
Improving speaker recognisability in phonetic vocoders.
Neural Networks, Fuzzy and Evolutionary Methods 1
- Visarut Ahkuputra, Somchai Jitapunkul, Nutthacha Jittiwarangkul, Ekkarit Maneenoi, Sawit Kasuriya:
A comparison of Thai speech recognition systems using hidden Markov model, neural network, and fuzzy-neural network. - Felix Freitag, Enric Monte:
Phoneme recognition with statistical modeling of the prediction error of neural networks. - Toshiaki Fukada, Takayoshi Yoshimura, Yoshinori Sagisaka:
Neural network based pronunciation modeling with applications to speech recognition. - Stephen J. Haskey, Sekharajit Datta:
A comparative study of OCON and MLP architectures for phoneme recognition. - John-Paul Hosom, Ronald A. Cole, Piero Cosi:
Evaluation and integration of neural-network training techniques for continuous digit recognition. - Ying Jia, Limin Du, Ziqiang Hou:
Hierarchical neural networks (HNN) for Chinese continuous speech recognition. - Eric Keller:
Neural network motivation for segmental distribution. - Nikki Mirghafori, Nelson Morgan:
Combining connectionist multi-band and full-band probability streams for speech recognition of natural numbers. - Ednaldo Brigante Pizzolato, T. Jeff Reynolds:
Initial speech recognition results using the multinet architecture. - Tomio Takara, Yasushi Iha, Itaru Nagayama:
Selection of the optimal structure of the continuous HMM using the genetic algorithm. - Dat Tran, Michael Wagner, Tu Van Le:
A proposed decision rule for speaker recognition based on fuzzy c-means clustering. - Dat Tran, Tu Van Le, Michael Wagner:
Fuzzy Gaussian mixture models for speaker recognition. - Chai Wutiwiwatchai, Somchai Jitapunkul, Visarut Ahkuputra, Ekkarit Maneenoi, Sudaporn Luksaneeyanawin:
A new strategy of fuzzy-neural network for Thai numeral speech recognition. - Chai Wutiwiwatchai, Somchai Jitapunkul, Visarut Ahkuputra, Ekkarit Maneenoi, Sudaporn Luksaneeyanawin:
Thai polysyllabic word recognition using fuzzy-neural network. - Axel Glaeser:
Modular neural networks for low-complex phoneme recognition. - João F. G. de Freitas, Sue E. Johnson, Mahesan Niranjan, Andrew H. Gee:
Global optimisation of neural network models via sequential sampling-importance resampling. - Jörg Rottland, Andre Ludecke, Gerhard Rigoll:
Efficient computation of MMI neural networks for large vocabulary speech recognition systems. - Sid-Ahmed Selouani, Jean Caelen:
Modular connectionist systems for identifying complex Arabic phonetic features. - Tuan D. Pham, Michael Wagner:
Fuzzy-integration based normalization for speaker verification. - Hiroshi Shimodaira, Jun Rokui, Mitsuru Nakai:
Improving the generalization performance of the MCE/GPD learning. - Tetsuro Kitazoe, Tomoyuki Ichiki, Sung-Ill Kim:
Acoustic speech recognition model by neural net equation with competition and cooperation. - Julie Ngan, Aravind Ganapathiraju, Joseph Picone:
Improved surname pronunciations using decision trees.
Utterance Verification and Word Spotting 1 / Speaker Adaptation 1
- M. Carmen Benítez, Antonio J. Rubio, Pedro García-Teodoro, Jesús Esteban Díaz Verdejo:
Word verification using confidence measures in speech recognition. - Giulia Bernardis, Hervé Bourlard:
Improving posterior based confidence measures in hybrid HMM/ANN speech recognition systems. - Javier Caminero, Eduardo López, Luis A. Hernández Gómez:
Two-pass utterance verification algorithm for long natural numbers recognition. - Berlin Chen, Hsin-Min Wang, Lee-Feng Chien, Lin-Shan Lee:
A*-admissible key-phrase spotting with sub-syllable level utterance verification. - Volker Fischer, Yuqing Gao, Eric Janke:
Speaker-independent upfront dialect adaptation in a large vocabulary continuous speech recognizer. - Asela Gunawardana, Hsiao-Wuen Hon, Li Jiang:
Word-based acoustic confidence measures for large-vocabulary speech recognition. - Sunil K. Gupta, Frank K. Soong:
Improved utterance rejection using length dependent thresholds. - Ching-Hsiang Ho, Saeed Vaseghi, Aimin Chen:
Bayesian constrained frequency warping HMMs for speaker normalisation. - Masaki Ida, Ryuji Yamasaki:
An evaluation of keyword spotting performance utilizing false alarm rejection based on prosodic information. - Dieu Tran, Ken-ichi Iso:
Predictive speaker adaptation and its prior training. - Rachida El Méliani, Douglas D. O'Shaughnessy:
Powerful syllabic fillers for general-task keyword-spotting and unlimited-vocabulary continuous-speech recognition. - Christine Pao, Philipp Schmid, James R. Glass:
Confidence scoring for speech understanding systems. - Bhuvana Ramabhadran, Abraham Ittycheriah:
Phonological rules for enhancing acoustic enrollment of unknown words. - Anand R. Setlur, Rafid A. Sukkar:
Recognition-based word counting for reliable barge-in and early endpoint detection in continuous speech recognition. - Martin Westphal, Tanja Schultz, Alex Waibel:
Linear discriminant - a new criterion for speaker normalization. - Gethin Williams, Steve Renals:
Confidence measures derived from an acceptor HMM. - Chung-Hsien Wu, Yeou-Jiunn Chen, Yu-Chun Hung:
Telephone speech multi-keyword spotting using fuzzy search algorithm and prosodic verification. - Yoichi Yamashita, Toshikatsu Tsunekawa, Riichiro Mizoguchi:
Topic recognition for news speech based on keyword spotting.
Human Speech Perception 1-4
- Sieb G. Nooteboom, Meinou van Dijk:
Heads and tails in word perception: evidence for 'early-to-late' processing in listening and reading. - Saskia te Riele, Hugo Quené:
Evidence for early effects of sentence context on word segmentation. - Hugo Quené, Maya van Rossum, Mieke van Wijck:
Assimilation and anticipation in word perception. - M. Louise Kelly, Ellen Gurman Bard, Catherine Sotillo:
Lexical activation by assimilated and reduced tokens. - Masato Akagi, Mamoru Iwaki, Tomoya Minakawa:
Fundamental frequency fluctuation in continuous vowel utterance and its perception. - Shigeaki Amano, Tadahisa Kondo:
Estimation of mental lexicon size with word familiarity database. - Matthew P. Aylett, Alice Turk:
Vowel quality in spontaneous speech: what makes a good vowel? - Adrian Neagu, Gérard Bailly:
Cooperation and competition of burst and formant transitions for the perception and identification of French stops. - Anne Bonneau, Yves Laprie:
The effect of modifying formant amplitudes on the perception of French vowels generated by copy synthesis. - Hsuan-Chih Chen, Michael C. W. Yip, Sum-Yin Wong:
Segmental and tonal processing in Cantonese. - Michael C. W. Yip, Po-Yee Leung, Hsuan-Chih Chen:
Phonological similarity effects in Cantonese spoken-word processing. - Robert I. Damper, Steve R. Gunn:
On the learnability of the voicing contrast for initial stops. - Loredana Cerrato, Mauro Falcone:
Acoustic and perceptual characteristic of Italian stop consonants. - Santiago Fernández, Sergio Feijóo, Ramón Balsa, Nieves Barros:
Acoustic cues for the auditory identification of the Spanish fricative /f/. - Santiago Fernández, Sergio Feijóo, Ramón Balsa, Nieves Barros:
Recognition of vowels in fricative context. - Santiago Fernández, Sergio Feijóo, Plinio Almeida:
Voicing affects perceived manner of articulation. - Valérie Hazan, Andrew Simpson, Mark A. Huckvale:
Enhancement techniques to improve the intelligibility of consonants in noise: speaker and listener effects. - Fran H. L. Jian:
Boundaries of perception of long tones in Taiwanese speech. - Hiroaki Kato, Minoru Tsuzaki, Yoshinori Sagisaka:
Effects of phonetic quality and duration on perceptual acceptability of temporal changes in speech. - Michael Kiefte, Terrance M. Nearey:
Dynamic vs. static spectral detail in the perception of gated stops. - Takashi Otake, Kiyoko Yoneyama:
Phonological units in speech segmentation and phonological awareness. - Elizabeth Shriberg, Andreas Stolcke:
How far do speakers back up in repairs? A quantitative model. - Karsten Steinhauer, Kai Alter, Angela D. Friederici:
Don't blame it (all) on the pause: further ERP evidence for a prosody-induced garden-path in running speech. - Jean Vroomen, Béatrice de Gelder:
The role of stress for lexical selection in Dutch. - Jyrki Tuomainen, Jean Vroomen, Béatrice de Gelder:
The perception of stressed syllables in Finnish. - Kimiko Yamakawa, Ryoji Baba:
The perception of the morae with devocalized vowels in Japanese language. - Dominic W. Massaro:
Categorical perception: important phenomenon or lasting myth? - Ellen Gerrits, Bert Schouten:
Categorical perception of vowels. - Kazuhiko Kakehi, Yuki Hirose:
Suprasegmental cues for the segmentation of identical vowel sequences in Japanese. - William A. Ainsworth:
Perception of concurrent approximant-vowel syllables. - Dawn M. Behne, Peter E. Czigler, Kirk P. H. Sullivan:
Perceived Swedish vowel quantity: effects of postvocalic consonant duration. - Anne Cutler, Rebecca Treiman, Brit van Ooijen:
Orthografik inkoncistensy ephekts in foneme detektion? - Bruce L. Derwing, Terrance M. Nearey, Yeo Bom Yoon:
The effect of orthographic knowledge on the segmentation of speech. - James M. McQueen, Anne Cutler:
Spotting (different types of) words in (different types of) context. - Manjari Ohala, John J. Ohala:
Correlation between consonantal VC transitions and degree of perceptual confusion of place contrast in Hindi. - David House, Dik J. Hermes, Frédéric Beaugendre:
Perception of tonal rises and falls for accentuation and phrasing in Swedish. - Steven Greenberg, Takayuki Arai, Rosaria Silipo:
Speech intelligibility derived from exceedingly sparse spectral information.
Speech and Hearing Disorders 1
- Mark C. Flynn, Richard C. Dowell, Graeme M. Clark:
Adults with a severe-to-profound hearing impairment: investigating the effects of linguistic context on speech perception. - Florien J. Koopmans-van Beinum, Caroline E. Schwippert, Cecile T. L. Kuijpers:
Speech perception in dyslexia: measurements from birth onwards. - Karen Croot:
An acoustic analysis of vowel production across tasks in a case of non-fluent progressive aphasia. - Jan van Doorn, Sharynne McLeod, Elise Baker, Alison Purcell, William Thorpe:
Speech technology in clinical environments.
Spoken Language Understanding Systems 1-4
- Stephanie Seneff, Edward Hurley, Raymond Lau, Christine Pao, Philipp Schmid, Victor Zue:
GALAXY-II: a reference architecture for conversational system development. - Grace Chung, Stephanie Seneff:
Improvements in speech understanding accuracy through the integration of hierarchical linguistic, prosodic, and phonological constraints in the JUPITER domain. - Kenney Ng:
Towards robust methods for spoken document retrieval. - Richard Sproat, Jan P. H. van Santen:
Automatic ambiguity detection. - Julia Fischer, Jürgen Haas, Elmar Nöth, Heinrich Niemann, Frank Deinzer:
Empowering knowledge based speech understanding through statistics. - Akito Nagai, Yasushi Ishikawa:
Concept-driven speech understanding incorporated with a statistic language model. - José Colás, Javier Ferreiros, Juan Manuel Montero, Julio Pastor, Ascensión Gallardo-Antolín, José Manuel Pardo:
On the limitations of stochastic conceptual finite-state language models for speech understanding. - Todd Ward, Salim Roukos, Chalapathy Neti, Jerome Gros, Mark Epstein, Satya Dharanipragada:
Towards speech understanding across multiple languages. - Andreas Stolcke, Elizabeth Shriberg, Rebecca A. Bates, Mari Ostendorf, Dilek Zeynep Hakkani, Madelaine Plauché, Gökhan Tür, Yu Lu:
Automatic detection of sentence boundaries and disfluencies based on recognized words. - Wolfgang Reichl, Bob Carpenter, Jennifer Chu-Carroll, Wu Chou:
Language modeling for content extraction in human-computer dialogues. - John Gillett, Wayne H. Ward:
A language model combining trigrams and stochastic context-free grammars. - Bernd Souvignier, Andreas Kellner:
Online adaptation of language models in spoken dialogue systems. - Giuseppe Riccardi, Alexandros Potamianos, Shrikanth S. Narayanan:
Language model adaptation for spoken language systems. - Brigitte Bigi, Renato de Mori, Marc El-Bèze, Thierry Spriet:
Detecting topic shifts using a cache memory. - Lori S. Levin, Ann E. Thymé-Gobbel, Alon Lavie, Klaus Ries, Klaus Zechner:
A discourse coding scheme for conversational Spanish. - Kazuhiro Arai, Jeremy H. Wright, Giuseppe Riccardi, Allen L. Gorin:
Grammar fragment acquisition using syntactic and semantic clustering. - Tom Brøndsted:
Non-expert access to unification based speech understanding. - Bob Carpenter, Jennifer Chu-Carroll:
Natural language call routing: a robust, self-organizing approach. - Debajit Ghosh, David Goddeau:
Automatic grammar induction from semantic parsing. - Yasuyuki Kono, Takehide Yano, Munehiko Sasajima:
BTH: an efficient parsing algorithm for word-spotting. - Susanne Kronenberg, Franz Kummert:
Syntax coordination: interaction of discourse and extrapositions. - Bor-Shen Lin, Berlin Chen, Hsin-Min Wang, Lin-Shan Lee:
Hierarchical tag-graph search for spontaneous speech understanding in spoken dialog systems. - Yasuhisa Niimi, Noboru Takinaga, Takuya Nishimoto:
Extraction of the dialog act and the topic from utterances in a spoken dialog system. - Harry Printz:
Fast computation of maximum entropy / minimum divergence feature gain. - Giuseppe Riccardi, Allen L. Gorin:
Stochastic language models for speech recognition and understanding. - Carol Van Ess-Dykema, Klaus Ries:
Linguistically engineered tools for speech recognition error analysis. - Kazuya Takeda, Atsunori Ogawa, Fumitada Itakura:
Estimating entropy of a language from optimal word insertion penalty. - Shu-Chuan Tseng:
A linguistic analysis of repair signals in co-operative spoken dialogues. - Francisco J. Valverde-Albacete, José Manuel Pardo:
A hierarchical language model for CSR. - Jeremy H. Wright, Allen L. Gorin, Alicia Abella:
Spoken language understanding within dialogs using a graphical model of task structure. - Yoshimi Suzuki, Fumiyo Fukumoto, Yoshihiro Sekiguchi:
Keyword extraction of radio news using domain identification based on categories of an encyclopedia.
Signal Processing and Speech Analysis 1-3
- James Droppo, Alex Acero:
Maximum a posteriori pitch tracking. - Dekun Yang, Georg F. Meyer, William A. Ainsworth:
Vowel separation using the reassigned amplitude-modulation spectrum. - Eloi Batlle, Climent Nadeu, José A. R. Fonollosa:
Feature decorrelation methods in speech recognition: a comparative study. - Marie-José Caraty, Claude Montacié:
Multi-resolution for speech analysis. - Steve Cassidy, Catherine I. Watson:
Dynamic features in children's vowels. - Johan de Veth, Lou Boves:
Effectiveness of phase-corrected RASTA for continuous speech recognition. - Satya Dharanipragada, Ramesh A. Gopinath, Bhaskar D. Rao:
Techniques for capturing temporal variations in speech signals with fixed-rate processing. - Limin Du, Kenneth N. Stevens:
Automatic detection of landmark for nasal consonants from speech waveform. - Thierry Dutoit, Juergen Schroeter:
Plug and play software for designing high-level speech processing systems. - Alexandre Girardi, Kiyohiro Shikano, Satoshi Nakamura:
Creating speaker independent HMM models for restricted database using STRAIGHT-TEMPO morphing. - Laure Charonnat, Michel Guitton, Joel Crestel, Gerome Allée:
Restoration of hyperbaric speech by correction of the formants and the pitch. - Juana M. Gutiérrez-Arriola, Yung-Sheng Hsiao, Juan Manuel Montero, José Manuel Pardo, Donald G. Childers:
Voice conversion based on parameter transformation. - Jilei Tian, Ramalingam Hariharan, Kari Laurila:
Noise robust two-stream auditory feature extraction method for speech recognition. - Andrew K. Halberstadt, James R. Glass:
Heterogeneous measurements and multiple classifiers for speech recognition. - Naomi Harte, Saeed Vaseghi, Ben P. Milner:
Joint recognition and segmentation using phonetically derived features and a hybrid phoneme model. - Hynek Hermansky, Sangita Sharma:
TRAPS - classifiers of temporal patterns. - John N. Holmes:
Robust measurement of fundamental frequency and degree of voicing. - John F. Holzrichter, Gregory C. Burnett, Todd J. Gable, Lawrence C. Ng:
Micropower electro-magnetic sensors for speech characterization, recognition, verification, and other applications. - Jia-Lin Shen, Jeih-Weih Hung, Lin-Shan Lee:
Robust entropy-based endpoint detection for speech recognition in noisy environments. - Jia-Lin Shen, Wen-Liang Hwang:
Statistical integration of temporal filter banks for robust speech recognition using linear discriminant analysis (LDA). - Dorota J. Iskra, William H. Edmondson:
Feature-based approach to speech recognition. - Hiroyuki Kamata, Akira Kaneko, Yoshihisa Ishida:
Periodicity emphasis of voice wave using nonlinear IIR digital filters and its applications. - Simon King, Todd A. Stephenson, Stephen Isard, Paul Taylor, Alex Strachan:
Speech recognition via phonetically featured syllables. - Jacques C. Koreman, Bistra Andreeva, William J. Barry:
Do phonetic features help to improve consonant identification in ASR? - Hisao Kuwabara:
Perceptual and acoustic properties of phonemes in continuous speech for different speaking rate. - Joohun Lee, Ki Yong Lee:
On robust sequential estimator based on t-distribution with forgetting factor for speech analysis. - Christopher John Long, Sekharajit Datta:
Discriminant wavelet basis construction for speech recognition. - Hiroshi Matsumoto, Yoshihisa Nakatoh, Yoshinori Furuhata:
An efficient mel-LPC analysis method for speech recognition. - Philip McMahon, Paul M. McCourt, Saeed Vaseghi:
Discriminative weighting of multi-resolution sub-band cepstral features for speech recognition. - Yoram Meron, Keikichi Hirose:
Separation of singing and piano sounds. - Nobuaki Minematsu, Seiichi Nakagawa:
Modeling of variations in cepstral coefficients caused by F0 changes and its application to speech processing. - Partha Niyogi, Partha Mitra, Man Mohan Sondhi:
A detection framework for locating phonetic events. - Climent Nadeu, Félix Galindo, Jaume Padrell:
On frequency averaging for spectral analysis in speech recognition. - Munehiro Namba, Yoshihisa Ishida:
Wavelet transform domain blind equalization and its application to speech analysis. - Steve Pearson:
A novel method of formant analysis and glottal inverse filtering. - Antonio J. Araujo, Vitor C. Pera, Márcio N. de Souza:
Vector quantizer acceleration for an automatic speech recognition application. - Hartmut R. Pfitzinger:
Local speech rate as a combination of syllable and phone rate. - Solange Rossato, Gang Feng, Rafael Laboissière:
Recovering gestures from speech signals: a preliminary study for nasal vowels. - Günther Ruske, Robert Faltlhauser, Thilo Pfau:
Extended linear discriminant analysis (ELDA) for speech recognition. - Ara Samouelian, Jordi Robert-Ribes, Mike Plumpe:
Speech, silence, music and noise classification of TV broadcast material. - Jean Schoentgen, Alain Soquet, Véronique Lecuit, Sorin Ciocea:
The relation between vocal tract shape and formant frequencies can be described by means of a system of coupled differential equations. - Youngjoo Suh, Kyuwoong Hwang, Oh-Wook Kwon, Jun Park:
Improving speech recognizer by broader acoustic-phonetic group classification. - C. William Thorpe:
Separation of speech source and filter by time-domain deconvolution. - Hesham Tolba, Douglas D. O'Shaughnessy:
On the application of the AM-FM model for the recovery of missing frequency bands of telephone speech. - Chang-Sheng Yang, Hideki Kasuya:
Estimation of voice source and vocal tract parameters using combined subspace-based and amplitude spectrum-based algorithm. - Fang Zheng, Zhanjiang Song, Ling Li, Wenjian Yu, Fengzhou Zheng, Wenhu Wu:
The distance measure for line spectrum pairs applied to speech recognition. - William A. Ainsworth, Charles Robert Day, Georg F. Meyer:
Improving pitch estimation with short duration speech samples. - Hideki Kawahara, Alain de Cheveigné, Roy D. Patterson:
An instantaneous-frequency-based pitch extraction method for high-quality speech transformation: revised TEMPO in the STRAIGHT-suite. - Kiyoaki Aikawa:
Speaker-independent speech recognition using micro segment spectrum integration. - Keiichi Funaki, Yoshikazu Miyanaga, Koji Tochinai:
On robust speech analysis based on time-varying complex AR model. - Hynek Hermansky, Narendranath Malayath:
Spectral basis functions from discriminant analysis. - Shin Suzuki, Takeshi Okadome, Masaaki Honda:
Determination of articulatory positions from speech acoustics by applying dynamic articulatory constraints. - Yang Li, Yunxin Zhao:
Recognizing emotions in speech using short-term and long-term features. - Arnaud Robert, Jan Eriksson:
Periphear: a nonlinear active model of the auditory periphery. - Padma Ramesh, Partha Niyogi:
The voicing feature for stop consonants: acoustic phonetic analyses and automatic speech recognition experiments. - Sankar Basu, Stéphane H. Maes:
Wavelet-based energy binning cepstral features for automatic speech recognition. - Carlos Silva, Samir Chennoukh:
Articulatory analysis using a codebook for articulatory based low bit-rate speech coding.
Spoken Language Generation and Translation 1-2
- Fang Chen, Baozong Yuan:
The modeling and realization of natural speech generation system. - Robert Eklund:
"ko tok ples ensin bilong tok pisin" or the TP-CLE: a first report from a pilot speech-to-speech translation project from Swedish to tok pisin. - Ismael García-Varea, Francisco Casacuberta, Hermann Ney:
An iterative, DP-based search algorithm for statistical machine translation. - Barbara Gawronska, David House:
Information extraction and text generation of news reports for a Swedish-English bilingual spoken dialogue system. - Joris Hulstijn, Arjan van Hessen:
Utterance generation for transaction dialogues. - Kai Ishikawa, Eiichiro Sumita, Hitoshi Iida:
Example-based error recovery method for speech translation: repairing sub-trees according to the semantic distance. - Emiel Krahmer, Mariët Theune:
Context sensitive generation of descriptions. - Lori S. Levin, Donna Gates, Alon Lavie, Alex Waibel:
An interlingua based on domain actions for machine translation of task-oriented dialogues. - Sandra Williams:
Generating pitch accents in a concept-to-speech system using a knowledge base. - Tobias Ruland, C. J. Rupp, Jörg Spilker, Hans Weber, Karsten L. Worm:
Making the most of multiplicity: a multi-parser multi-strategy architecture for the robust processing of spoken language. - Jon R. W. Yi, James R. Glass:
Natural-sounding speech synthesis using variable-length units. - Esther Klabbers, Emiel Krahmer, Mariët Theune:
A generic algorithm for generating spoken monologues. - Janet Hitzeman, Alan W. Black, Paul Taylor, Chris Mellish, Jon Oberlander:
On the use of automatically generated discourse-level information in a concept-to-speech synthesis system. - Hiyan Alshawi, Srinivas Bangalore, Shona Douglas:
Learning phrase-based head transduction models for translation of spoken utterances. - Toshiaki Fukada, Detlef Koll, Alex Waibel, Kouichi Tanigaki:
Probabilistic dialogue act extraction for concept based multilingual translation systems. - Ye-Yi Wang, Alex Waibel:
Fast decoding for statistical machine translation. - Toshiyuki Takezawa, Tsuyoshi Morimoto, Yoshinori Sagisaka, Nick Campbell, Hitoshi Iida, Fumiaki Sugaya, Akio Yokoo, Seiichi Yamamoto:
A Japanese-to-English speech translation system: ATR-MATRIX.
Segmentation, Labelling and Speech Corpora 1-4
- Julia Hirschberg, Christine H. Nakatani:
Acoustic indicators of topic segmentation. - Esther Grabe, Francis Nolan, Kimberley J. Farrar:
IViE - a comparative transcription system for intonational variation in English. - Fu-Chiang Chou, Chiu-yu Tseng, Lin-Shan Lee:
Automatic segmental and prosodic labeling of Mandarin speech database. - Stefan Rapp:
Automatic labelling of German prosody. - Matti Karjalainen, Toomas Altosaar, Miikka Huttunen:
An efficient labeling tool for the Quicksig speech database. - Harry Bratt, Leonardo Neumeyer, Elizabeth Shriberg, Horacio Franco:
Collection and detailed transcription of a speech database for development of language learning technologies. - Neeraj Deshmukh, Aravind Ganapathiraju, Andi Gleeson, Jonathan Hamaker, Joseph Picone:
Resegmentation of SWITCHBOARD. - Demetrio Aiello, Cristina Delogu, Renato de Mori, Andrea Di Carlo, Marina Nisi, Silvia Tummeacciu:
Automatic generation of visual scenarios for spoken corpora acquisition. - Mauro Cettolo, Daniele Falavigna:
Automatic detection of semantic boundaries based on acoustic and lexical knowledge. - Iman Gholampour, Kambiz Nayebi:
A new fast algorithm for automatic segmentation of continuous speech. - Akemi Iida, Nick Campbell, Soichiro Iga, Fumito Higuchi, Michiaki Yasumura:
Acoustic nature and perceptual testing of corpora of emotional speech. - Pyungsu Kang, Jiyoung Kang, Jinyoung Kim:
Korean prosodic break index labelling by a new mixed method of LDA and VQ. - Mark R. Laws, Richard Kilgour:
MOOSE: management of Otago speech environment. - Fabrice Malfrère, Olivier Deroo, Thierry Dutoit:
Phonetic alignment: speech synthesis based vs. hybrid HMM/ANN. - J. Bruce Millar:
Customisation and quality assessment of spoken language description. - Claude Montacié, Marie-José Caraty:
A silence/noise/music/speech splitting algorithm. - David Pye, Nicholas J. Hollinghurst, Timothy J. Mills, Kenneth R. Wood:
Audio-visual segmentation for content-based retrieval. - Stefan Rapp, Grzegorz Dogil:
Same news is good news: automatically collecting reoccurring radio news stories. - Christel Brindöpke, Brigitte Schaffranietz:
An annotation system for melodic aspects of German spontaneous speech. - Karlheinz Stöber, Wolfgang Hess:
Additional use of phoneme duration hypotheses in automatic speech segmentation. - Amy Isard, David McKelvie, Henry S. Thompson:
Towards a minimal standard for dialogue transcripts: a new SGML architecture for the HCRC Map Task corpus. - Pedro J. Moreno, Christopher F. Joerg, Jean-Manuel Van Thong, Oren Glickman:
A recursive algorithm for the forced alignment of very long audio segments. - Judith M. Kessens, Mirjam Wester, Catia Cucchiarini, Helmer Strik:
The selection of pronunciation variants: comparing the performance of man and machine. - Jon Barker, Gethin Williams, Steve Renals:
Acoustic confidence measures for segmenting broadcast news. - Bryan L. Pellom, John H. L. Hansen:
A duration-based confidence measure for automatic segmentation of noise corrupted speech. - Thomas Hain, Philip C. Woodland:
Segmentation and classification of broadcast news audio. - Børge Lindberg, Robrecht Comeyne, Christoph Draxler, Francesco Senia:
Speaker recruitment methods and speaker coverage - experiences from a large multilingual speech database collection. - Estelle Campione, Jean Véronis:
A multilingual prosodic database. - Ronald A. Cole, Mike Noel, Victoria Noel:
The CSLU speaker recognition corpus. - Gregory Aist, Peggy Chan, Xuedong Huang, Li Jiang, Rebecca Kennedy, DeWitt Latimer IV, Jack Mostow, Calvin Yeung:
How effective is unsupervised data collection for children's speech recognition? - Jyh-Shing Shyuu, Jhing-Fa Wang:
An algorithm for automatic generation of Mandarin phonetic balanced corpus. - Steven Bird, Mark Liberman:
Towards a formal framework for linguistic annotations. - Toomas Altosaar, Martti Vainio:
Forming generic models of speech for uniform database access.
Large Vocabulary Continuous Speech Recognition 1-6
- Gary D. Cook, Tony Robinson, James Christie:
Real-time recognition of broadcast news. - Ha-Jin Yu, Hoon Kim, Jae-Seung Choi, Joon-Mo Hong, Kew-Suh Park, Jong-Seok Lee, Hee-Youn Lee:
Automatic recognition of Korean broadcast news speech. - James R. Glass, Timothy J. Hazen:
Telephone-based conversational speech recognition in the JUPITER domain. - Hsiao-Wuen Hon, Yun-Cheng Ju, Keiko Otani:
Japanese large-vocabulary continuous speech recognition system based on Microsoft Whisper. - Jean-Luc Gauvain, Lori Lamel, Gilles Adda:
Partitioning and transcription of broadcast news data. - Hajime Tsukada, Hirofumi Yamamoto, Toshiyuki Takezawa, Yoshinori Sagisaka:
Grammatical word graph re-generation for spontaneous speech recognition. - Norimichi Yodo, Kiyohiro Shikano, Satoshi Nakamura:
Compression algorithm of trigram language models based on maximum likelihood estimation. - Ulla Uebler, Heinrich Niemann:
Morphological modeling of word classes for language models. - Imed Zitouni, Kamel Smaïli, Jean Paul Haton, Sabine Deligne, Frédéric Bimbot:
A comparative study between polyclass and multiclass language models. - Dietrich Klakow:
Log-linear interpolation of language models. - Philip Clarkson, Tony Robinson:
The applicability of adaptive language modelling for the broadcast news task. - Long Nguyen, Richard M. Schwartz:
The BBN single-phonetic-tree fast-match algorithm. - Akinobu Lee, Tatsuya Kawahara, Shuji Doshita:
An efficient two-pass search algorithm using word trellis index. - Mike Schuster:
Nozomi - a fast, memory-efficient stack decoder for LVCSR. - Thomas Kemp, Alex Waibel:
Reducing the OOV rate in broadcast news speech recognition. - Michiel Bacchiani, Mari Ostendorf:
Using automatically-derived acoustic sub-word units in large vocabulary speech recognition. - Don McAllaster, Lawrence Gillick, Francesco Scattone, Michael Newman:
Fabricating conversational speech data with acoustic models: a program to examine model-data mismatch. - Wu Chou, Wolfgang Reichl:
High resolution decision tree based acoustic modeling beyond CART. - Thomas Kemp, Alex Waibel:
Unsupervised training of a speech recognizer using TV broadcasts. - Clark Z. Lee, Douglas D. O'Shaughnessy:
A new method to achieve fast acoustic matching for speech recognition. - Jacques Duchateau, Kris Demuynck, Dirk Van Compernolle, Patrick Wambacq:
Improved parameter tying for efficient acoustic model evaluation in large vocabulary continuous speech recognition. - Ananth Sankar:
A new look at HMM parameter tying for large vocabulary speech recognition. - Ramesh A. Gopinath, Bhuvana Ramabhadran, Satya Dharanipragada:
Factor analysis invariant to linear transformations of data. - Akio Ando, Akio Kobayashi, Toru Imai:
A thesaurus-based statistical language model for broadcast news transcription. - Sreeram V. Balakrishnan:
Effect of task complexity on search strategies for the Motorola Lexicus continuous speech recognition system. - Dhananjay Bansal, Mosur K. Ravishankar:
New features for confidence annotation. - Jerome R. Bellegarda:
Multi-Span statistical language modeling for large vocabulary speech recognition. - Rathinavelu Chengalvarayan:
Maximum-likelihood updates of HMM duration parameters for discriminative continuous speech recognition. - Noah Coccaro, Daniel Jurafsky:
Towards better integration of semantic predictors in statistical language modeling. - Julio Pastor, José Colás, Rubén San Segundo, José Manuel Pardo:
An asymmetric stochastic language model based on multi-tagged words. - Vassilios Digalakis, Leonardo Neumeyer, Manolis Perakakis:
Product-code vector quantization of cepstral parameters for speech recognition over the WWW. - Bernard Doherty, Saeed Vaseghi, Paul M. McCourt:
Context dependent tree based transforms for phonetic speech recognition. - Michael T. Johnson, Mary P. Harper, Leah H. Jamieson:
Interfacing acoustic models with natural language processing systems. - Photina Jaeyun Jang, Alexander G. Hauptmann:
Hierarchical cluster language modeling with statistical rule extraction for rescoring n-best hypotheses during speech decoding. - Atsuhiko Kai, Yoshifumi Hirose, Seiichi Nakagawa:
Dealing with out-of-vocabulary words and speech disfluencies in an n-gram based speech understanding system. - Tetsunori Kobayashi, Yosuke Wada, Norihiko Kobayashi:
Source-extended language model for large vocabulary continuous speech recognition. - Akio Kobayashi, Kazuo Onoe, Toru Imai, Akio Ando:
Time dependent language model for broadcast news transcription and its post-correction. - Jacques C. Koreman, William J. Barry, Bistra Andreeva:
Exploiting transitions and focussing on linguistic properties for ASR. - Raymond Lau, Stephanie Seneff:
A unified framework for sublexical and linguistic modelling supporting flexible vocabulary speech understanding. - Lalit R. Bahl, Steven V. De Gennaro, Pieter de Souza, Edward A. Epstein, J. M. Le Roux, Burn L. Lewis, Claire Waast:
A method for modeling liaison in a speech recognition system for French. - Fu-Hua Liu, Michael Picheny:
On variable sampling frequencies in speech recognition. - Kristine W. Ma, George Zavaliagkos, Rukmini Iyer:
Pronunciation modeling for large vocabulary conversational speech recognition. - Sankar Basu, Abraham Ittycheriah, Stéphane H. Maes:
Time shift invariant speech recognition. - José B. Mariño, Pau Pachès-Leal, Albino Nogueiras:
The demiphone versus the triphone in a decision-tree state-tying framework. - Shinsuke Mori, Masafumi Nishimura, Nobuyasu Itoh:
Word clustering for a word bi-gram model. - João Paulo Neto, Ciro Martins, Luís B. Almeida:
A large vocabulary continuous speech recognition hybrid system for the Portuguese language. - Mukund Padmanabhan, Bhuvana Ramabhadran, Sankar Basu:
Speech recognition performance on a new voicemail transcription task. - Sira E. Palazuelos, Santiago Aguilera, José Rodrigo, Juan Ignacio Godino-Llorente:
Grammatical and statistical word prediction system for Spanish integrated in an aid for people with disabilities. - Kishore Papineni, Satya Dharanipragada:
Segmentation using a maximum entropy approach. - Adam L. Berger, Harry Printz:
Recognition performance of a large-scale dependency grammar language model. - Ganesh N. Ramaswamy, Harry Printz, Ponani S. Gopalakrishnan:
A bootstrap technique for building domain-dependent language models. - Joan-Andreu Sánchez, José-Miguel Benedí:
Estimation of the probability distributions of stochastic context-free grammars from the k-best derivations. - Ananth Sankar:
Robust HMM estimation with Gaussian merging-splitting and tied-transform HMMs. - Kristie Seymore, Stanley F. Chen, Ronald Rosenfeld:
Nonlinear interpolation of topic models for language model adaptation. - Kazuyuki Takagi, Rei Oguro, Kenji Hashimoto, Kazuhiko Ozeki:
Performance evaluation of word phrase and noun category language models for broadcast news speech recognition. - Hesham Tolba, Douglas D. O'Shaughnessy:
Robust automatic continuous-speech recognition based on a voiced-unvoiced decision. - Juan Carlos Torrecilla, Ismael Cortázar, Luis A. Hernández Gómez:
Double tree beam search using hierarchical subword units. - Paul van Mulbregt, Ira Carp, Lawrence Gillick, Steve Lowe, Jon Yamron:
Text segmentation and topic tracking on broadcast news via a hidden Markov model approach. - Philip O'Neill, Saeed Vaseghi, Bernard Doherty, Wooi-Haw Tan, Paul M. McCourt:
Multi-phone strings as subword units for speech recognition. - Nanette Veilleux, Stefanie Shattuck-Hufnagel:
Phonetic modification of the syllable /tu/ in two spontaneous American English dialogues. - Fuliang Weng, Andreas Stolcke, Ananth Sankar:
Efficient lattice representation and generation. - Mirjam Wester, Judith M. Kessens, Helmer Strik:
Modeling pronunciation variation for a Dutch CSR: testing three methods. - Edward W. D. Whittaker, Philip C. Woodland:
Comparison of language modelling techniques for Russian and English. - Petra Witschel:
Optimized POS-based language models for large vocabulary speech recognition. - Mark Wright, Simon Hovell, Simon Ringland:
Reducing peak search effort using two-tier pruning. - George Zavaliagkos, Man-Hung Siu, Thomas Colthurst, Jayadev Billa:
Using untranscribed training data to improve performance. - Ea-Ee Jan, Raimo Bakis, Fu-Hua Liu, Michael Picheny:
Telephone band LVCSR for hearing-impaired users. - Antonio Bonafonte, José B. Mariño:
Using x-gram for efficient speech recognition. - Tatsuya Kawahara, Tetsunori Kobayashi, Kazuya Takeda, Nobuaki Minematsu, Katsunobu Itou, Mikio Yamamoto, Atsushi Yamada, Takehito Utsuro, Kiyohiro Shikano:
Sharable software repository for Japanese large vocabulary continuous speech recognition. - Katunobu Itou, Mikio Yamamoto, Kazuya Takeda, Toshiyuki Takezawa, Tatsuo Matsuoka, Tetsunori Kobayashi, Kiyohiro Shikano, Shuichi Itahashi:
The design of the newspaper-based Japanese large vocabulary continuous speech recognition corpus. - Jun Ogata, Yasuo Ariki:
Indexing and classification of TV news articles based on speech dictation using word bigram. - Man-Hung Siu, Rukmini Iyer, Herbert Gish, Carl Quillen:
Parametric trajectory mixtures for LVCSR.
Speech Technology Applications and Human-Machine Interface 1-3
- Axel Glaeser, Frédéric Bimbot:
Steps toward the integration of speaker recognition in real-world telecom applications. - Hyun-Yeol Chung, Cheol-Jun Hwang, Shi-wook Lee:
A bimodal Korean address entry/retrieval system. - Cristina Delogu, Andrea Di Carlo, Paolo Rotundi, Danilo Sartori:
Usability evaluation of IVR systems with DTMF and ASR. - Pascale Fung, Chi Shun Cheung, Kwok Leung Lam, Wai Kat Liu, Yuen Yee Lo:
SALSA version 1.0: a speech-based web browser for Hong Kong English. - Andrew N. Pargellis, Qiru Zhou, Antoine Saad, Chin-Hui Lee:
A language for creating speech applications. - Robert Graham, Chris Carter, Brian Mellor:
The use of automatic speech recognition to reduce the interference between concurrent tasks of driving and phoning. - Makoto J. Hirayama, Taro Sugahara, Zhiyong Peng, Junichi Yamazaki:
Interactive listening to structured speech content on the internet. - Cheol-Woo Jo:
MSF format for the representation of speech synchronized moving image. - Pernilla Qvarfordt, Arne Jönsson:
Effects of using speech in timetable information systems for WWW. - Thomas Kemp, Petra Geutner, Michael Schmidt, Borislav Tomaz, Manfred Weber, Martin Westphal, Alex Waibel:
The interactive systems labs view4you video indexing system. - Hyung-Jin Kim, I. Lee Hetherington:
SEMOLE: a robust framework for gathering information from the world wide web. - Lau Bakman, Mads Blidegn, Martin Wittrup, Lars Bo Larsen, Thomas B. Moeslund:
Enhancing a WIMP based interface with speech, gaze tracking and agents. - Christine H. Nakatani, Steve Whittaker, Julia Hirschberg:
Now you hear it, now you don't: empirical studies of audio browsing behavior. - Rongyu Qiao, Youngkyu Choi, Johnson I. Agbinya:
A voice verifier for face/voice based person verification system. - Jordi Robert-Ribes:
On the use of automatic speech recognition for TV captioning. - Ben Serridge:
An undergraduate course on speech recognition based on the CSLU toolkit. - Ping-Fai Yang, Yannis Stylianou:
Real time voice alteration based on linear prediction. - Beng Tiong Tan, Yong Gu, Trevor Thomas:
Evaluation and implementation of a voice-activated dialing system with utterance verification. - Hsin-Min Wang, Bor-Shen Lin, Berlin Chen, Bo-Ren Bai:
Towards a Mandarin voice memo system. - Tsubasa Shinozaki, Masanobu Abe:
Development of CAI system employing synthesized speech responses. - Andreas Kellner, Bernhard Rueber, Hauke Schramm:
Using combined decisions and confidence measures for name recognition in automatic directory assistance systems. - Bruce Buntschuh, Candace A. Kamm, Giuseppe Di Fabbrizio, Alicia Abella, Mehryar Mohri, Shrikanth S. Narayanan, Ilija Zeljkovic, R. Doug Sharp, Jeremy H. Wright, S. Marcus, J. Shaffer, R. Duncan, Jay G. Wilpon:
VPQ: a spoken language interface to large scale directory information. - John Choi, Donald Hindle, Julia Hirschberg, Ivan Magrin-Chagnolleau, Christine H. Nakatani, Fernando C. N. Pereira, Amit Singhal, Steve Whittaker:
SCAN - speech content based audio navigator: a system overview. - Javier Ferreiros, José Colás, Javier Macías Guarasa, Alejandro Ruiz, José Manuel Pardo:
Controlling a HIFI with a continuous speech understanding system. - Lori Lamel, Samir Bennacef, Jean-Luc Gauvain, Hervé Dartigues, Jean-Noël Temem:
User evaluation of the MASK kiosk. - Niels Ole Bernsen, Laila Dybkjær:
Is speech the right thing for your application? - Juan Ignacio Godino-Llorente, Santiago Aguilera-Navarro, Sira E. Palazuelos-Cagigas, Alberto Nieto Altuzarra, Pedro Gómez Vilda:
A PC-based tool for helping in diagnosis of pathologic voice. - Kåre Sjölander, Jonas Beskow, Joakim Gustafson, Erland Lewin, Rolf Carlson, Björn Granström:
Web-based educational tools for speech technology. - Stephen Sutton, Ronald A. Cole, Jacques de Villiers, Johan Schalkwyk, Pieter J. E. Vermeulen, Michael W. Macon, Yonghong Yan, Edward C. Kaiser, Brian Rundle, Khaldoun Shobaki, John-Paul Hosom, Alexander Kain, Johan Wouters, Dominic W. Massaro, Michael M. Cohen:
Universal speech tools: the CSLU toolkit. - Ben Serridge, Alejandro Barbosa, Ronald A. Cole, Nora Munive, Alcira Vargas:
Creating a Mexican Spanish version of the CSLU toolkit. - Carmen García-Mateo, Qiru Zhou, Chin-Hui Lee, Andrew N. Pargellis:
A voice user interface demonstration system for Mexican Spanish.
Language Acquisition 1-2
- Yasuyo Minagawa-Kawai, Shigeru Kiritani:
Non-native productions of Japanese single stops that are too long for one mora unit. - Nobuko Yamada:
The process of generation and development of second language Japanese accentuation. - Seiya Funatsu, Shigeru Kiritani:
Perceptual properties of Russians with Japanese fricatives. - Catia Cucchiarini, Febe de Wet, Helmer Strik, Lou Boves:
Assessment of Dutch pronunciation by means of automatic speech recognition technology. - Philippe Langlais, Anne-Marie Öster, Björn Granström:
Phonetic-level mispronunciation detection in non-native Swedish speech. - Reiko Akahane-Yamada, Erik McDermott, Takahiro Adachi, Hideki Kawahara, John S. Pruitt:
Computer-based second language production training by using spectrographic representation and HMM-based speech recognition scores. - Debra M. Hardison:
Spoken word identification by native and nonnative speakers of English: effects of training, modality, context and phonetic environment. - Michael D. Tyler:
The effect of background knowledge on first and second language comprehension difficulty. - Kimiko Tsukada:
Comparison of cross-language coarticulation: English, Japanese and Japanese-accented English. - Satoshi Imaizumi, Hidemi Itoh, Yuji Tamekawa, Toshisada Deguchi, Koichi Mori:
Plasticity of non-native phonetic perception and production: a training study. - Ian Watson:
The relation between perceptual and production categories in acquisition. - Valérie Hazan, Sarah Barrett:
The development of perceptual cue-weighting in children aged 6 to 12.
Acoustic Phonetics 1-2
- Anne Cutler, Takashi Otake:
Assimilation of place in Japanese and Dutch. - Yuko Kondo, Yumiko Arai:
Prosodic constraint on v-to-v coarticulation in Japanese. - Catia Cucchiarini, Henk van den Heuvel:
Postvocalic /r/-deletion in standard Dutch: how experimental phonology can profit from ASR technology. - John Hajek, Ian Watson:
More evidence for the perceptual basis of sound change? Suprasegmental effects in the development of distinctive nasalization. - Jianwu Dang, Kiyoshi Honda:
Speech production of vowel sequences using a physiological articulatory model. - Felicity Cox, Sallyanne Palethorpe:
Regional variation in the vowels of female adolescents from Sydney. - Catherine I. Watson, Jonathan Harrington, Sallyanne Palethorpe:
A kinematic analysis of New Zealand and Australian English vowel spaces. - Noël Nguyen, Sarah Hawkins:
Syllable-onset acoustic properties associated with syllable-coda voicing. - Noël Nguyen, Alan Wrench, Fiona Gibbon, William J. Hardcastle:
Articulatory, acoustic and perceptual aspects of fricative-stop coarticulation. - R. J. J. H. van Son, Florien J. Koopmans-van Beinum, Louis C. W. Pols:
Efficiency as an organizing principle of natural speech. - Inger Karlsson, Tanja Bänziger, Jana Dankovicová, Tom Johnstone, Johan Lindberg, Håkan Melin, Francis Nolan, Klaus R. Scherer:
Within-speaker variability due to speaking manners.
Speaker Adaptation 2-3
- Roland Kuhn, Patrick Nguyen, Jean-Claude Junqua, Lloyd Goldwasser, Nancy Niedzielski, Steven Fincke, Kenneth L. Field, Matteo Contolini:
Eigenvoices for speaker adaptation. - Sue E. Johnson, Philip C. Woodland:
Speaker clustering using direct maximisation of the MLLR-adapted likelihood. - Olli Viikki, Kari Laurila:
Incremental on-line speaker adaptation in adverse conditions. - Mark J. F. Gales:
Cluster adaptive training for speech recognition. - Jen-Tzung Chien:
On-line hierarchical transformation of hidden Markov models for speaker adaptation. - Motoyuki Suzuki, Toshiaki Abe, Hiroki Mori, Shozo Makino, Hirotomo Aso:
High-speed speaker adaptation using phoneme dependent tree-structured speaker clustering. - Tasos Anastasakos, Sreeram V. Balakrishnan:
The use of confidence measures in unsupervised adaptation of speech recognizers. - John W. McDonough, William Byrne, Xiaoqiang Luo:
Speaker normalization with all-pass transforms. - Rong Zheng, Zuoying Wang:
Toward on-line learning of Chinese continuous speech recognition system. - Sharon L. Oviatt:
The CHAM model of hyperarticulate adaptation during human-computer error resolution.
Multilingual Perception and Recognition 1
- Ulla Uebler, Michael Schüßler, Heinrich Niemann:
Bilingual and dialectal adaptation and retraining. - Tanja Schultz, Alex Waibel:
Language independent and language adaptive large vocabulary speech recognition. - Goh Kawai, Keikichi Hirose:
A method for measuring the intelligibility and nonnativeness of phone quality in foreign language pronunciation training.
Language Acquisition 3 / Multilingual Perception and Recognition 2
- Peter J. Blamey, Julia Sarant, Tanya Serry, Roger Wales, Christopher James, Johanna Barry, Graeme M. Clark, M. Wright, R. Tooher, C. Psarros, G. Godwin, M. Rennie, T. Meskin:
Speech perception and spoken language in children with impaired hearing. - Catia Cucchiarini, Helmer Strik, Lou Boves:
Quantitative assessment of second language learners' fluency: an automatic approach. - Paul Dalsgaard, Ove Andersen, William J. Barry:
Cross-language merged speech units and their descriptive phonetic correlates. - Robert Eklund, Elizabeth Shriberg:
Crosslinguistic disfluency modelling: a comparative analysis of Swedish and American English human-human and human-machine dialogues. - Horacio Franco, Leonardo Neumeyer:
Calibration of machine scores for pronunciation grading. - Petra Geutner, Michael Finke, Alex Waibel:
Phonetic-distance-based hypothesis driven lexical adaptation for transcribing multilingual broadcast news. - Chul-Ho Jo, Tatsuya Kawahara, Shuji Doshita, Masatake Dantsuji:
Automatic pronunciation error detection and guidance for foreign language learning. - Roger Ho-Yin Leung, Hong C. Leung:
Lexical access for large-vocabulary speech recognition. - Sharlene Liu, Sean Doyle, Allen Morris, Farzad Ehsani:
The effect of fundamental frequency on Mandarin speech recognition. - Duncan Markham:
The perception of nativeness: variable speakers and flexible listeners. - Michael F. McTear, Eamonn A. O'Hare:
Voice dictation in the secondary school classroom. - Kazuo Nakayama, Kaoru Tomita-Nakayama:
The importance of the first syllable in English spoken word recognition by adult Japanese speakers. - Anne-Marie Öster:
Spoken L2 teaching with contrastive visual and auditory feedback. - Dominiek Sandra, Steven Gillis:
The role of phonological, morphological, and orthographic knowledge in the intuitive syllabification of Dutch words: a longitudinal approach. - Ayako Shirose, Haruo Kubozono, Shigeru Kiritani:
The acquisition of Japanese compound accent rule. - Lydia K. H. So, Zhou Jing:
The acquisition of Putonghua phonology. - Kaoru Tomita-Nakayama, Kazuo Nakayama, Masayuki Misaki:
Enhancing speech processing of Japanese learners of English utilizing time-scale expansion with constant pitch. - Volker Warnke, Elmar Nöth, Jan Buckow, Stefan Harbeck, Heinrich Niemann:
A bootstrap training approach for language model classifiers. - Sandra P. Whiteside, Jeni Marshall:
Voice onset time patterns in 7-, 9- and 11-year old children. - Sandra P. Whiteside, Carolyn Hodgson:
Some developmental patterns in the speech of 6-, 8- and 10-year old children: an acoustic phonetic study. - Lisa-Jane Brown, John Locke, Peter Jones, Sandra P. Whiteside:
Language development after extreme childhood deprivation: a case study. - Geoff Williams, Mark Terry, Jonathan Kaye:
Phonological elements as a basis for language-independent ASR. - Claudio Zmarich, Roberta Lanni:
A phonetic and acoustic study of babbling in an Italian child. - Roland Kuhn, Jean-Claude Junqua, Philip D. Martzen:
Rescoring multiple pronunciations generated from spelled words.
Speech and Hearing Disorders 2 / Speech Processing for the Speech and Hearing Impaired 1
- Yolanda Blanco, Maria Cuellar, Arantxa Villanueva, Fernando Lacunza, Rafael Cabeza, Beatriz Marcotegui:
SIVHA, visual speech synthesis system. - Christel G. de Bruijn, Sandra P. Whiteside, P. A. Cudd, D. Syder, K. M. Rosen, L. Nord:
Using automatic speech recognition and its possible effects on the voice. - Robert Alexander Fearn:
The importance of F0 or voice pitch for perception of tonal language: simulations with cochlear implant speech processing strategies. - Karin Brunnegaard, Katja Laakso, Lena Hartelius, Elisabeth Ahlsén:
Assessing high-level language in individuals with multiple sclerosis: a pilot study. - Shizuo Hiki, Kazuya Imaizumi, Yumiko Fukuda:
Design of cochlear implant device for transmitting voice pitch information in speech sound of Asian languages. - Aileen K. Ho, John L. Bradshaw, Robert Iansek, Robin J. Alfredson:
Abnormal volume-duration relationship in parkinsonian speech. - Cheol-Woo Jo, Dae-Hyun Kim:
Analysis of disordered speech signal using wavelet transform. - Shigeyoshi Kitazawa, Hiroyuki Kirihata, Tatsuya Kitamura:
Multi-channel pulsation strategy for electric stimulation of cochlea. - Eva Agelfors, Jonas Beskow, Martin Dahlquist, Björn Granström, Magnus Lundeberg, Karl-Erik Spens, Tobias Öhman:
Synthetic faces as a lipreading support. - Lois Martin, John Bench:
Predicting language scores from the speech perception scores of hearing-impaired children. - Oleg P. Skljarov:
Content-independent duration model on categories of voiced and unvoiced segments. - Ali-Asghar Soltani-Farani, Edward H. S. Chilton, Robin Shirley:
Dynamical spectrogram, an aid for the deaf. - Rosemary A. Varley, Sandra P. Whiteside:
Evidence of dual-route phonetic encoding from apraxia of speech: implications for phonetic encoding models. - Margaret F. Cheesman, K. L. Smilsky, T. M. Major, F. Lewis, L. M. Boorman:
Speech communication profiles across the adult lifespan: persons without self-identified hearing impairment.
Human Speech Production
- William J. Barry:
Time as a factor in the acoustic variation of schwa. - Hendrik F. V. Boshoff, Elizabeth C. Botha:
On the structure of vowel space: a genealogy of general phonetic concepts. - Véronique Lecuit, Didier Demolin:
The relationship between intensity and subglottal pressure with controlled pitch. - Alain Soquet, Véronique Lecuit, Thierry Metens, Bruno Nazarian, Didier Demolin:
Segmentation of the airway from the surrounding tissues on magnetic resonance images: a comparative study. - Sorin Dusan, Li Deng:
Recovering vocal tract shapes from MFCC parameters. - John H. Esling, Jocelyn Clayards, Jerold A. Edmondson, Qiu Fuyuan, Jimmy G. Harris:
Quantification of pharyngeal articulations using measurements from laryngoscopic images. - Janice Fon:
Variance and invariance in speech rate as a reflection of conceptual planning. - Masako Fujimoto, Emi Z. Murano, Seiji Niimi, Shigeru Kiritani:
Correspondence between the glottal gesture overlap pattern and vowel devoicing in Japanese. - Yukiko Fujisawa, Nobuaki Minematsu, Seiichi Nakagawa:
Evaluation of Japanese manners of generating word accent of English based on a stressed syllable detection technique. - Shunichi Ishihara:
Independence of consonantal voicing and vocoid F0 perturbation in English and Japanese. - Daniel Jurafsky, Alan Bell, Eric Fosler-Lussier, Cynthia Girand, William D. Raymond:
Reduction of English function words in Switchboard. - Hee-Sun Kim:
Duration compensation in non-adjacent consonant and temporal regularity. - Keisuke Mori, Yorinobu Sonoda:
Relationship between lip shapes and acoustical characteristics during speech. - Kunitoshi Motoki, Hiroki Matsuzaki:
A model to represent propagation and radiation of higher-order modes for 3-d vocal-tract configuration. - Takuya Niikawa, Masafumi Matsumura, Takashi Tachimura, Takeshi Wada:
FEM analysis of aspirated air flow in three-dimensional vocal tract during fricative consonant phonation. - Takeshi Okadome, Tokihiko Kaburagi, Masaaki Honda:
Trajectory formation of articulatory movements for a given sequence of phonemes. - Chilin Shih, Bernd Möbius:
Contextual effects on voicing profiles of German and Mandarin consonants. - Andrew J. Lundberg, Maureen L. Stone:
Reconstructing the tongue surface from six cross-sectional contours: ultrasound data. - Yasushi Terao, Tadao Murata:
Articulability of two consecutive morae in Japanese speech production: evidence from sound exchange errors in spontaneous speech. - Anne Vilain, Christian Abry, Pierre Badin:
Coarticulation and degrees of freedom in the elaboration of a new articulatory plant: GENTIANE. - Masahiko Wakumoto, Shinobu Masaki, Kiyoshi Honda, Toshikazu Ohue:
A pressure sensitive palatography: application of new pressure sensitive sheet for measuring tongue-palatal contact pressure. - Sandra P. Whiteside, Rosemary A. Varley:
Dual-route phonetic encoding: some acoustic evidence. - Brigitte Zellner:
Fast and slow speech rate: a characterisation for French.
Utterance Verification and Word Spotting 2
- Padma Ramesh, Chin-Hui Lee, Biing-Hwang Juang:
Context dependent anti subword modeling for utterance verification. - J. G. A. Dolfing, Andreas Wendemuth:
Combination of confidence measures in isolated word recognition. - Daniel Willett, Andreas Worm, Christoph Neukirchen, Gerhard Rigoll:
Confidence measures for HMM-based speech recognition. - Li Jiang, Xuedong Huang:
Vocabulary-independent word confidence measure using subword features. - Qiguang Lin, Subrata K. Das, David M. Lubensky, Michael Picheny:
A new confidence measure based on rank-ordering subphone scores. - Tatsuya Kawahara, Kentaro Ishizuka, Shuji Doshita, Chin-Hui Lee:
Speaking-style dependent lexicalized filler model for key-phrase detection and verification.
Speech Processing for the Speech-Impaired and Hearing-Impaired 2
- Paul Duchnowski, Louis D. Braida, Maroula Bratakos, David Lum, Matthew Sexton, Jean C. Krause:
A speechreading aid based on phonetic ASR. - Jan Nouza:
Training speech through visual feedback patterns. - Ichiro Maruyama, Yoshiharu Abe, Takahiro Wakao, Eiji Sawamura, Terumasa Ehara, Katsuhiko Shirai:
Word sequence pair spotting for synchronization of speech and text in production of closed-caption TV programs for the hearing impaired. - Aileen K. Ho, John L. Bradshaw, Robert Iansek, Robin J. Alfredson:
Volume regulation in parkinsonian speech. - Eva Strangert, Mattias Heldner:
On the amount and domain of focal lengthening in Swedish. - Daniel Hirst, Corine Astésano, Albert Di Cristo:
Differential lengthening of syllabic constituents in French: the effect of accent type and speaking style. - Felix C. M. Quimbo, Tatsuya Kawahara, Shuji Doshita:
Prosodic analysis of fillers and self-repair in Japanese speech. - Jinfu Ni, Goh Kawai, Keikichi Hirose:
A synthesis-oriented model of phrasal pitch movements in standard Chinese.
SST Student Day - Poster Sessions
- Qing Guo, Fang Zheng, Jian Wu, Wenhu Wu:
Non-linear probability estimation method used in HMM for modeling frame correlation. - Shuri Kumagai:
Patterns of linguopalatal contact during Japanese vowel devoicing. - Xiao Yu, Guangrui Hu:
Speech separation based on the GMM PDF estimation. - Xiaoqiang Luo:
Growth transform of a sum of rational functions and its application in estimating HMM parameters. - Mirjam Wester, Judith M. Kessens, Helmer Strik:
Two automatic approaches for analyzing connected speech processes in Dutch. - Johan Koolwaaij, Johan de Veth:
The use of broad phonetic class models in speaker recognition. - Jorge Miquélez, Rocio Sesma, Yolanda Blanco:
Analysis and treatment of esophageal speech for the enhancement of its comprehension. - Fernando Lacunza, Yolanda Blanco:
High quality text-to-speech system in Spanish for handicapped people. - Corinna Ng, Ross Wilkinson, Justin Zobel:
Factors affecting speech retrieval. - Johan Frid:
Perception of words with vowel reduction. - Ingrid Ahmer, Robin W. King:
Automated captioning of television programs: development and analysis of a soundtrack corpus. - Fabrice Lefèvre, Claude Montacié, Marie-José Caraty:
On the influence of the delta coefficients in an HMM-based speech recognition system. - Raymond Low, Roberto Togneri:
Speech recognition using the probabilistic neural network. - Imed Zitouni:
A language model based on a hierarchical approach: m_n^v. - Michiko Watanabe:
Temporal variables in lectures in the Japanese language. - Matthew P. Aylett:
Building a statistical model of the vowel space for phoneticians. - Michelle Minnick Fox:
Computer-mediated input and the acquisition of L2 vowels. - Najam Malik, W. Harvey Holmes:
Speech analysis by subspace methods of spectral line estimation. - Petra Hansson:
Pausing in Swedish spontaneous speech. - Elisabeth Zetterholm:
Prosody and voice quality in the expression of emotions. - Julie Lunn, Alan Wrench, Janet MacKenzie Beck:
Acoustic analysis of /l/ in glossectomees.