SLT 2012: Miami, FL, USA
- 2012 IEEE Spoken Language Technology Workshop (SLT), Miami, FL, USA, December 2-5, 2012. IEEE 2012, ISBN 978-1-4673-5125-6
- Teruhisa Misu, Hideki Kashioka: Simultaneous feature selection and parameter optimization for training of dialog policy by reinforcement learning. 1-6
- Filip Jurcícek: Reinforcement learning for spoken dialogue systems using off-policy natural gradient method. 7-12
- Zhuoran Wang, Oliver Lemon: A nonparametric Bayesian approach to learning multimodal interaction management. 13-18
- Sajad Shirali-Shahreza, Gerald Penn: Realistic answer verification: An analysis of user errors in a sentence-repetition task. 19-24
- Svetlana Stoyanchev, Philipp Salletmayr, Jingbo Yang, Julia Hirschberg: Localized detection of speech recognition errors. 25-30
- Milica Gasic, Matthew Henderson, Blaise Thomson, Pirros Tsiakoulis, Steve J. Young: Policy optimisation of POMDP-based dialogue systems without state space compression. 31-36
- Blaise Thomson, Milica Gasic, Matthew Henderson, Pirros Tsiakoulis, Steve J. Young: N-best error simulation for training spoken dialogue systems. 37-42
- Manolis Perakakis, Alexandros Potamianos: Affective evaluation of a mobile multimodal dialogue system using brain signals. 43-48
- Fabrizio Morbini, Kartik Audhkhasi, Ron Artstein, Maarten Van Segbroeck, Kenji Sagae, Panayiotis G. Georgiou, David R. Traum, Shrikanth S. Narayanan: A reranking approach for recognition and classification of speech input in conversational dialogue systems. 49-54
- Jason D. Williams: A critical analysis of two statistical spoken dialog systems in public use. 55-60
- Sungjin Lee, Maxine Eskénazi: POMDP-based Let's Go system for spoken dialog challenge. 61-66
- Gina-Anne Levow, Siwei Wang: Employing boosting to compare cues to verbal feedback in multi-lingual dialog. 67-72
- William Yang Wang, Dan Bohus, Ece Kamar, Eric Horvitz: Crowdsourcing the acquisition of natural language corpora: Methods and observations. 73-78
- Kornel Laskowski: Exploiting loudness dynamics in stochastic models of turn-taking. 79-84
- Felix Stahlberg, Tim Schlippe, Stephan Vogel, Tanja Schultz: Word segmentation through cross-lingual word-to-phoneme alignment. 85-90
- Arseniy Gorin, Denis Jouvet: Class-based speech recognition using a maximum dissimilarity criterion and a tolerance classification margin. 91-96
- Nicolas Obin, Marco Liuni: On the generalization of Shannon entropy for speech recognition. 97-102
- Shuji Komeiji, Takayuki Arakawa, Takafumi Koshinaka: A noise-robust speech recognition method composed of weak noise suppression and weak Vector Taylor Series Adaptation. 103-106
- Fabian Triefenbach, Kris Demuynck, Jean-Pierre Martens: Improving large vocabulary continuous speech recognition by combining GMM-based and reservoir-based acoustic modeling. 107-112
- Atsunori Ogawa, Takaaki Hori, Atsushi Nakamura: Recognition rate estimation based on word alignment network and discriminative error type classification. 113-118
- Taehwan Kim, Karen Livescu, Gregory Shakhnarovich: American sign language fingerspelling recognition with phonological feature-based tandem models. 119-124
- Satoshi Kobashikawa, Takaaki Hori, Yoshikazu Yamaguchi, Taichi Asami, Hirokazu Masataki, Satoshi Takahashi: Efficient prior and incremental beam width control to suppress excessive speech recognition time based on score range estimation. 125-130
- Jinyu Li, Dong Yu, Jui-Ting Huang, Yifan Gong: Improving wideband speech recognition using mixed-bandwidth training data in CD-DNN-HMM. 131-136
- Cong-Thanh Do, Mohammad Javad Taghizadeh, Philip N. Garner: Combining cepstral normalization and cochlear implant-like speech processing for microphone array-based speech recognition. 137-142
- Gang Li, Huifeng Zhu, Gong Cheng, Kit Thambiratnam, Behrooz Chitsaz, Dong Yu, Frank Seide: Context-dependent Deep Neural Networks for audio indexing of real-life data. 143-148
- Yosuke Kashiwagi, Masayuki Suzuki, Nobuaki Minematsu, Keikichi Hirose: Audio-visual feature integration based on piecewise linear transformation for noise robust automatic speech recognition. 149-152
- Gopala Krishna Anumanchipalli, Luís Caldas de Oliveira, Alan W. Black: Intent transfer in speech-to-speech machine translation. 153-158
- Alex Marin, Tom Kwiatkowski, Mari Ostendorf, Luke Zettlemoyer: Using syntactic and confusion network structure for out-of-vocabulary word detection. 159-164
- Md. Akmal Haidar, Douglas D. O'Shaughnessy: Topic n-gram count language model adaptation for speech recognition. 165-169
- Naoyuki Kanda, Ryu Takeda, Yasunari Obuchi: Using rhythmic features for Japanese spoken term detection. 170-175
- Matthew Henderson, Milica Gasic, Blaise Thomson, Pirros Tsiakoulis, Kai Yu, Steve J. Young: Discriminative spoken language understanding using word confusion networks. 176-181
- Hung-yi Lee, Tsung-Hsien Wen, Lin-Shan Lee: Improved semantic retrieval of spoken content by language models enhanced with acoustic similarity graph. 182-187
- Tsung-Hsien Wen, Hung-yi Lee, Tai-Yuan Chen, Lin-Shan Lee: Personalized language modeling by crowd sourcing with social network data for voice access of cloud applications. 188-193
- Fernando García, Lluís F. Hurtado, Encarna Segarra, Emilio Sanchis, Giuseppe Riccardi: Combining multiple translation systems for Spoken Language Understanding portability. 194-198
- Ali Orkan Bayer, Giuseppe Riccardi: Joint language models for automatic speech recognition and understanding. 199-203
- Teppei Ohno, Tomoyosi Akiba: Incorporating syllable duration into line-detection-based spoken term detection. 204-209
- Li Deng, Gökhan Tür, Xiaodong He, Dilek Hakkani-Tür: Use of kernel deep convex networks and end-to-end learning for spoken language understanding. 210-215
- Asli Celikyilmaz, Dilek Hakkani-Tür, Gökhan Tür: Statistical semantic interpretation modeling for spoken language understanding with enriched semantic features. 216-221
- Timothy J. Hazen, Fred Richardson: Modeling multiword phrases with constrained phrase trees for improved topic modeling of conversational speech. 222-227
- Larry P. Heck, Dilek Hakkani-Tür: Exploiting the Semantic Web for unsupervised spoken language understanding. 228-233
- Tomás Mikolov, Geoffrey Zweig: Context dependent recurrent neural network language model. 234-239
- Florian Hinterleitner, Christoph Norrenbrock, Sebastian Möller, Ulrich Heute: What makes this voice sound so bad? A multidimensional analysis of state-of-the-art text-to-speech systems. 240-245
- Pawel Swietojanski, Arnab Ghoshal, Steve Renals: Unsupervised cross-lingual knowledge transfer in DNN-based LVCSR. 246-251
- Maria Astrinaki, Nicolas D'Alessandro, Benjamin Picart, Thomas Drugman, Thierry Dutoit: Reactive and continuous control of HMM-based speech synthesis. 252-257
- Oliver Jokisch, Yitagessu Birhanu, Rüdiger Hoffmann: Syllable-based prosodic analysis of Amharic read speech. 258-262
- David Imseng, Hervé Bourlard, Holger Caesar, Philip N. Garner, Gwénolé Lecorvé, Alexandre Nanchen: MediaParl: Bilingual mixed language accented speech database. 263-268
- Jianbo Jiang, Zhiyong Wu, Mingxing Xu, Jia Jia, Lianhong Cai: Comparison of adaptation methods for GMM-SVM based speech emotion recognition. 269-273
- Mireia Díez, Amparo Varona, Mikel Peñagarikano, Luis Javier Rodríguez-Fuentes, Germán Bordel: On the use of phone log-likelihood ratios as features in spoken language recognition. 274-279
- Marc Ferras, Hervé Bourlard: Speaker diarization and linking of large corpora. 280-285
- Adriana Stan, Peter Bell, Simon King: A grapheme-based method for automatic alignment of speech and text data. 286-290
- Benjamin Picart, Thomas Drugman, Thierry Dutoit: Statistical methods for varying the degree of articulation in new HMM-based voices. 291-296
- Éva Székely, Tamás Gábor Csapó, Bálint Tóth, Péter Mihajlik, Julie Carson-Berndsen: Synthesizing expressive speech from amateur audiobook recordings. 297-302
- Kyu Jeong Han, Jason W. Pelecanos: Frame-based phonotactic Language Identification. 303-306
- Sriram Ganapathy, Mohamed Kamal Omar, Jason W. Pelecanos: Noisy channel adaptation in language identification. 307-312
- Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki: Exemplar-based voice conversion in noisy environment. 313-317
- L. Paola García-Perera, Juan Arturo Nolazco-Flores, Bhiksha Raj, Richard M. Stern: Optimization of the DET curve in speaker verification. 318-323
- Peter Bell, Mark J. F. Gales, Pierre Lanchantin, Xunying Liu, Yanhua Long, Steve Renals, Pawel Swietojanski, Philip C. Woodland: Transcription of multi-genre media archives using out-of-domain data. 324-329
- Mohamed Bouallegue, Emmanuel Ferreira, Driss Matrouf, Georges Linarès, Maria Goudi, Pascal Nocera: Acoustic modeling for under-resourced languages based on vectorial HMM-states representation using Subspace Gaussian Mixture Models. 330-335
- Karel Veselý, Martin Karafiát, Frantisek Grézl, Milos Janda, Ekaterina Egorova: The language-independent bottleneck features. 336-341
- Stefan Ziegler, Bogdan Ludusan, Guillaume Gravier: Towards a new speech event detection approach for landmark-based speech recognition. 342-347
- João Miranda, João Paulo Neto, Alan W. Black: Recovery of acronyms, out-of-lattice words and pronunciations from parallel multilingual speech. 348-353
- Daniel Bolaños: The Bavieca open-source speech recognition toolkit. 354-359
- Udhyakumar Nallasamy, Florian Metze, Tanja Schultz: Active learning for accent adaptation in Automatic Speech Recognition. 360-365
- Kaisheng Yao, Dong Yu, Frank Seide, Hang Su, Li Deng, Yifan Gong: Adaptation of context-dependent deep neural networks for automatic speech recognition. 366-369
- Leonardo Badino, Claudia Canevari, Luciano Fadiga, Giorgio Metta: Deep-level acoustic-to-articulatory mapping for DBN-HMM based phone recognition. 370-375
- Andrew Rosenberg: Modeling intensity contours and the interaction of pitch and intensity to improve automatic prosodic event detection and classification. 376-381
- Ann Lee, James R. Glass: A comparison-based approach to mispronunciation detection. 382-387
- Mostafa Ali Shahin, Beena Ahmed, Kirrie J. Ballard: Automatic classification of unequal lexical stress patterns using machine learning algorithms. 388-391
- Korbinian Riedhammer, Martin Gropp, Elmar Nöth: The FAU Video Lecture Browser system. 392-397
- Ghada AlHarbi, Thomas Hain: Automatic transcription of academic lectures from diverse disciplines. 398-403
- Heather Friedberg, Diane J. Litman, Susannah B. F. Paletz: Lexical entrainment and success in student engineering groups. 404-409
- Sandrine Brognaux, Thomas Drugman, Richard Beaufort: Automatic detection and correction of syntax-based prosody annotation errors. 410-415
- Sandrine Brognaux, Sophie Roekhaut, Thomas Drugman, Richard Beaufort: Train&align: A new online tool for automatic phonetic alignment. 416-421
- Luiza Orosanu, Denis Jouvet, Dominique Fohr, Irina Illina, Anne Bonneau: Combining criteria for the detection of incorrect entries of non-native speech in the context of foreign language learning. 422-427
- Yi Luan, Masayuki Suzuki, Yutaka Yamauchi, Nobuaki Minematsu, Shuhei Kato, Keikichi Hirose: Performance improvement of automatic pronunciation assessment in a noisy classroom. 428-431
- Sechun Kang, Gary Geunbae Lee, Ho-Young Lee, Byeongchang Kim: An automatic pitch accent feedback system for English learners with adaptation of an English corpus spoken by Koreans. 432-437
- Meysam Asgari, Izhak Shafran, Alireza Bayestehtashk: Robust detection of voiced segments in samples of everyday conversations using unsupervised HMMs. 438-442
- Kyusong Lee, Soo-Ok Kweon, Hongsuck Seo, Gary Geunbae Lee: Generating grammar questions using corpus data in L2 learning. 443-448
- Ian Kaplan, Andrew Rosenberg: Analysis of speech transcripts to predict winners of U.S. Presidential and Vice-Presidential debates. 449-454
- Na Yang, R. Muraleedharan, J. Kohl, Ilker Demirkol, Wendi Rabiner Heinzelman, Melissa Sturge-Apple: Speech-based emotion classification using multiclass SVM with hybrid kernel and thresholding fusion. 455-460
- Yun-Nung Chen, Florian Metze: Two-layer mutually reinforced random walk for improved multi-party meeting summarization. 461-466
- Anthony McCallum, Gerald Penn, Cosmin Munteanu, Xiaodan Zhu: Ecological validity and the evaluation of speech summarization quality. 467-472
- Tongmu Zhao, Akemi Hoshino, Masayuki Suzuki, Nobuaki Minematsu, Keikichi Hirose: Automatic Chinese pronunciation error detection using SVM trained with structural features. 473-478
- Deana Pennell, Yang Liu: Evaluating the effect of normalizing informal text on TTS output. 479-483