Hsin-Min Wang
Person information
- affiliation: Academia Sinica, Taipei, Taiwan
- affiliation (PhD 1995): National Taiwan University, Taipei, Taiwan
2020 – today
- 2024
- [c245]Hsuan-Fu Wang, Yi-Jen Shih, Heng-Jui Chang, Layne Berry, Puyuan Peng, Hung-Yi Lee, Hsin-Min Wang, David Harwath:
SpeechCLIP+: Self-Supervised Multi-Task Representation Learning for Speech Via Clip and Speech-Image Data. ICASSP Workshops 2024: 465-469 - [c244]Ryandhimas E. Zezario, Bo-Ren Brian Bai, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao:
Multi-Task Pseudo-Label Learning for Non-Intrusive Speech Quality Assessment Model. ICASSP 2024: 831-835 - [c243]Ryandhimas E. Zezario, Yu-Wen Chen, Szu-Wei Fu, Yu Tsao, Hsin-Min Wang, Chiou-Shann Fuh:
A Study On Incorporating Whisper For Robust Speech Assessment. ICME 2024: 1-6 - [i84]Dyah A. M. G. Wisnu, Epri W. Pratiwi, Stefano Rini, Ryandhimas E. Zezario, Hsin-Min Wang, Yu Tsao:
HAAQI-Net: A non-intrusive neural music quality assessment model for hearing aids. CoRR abs/2401.01145 (2024) - [i83]Hsuan-Fu Wang, Yi-Jen Shih, Heng-Jui Chang, Layne Berry, Puyuan Peng, Hung-yi Lee, Hsin-Min Wang, David Harwath:
SpeechCLIP+: Self-supervised multi-task representation learning for speech via CLIP and speech-image data. CoRR abs/2402.06959 (2024) - [i82]Ammarah Hashmi, Sahibzada Adil Shahzad, Chia-Wen Lin, Yu Tsao, Hsin-Min Wang:
Unmasking Illusions: Understanding Human Perception of Audiovisual Deepfakes. CoRR abs/2405.04097 (2024) - [i81]Chun Yin, Tai-Shih Chi, Yu Tsao, Hsin-Min Wang:
SVSNet+: Enhancing Speaker Voice Similarity Assessment Models with Representations from Speech Foundation Models. CoRR abs/2406.08445 (2024) - [i80]Chien-Chun Wang, Li-Wei Chen, Hung-Shin Lee, Berlin Chen, Hsin-Min Wang:
Effective Noise-aware Data Simulation for Domain-adaptive Speech Enhancement Leveraging Dynamic Stochastic Perturbation. CoRR abs/2409.01545 (2024) - [i79]Wen-Chin Huang, Szu-Wei Fu, Erica Cooper, Ryandhimas E. Zezario, Tomoki Toda, Hsin-Min Wang, Junichi Yamagishi, Yu Tsao:
The VoiceMOS Challenge 2024: Beyond Speech Quality Prediction. CoRR abs/2409.07001 (2024) - [i78]Yao-Fei Cheng, Li-Wei Chen, Hung-Shin Lee, Hsin-Min Wang:
Exploring the Impact of Data Quantity on ASR in Extremely Low-resource Languages. CoRR abs/2409.08872 (2024) - [i77]Ryandhimas E. Zezario, Sabato Marco Siniscalchi, Hsin-Min Wang, Yu Tsao:
A Study on Zero-shot Non-intrusive Speech Assessment using Large Language Models. CoRR abs/2409.09914 (2024) - [i76]Wenze Ren, Haibin Wu, Yi-Cheng Lin, Xuanjun Chen, Rong Chao, Kuo-Hsuan Hung, You-Jin Li, Wen-Yuan Ting, Hsin-Min Wang, Yu Tsao:
Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement. CoRR abs/2409.10376 (2024) - [i75]Chien-Chun Wang, Li-Wei Chen, Cheng-Kang Chou, Hung-Shin Lee, Berlin Chen, Hsin-Min Wang:
Channel-Aware Domain-Adaptive Generative Adversarial Network for Robust Speech Recognition. CoRR abs/2409.12386 (2024) - [i74]Wenze Ren, Kuo-Hsuan Hung, Rong Chao, You-Jin Li, Hsin-Min Wang, Yu Tsao:
Robust Audio-Visual Speech Enhancement: Correcting Misassignments in Complex Environments with Advanced Post-Processing. CoRR abs/2409.14554 (2024)
- 2023
- [j73]Chin-Yi Cheng, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang:
Multi-Target Extractor and Detector for Unknown-Number Speaker Diarization. IEEE Signal Process. Lett. 30: 638-642 (2023) - [j72]Ryandhimas E. Zezario, Szu-Wei Fu, Fei Chen, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao:
Deep Learning-Based Non-Intrusive Multi-Objective Speech Assessment Model With Cross-Domain Features. IEEE ACM Trans. Audio Speech Lang. Process. 31: 54-70 (2023) - [j71]Qian-Bei Hong, Chung-Hsien Wu, Hsin-Min Wang:
Generalization Ability Improvement of Speaker Representation and Anti-Interference for Speaker Verification. IEEE ACM Trans. Audio Speech Lang. Process. 31: 486-499 (2023) - [j70]Qian-Bei Hong, Chung-Hsien Wu, Hsin-Min Wang:
Decomposition and Reorganization of Phonetic Information for Speaker Embedding Learning. IEEE ACM Trans. Audio Speech Lang. Process. 31: 1745-1757 (2023) - [c242]Erica Cooper, Wen-Chin Huang, Yu Tsao, Hsin-Min Wang, Tomoki Toda, Junichi Yamagishi:
The Voicemos Challenge 2023: Zero-Shot Subjective Speech Quality Prediction for Multiple Domains. ASRU 2023: 1-7 - [c241]Chi-Chang Lee, Hong-Wei Chen, Chu-Song Chen, Hsin-Min Wang, Tsung-Te Liu, Yu Tsao:
LC4SV: A Denoising Framework Learning to Compensate for Unseen Speaker Verification Models. ASRU 2023: 1-8 - [c240]Chi-Chang Lee, Yu Tsao, Hsin-Min Wang, Chu-Song Chen:
D4AM: A General Denoising Framework for Downstream Acoustic Models. ICLR 2023 - [c239]Li-Wei Chen, Yao-Fei Cheng, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang:
A Training and Inference Strategy Using Noisy and Enhanced Speech as Target for Speech Enhancement without Clean Speech. INTERSPEECH 2023: 2473-2477 - [c238]Hsin-Hao Chen, Yung-Lun Chien, Ming-Chi Yen, Shu-Wei Tsai, Tai-Shih Chi, Hsin-Min Wang, Yu Tsao:
Mandarin Electrolaryngeal Speech Voice Conversion using Cross-domain Features. INTERSPEECH 2023: 5018-5022 - [c237]Yung-Lun Chien, Hsin-Hao Chen, Ming-Chi Yen, Shu-Wei Tsai, Hsin-Min Wang, Yu Tsao, Tai-Shih Chi:
Audio-Visual Mandarin Electrolaryngeal Speech Voice Conversion. INTERSPEECH 2023: 5023-5026 - [i73]Yu-Wen Chen, Hsin-Min Wang, Yu Tsao:
BASPRO: a balanced script producer for speech corpus collection based on the genetic algorithm. CoRR abs/2301.04120 (2023) - [i72]Yung-Lun Chien, Hsin-Hao Chen, Ming-Chi Yen, Shu-Wei Tsai, Hsin-Min Wang, Yu Tsao, Tai-Shih Chi:
Audio-Visual Mandarin Electrolaryngeal Speech Voice Conversion. CoRR abs/2306.06652 (2023) - [i71]Hsin-Hao Chen, Yung-Lun Chien, Ming-Chi Yen, Shu-Wei Tsai, Yu Tsao, Tai-Shih Chi, Hsin-Min Wang:
Mandarin Electrolaryngeal Speech Voice Conversion using Cross-domain Features. CoRR abs/2306.06653 (2023) - [i70]Ryandhimas E. Zezario, Bo-Ren Brian Bai, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao:
Multi-Task Pseudo-Label Learning for Non-Intrusive Speech Quality Assessment Model. CoRR abs/2308.09262 (2023) - [i69]Ryandhimas E. Zezario, Fei Chen, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao:
Utilizing Whisper to Enhance Multi-Branched Speech Intelligibility Prediction Model for Hearing Aids. CoRR abs/2309.09548 (2023) - [i68]Shafique Ahmed, Chia-Wei Chen, Wenze Ren, Chin-Jou Li, Ernie Chu, Jun-Cheng Chen, Amir Hussain, Hsin-Min Wang, Yu Tsao, Jen-Cheng Hou:
Deep Complex U-Net with Conformer for Audio-Visual Speech Enhancement. CoRR abs/2309.11059 (2023) - [i67]Ryandhimas E. Zezario, Yu-Wen Chen, Szu-Wei Fu, Yu Tsao, Hsin-Min Wang, Chiou-Shann Fuh:
A Study on Incorporating Whisper for Robust Speech Assessment. CoRR abs/2309.12766 (2023) - [i66]Ammarah Hashmi, Sahibzada Adil Shahzad, Chia-Wen Lin, Yu Tsao, Hsin-Min Wang:
AVTENet: Audio-Visual Transformer-based Ensemble Network Exploiting Multiple Experts for Video Deepfake Detection. CoRR abs/2310.13103 (2023) - [i65]Sahibzada Adil Shahzad, Ammarah Hashmi, Yan-Tsung Peng, Yu Tsao, Hsin-Min Wang:
AV-Lip-Sync+: Leveraging AV-HuBERT to Exploit Multimodal Inconsistency for Video Deepfake Detection. CoRR abs/2311.02733 (2023) - [i64]Hsin-Tien Chiang, Szu-Wei Fu, Hsin-Min Wang, Yu Tsao, John H. L. Hansen:
Multi-objective Non-intrusive Hearing-aid Speech Assessment Model. CoRR abs/2311.08878 (2023) - [i63]Chi-Chang Lee, Yu Tsao, Hsin-Min Wang, Chu-Song Chen:
D4AM: A General Denoising Framework for Downstream Acoustic Models. CoRR abs/2311.16595 (2023) - [i62]Chi-Chang Lee, Hong-Wei Chen, Chu-Song Chen, Hsin-Min Wang, Tsung-Te Liu, Yu Tsao:
LC4SV: A Denoising Framework Learning to Compensate for Unseen Speaker Verification Models. CoRR abs/2311.16604 (2023)
- 2022
- [j69]Cheng-Hung Hu, Yu-Huai Peng, Junichi Yamagishi, Yu Tsao, Hsin-Min Wang:
SVSNet: An End-to-End Speaker Voice Similarity Assessment Model. IEEE Signal Process. Lett. 29: 767-771 (2022) - [j68]Shang-Yi Chuang, Hsin-Min Wang, Yu Tsao:
Improved Lite Audio-Visual Speech Enhancement. IEEE ACM Trans. Audio Speech Lang. Process. 30: 1345-1359 (2022) - [c236]Kuan-Chen Wang, Kai-Chun Liu, Hsin-Min Wang, Yu Tsao:
EMGSE: Acoustic/EMG Fusion for Multimodal Speech Enhancement. ICASSP 2022: 1116-1120 - [c235]Haibin Wu, Heng-Cheng Kuo, Naijun Zheng, Kuo-Hsuan Hung, Hung-Yi Lee, Yu Tsao, Hsin-Min Wang, Helen Meng:
Partially Fake Audio Detection by Self-Attention-Based Fake Span Discovery. ICASSP 2022: 9236-9240 - [c234]Chi-Chang Lee, Cheng-Hung Hu, Yu-Chen Lin, Chu-Song Chen, Hsin-Min Wang, Yu Tsao:
NASTAR: Noise Adaptive Speech Enhancement with Target-Conditional Resampling. INTERSPEECH 2022: 1183-1187 - [c233]Hung-Shin Lee, Pin-Tuan Huang, Yao-Fei Cheng, Hsin-Min Wang:
Chain-based Discriminative Autoencoders for Speech Recognition. INTERSPEECH 2022: 2078-2082 - [c232]Ryandhimas Edo Zezario, Fei Chen, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao:
MBI-Net: A Non-Intrusive Multi-Branched Speech Intelligibility Prediction Model for Hearing Aids. INTERSPEECH 2022: 3944-3948 - [c231]Wen-Chin Huang, Erica Cooper, Yu Tsao, Hsin-Min Wang, Tomoki Toda, Junichi Yamagishi:
The VoiceMOS Challenge 2022. INTERSPEECH 2022: 4536-4540 - [c230]Fan-Lin Wang, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang:
Disentangling the Impacts of Language and Channel Variability on Speech Separation Networks. INTERSPEECH 2022: 5343-5347 - [c229]Ryandhimas Edo Zezario, Szu-Wei Fu, Fei Chen, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao:
MTI-Net: A Multi-Target Speech Intelligibility Prediction Model. INTERSPEECH 2022: 5463-5467 - [c228]Hung-Shin Lee, Pin-Yuan Chen, Yao-Fei Cheng, Yu Tsao, Hsin-Min Wang:
Speech-enhanced and Noise-aware Networks for Robust Speech Recognition. ISCSLP 2022: 145-149 - [c227]Shang-Bao Luo, Cheng-Chung Fan, Kuan-Yu Chen, Yu Tsao, Hsin-Min Wang, Keh-Yih Su:
Chinese Movie Dialogue Question Answering Dataset. ROCLING 2022: 7-14 - [c226]Aleksandra Smolka, Hsin-Min Wang, Jason S. Chang, Keh-Yih Su:
Is Character Trigram Overlapping Ratio Still the Best Similarity Measure for Aligning Sentences in a Paraphrased Corpus? ROCLING 2022: 49-60 - [i61]Kuan-Chen Wang, Kai-Chun Liu, Hsin-Min Wang, Yu Tsao:
EMGSE: Acoustic/EMG Fusion for Multimodal Speech Enhancement. CoRR abs/2202.06507 (2022) - [i60]Haibin Wu, Heng-Cheng Kuo, Naijun Zheng, Kuo-Hsuan Hung, Hung-Yi Lee, Yu Tsao, Hsin-Min Wang, Helen Meng:
Partially Fake Audio Detection by Self-attention-based Fake Span Discovery. CoRR abs/2202.06684 (2022) - [i59]Wen-Chin Huang, Erica Cooper, Yu Tsao, Hsin-Min Wang, Tomoki Toda, Junichi Yamagishi:
The VoiceMOS Challenge 2022. CoRR abs/2203.11389 (2022) - [i58]Hung-Shin Lee, Pin-Tuan Huang, Yao-Fei Cheng, Hsin-Min Wang:
Chain-based Discriminative Autoencoders for Speech Recognition. CoRR abs/2203.13687 (2022) - [i57]Hung-Shin Lee, Pin-Yuan Chen, Yu Tsao, Hsin-Min Wang:
Speech-enhanced and Noise-aware Networks for Robust Speech Recognition. CoRR abs/2203.13696 (2022) - [i56]Hung-Shin Lee, Yu Tsao, Shyh-Kang Jeng, Hsin-Min Wang:
Subspace-based Representation and Learning for Phonotactic Spoken Language Recognition. CoRR abs/2203.15576 (2022) - [i55]Chin-Yi Cheng, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang:
Multi-Target Filter and Detector for Speaker Diarization. CoRR abs/2203.16007 (2022) - [i54]Fan-Lin Wang, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang:
Disentangling the Impacts of Language and Channel Variability on Speech Separation Networks. CoRR abs/2203.16040 (2022) - [i53]Yu-Huai Peng, Hung-Shin Lee, Pin-Tuan Huang, Hsin-Min Wang:
Generation of Speaker Representations Using Heterogeneous Training Batch Assembly. CoRR abs/2203.16646 (2022) - [i52]Chiang-Lin Tai, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang:
Filter-based Discriminative Autoencoders for Children Speech Recognition. CoRR abs/2204.00164 (2022) - [i51]Ryandhimas E. Zezario, Fei Chen, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao:
MBI-Net: A Non-Intrusive Multi-Branched Speech Intelligibility Prediction Model for Hearing Aids. CoRR abs/2204.03305 (2022) - [i50]Ryandhimas E. Zezario, Szu-Wei Fu, Fei Chen, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao:
MTI-Net: A Multi-Target Speech Intelligibility Prediction Model. CoRR abs/2204.03310 (2022) - [i49]Shih-Kuang Lee, Yu Tsao, Hsin-Min Wang:
A Study of Using Cepstrogram for Countermeasure Against Replay Attacks. CoRR abs/2204.04333 (2022) - [i48]Chi-Chang Lee, Cheng-Hung Hu, Yu-Chen Lin, Chu-Song Chen, Hsin-Min Wang, Yu Tsao:
NASTAR: Noise Adaptive Speech Enhancement with Target-Conditional Resampling. CoRR abs/2206.09058 (2022) - [i47]Yin-Ping Cho, Yu Tsao, Hsin-Min Wang, Yi-Wen Liu:
Mandarin Singing Voice Synthesis with Denoising Diffusion Probabilistic Wasserstein GAN. CoRR abs/2209.10446 (2022) - [i46]Li-Wei Chen, Yao-Fei Cheng, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang:
A Teacher-student Framework for Unsupervised Speech Enhancement Using Noise Remixing Training and Two-stage Inference. CoRR abs/2210.15368 (2022) - [i45]Fan-Lin Wang, Yao-Fei Cheng, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang:
CasNet: Investigating Channel Robustness for Speech Separation. CoRR abs/2210.15370 (2022)
- 2021
- [j67]Wen-Li Wei, Jen-Chun Lin, Tyng-Luh Liu, Hsiao-Rong Tyan, Hsin-Min Wang, Hong-Yuan Mark Liao:
Learning to Visualize Music Through Shot Sequence for Automatic Concert Video Mashup. IEEE Trans. Multim. 23: 1731-1743 (2021) - [c225]Shih-hung Tsai, Chao-Chun Liang, Hsin-Min Wang, Keh-Yih Su:
Sequence to General Tree: Knowledge-Guided Geometry Word Problem Solving. ACL/IJCNLP (2) 2021: 964-972 - [c224]Qian-Bei Hong, Chung-Hsien Wu, Thanh Binh Nguyen, Hsin-Min Wang:
Improvement of Spatial Ambiguity in Multi-Channel Speech Separation Using Channel Attention. APSIPA ASC 2021: 619-623 - [c223]Yu-Huai Peng, Hung-Shin Lee, Pin-Tuan Huang, Hsin-Min Wang:
Generation of Speaker Representations Using Heterogeneous Training Batch Assembly. APSIPA ASC 2021: 719-724 - [c222]Yi-Syuan Liou, Wen-Chin Huang, Ming-Chi Yen, Shu-Wei Tsai, Yu-Huai Peng, Tomoki Toda, Yu Tsao, Hsin-Min Wang:
Time Alignment using Lip Images for Frame-based Electrolaryngeal Voice Conversion. APSIPA ASC 2021: 1234-1238 - [c221]Ming-Chi Yen, Wen-Chin Huang, Kazuhiro Kobayashi, Yu-Huai Peng, Shu-Wei Tsai, Yu Tsao, Tomoki Toda, Jyh-Shing Roger Jang, Hsin-Min Wang:
Mandarin Electrolaryngeal Speech Voice Conversion with Sequence-to-Sequence Modeling. ASRU 2021: 650-657 - [c220]Hsin-Tien Chiang, Yi-Chiao Wu, Cheng Yu, Tomoki Toda, Hsin-Min Wang, Yih-Chun Hu, Yu Tsao:
HASA-Net: A Non-Intrusive Hearing-Aid Speech Assessment Network. ASRU 2021: 907-913 - [c219]Ryandhimas E. Zezario, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao:
Speech Enhancement with Zero-Shot Model Selection. EUSIPCO 2021: 491-495 - [c218]Chung-En Sun, Yi-Wei Chen, Hung-Shin Lee, Yen-Hsing Chen, Hsin-Min Wang:
Melody Harmonization Using Orderless Nade, Chord Balancing, and Blocked Gibbs Sampling. ICASSP 2021: 4145-4149 - [c217]Wen-Chin Huang, Chia-Hua Wu, Shang-Bao Luo, Kuan-Yu Chen, Hsin-Min Wang, Tomoki Toda:
Speech Recognition by Simply Fine-Tuning Bert. ICASSP 2021: 7343-7347 - [c216]Wen-Chin Huang, Kazuhiro Kobayashi, Yu-Huai Peng, Ching-Feng Liu, Yu Tsao, Hsin-Min Wang, Tomoki Toda:
A Preliminary Study of a Two-Stage Paradigm for Preserving Speaker Identity in Dysarthric Voice Conversion. Interspeech 2021: 1329-1333 - [c215]Yao-Fei Cheng, Hung-Shin Lee, Hsin-Min Wang:
AlloST: Low-Resource Speech Translation Without Source Transcription. Interspeech 2021: 2252-2256 - [c214]Fan-Lin Wang, Yu-Huai Peng, Hung-Shin Lee, Hsin-Min Wang:
Dual-Path Filter Network: Speaker-Aware Modeling for Speech Separation. Interspeech 2021: 3061-3065 - [c213]Yi-Chiao Wu, Cheng-Hung Hu, Hung-Shin Lee, Yu-Huai Peng, Wen-Chin Huang, Yu Tsao, Hsin-Min Wang, Tomoki Toda:
Relational Data Selection for Data Augmentation of Speaker-Dependent Multi-Band MelGAN Vocoder. Interspeech 2021: 3630-3634 - [c212]Yu-Tao Chang, Yuan-Hong Yang, Yu-Huai Peng, Syu-Siang Wang, Tai-Shih Chi, Yu Tsao, Hsin-Min Wang:
MoEVC: A Mixture of Experts Voice Conversion System With Sparse Gating Mechanism for Online Computation Acceleration. ISCSLP 2021: 1-5 - [c211]Yi-Wei Chen, Hung-Shin Lee, Yen-Hsing Chen, Hsin-Min Wang:
SurpriseNet: Melody Harmonization Conditioning on User-controlled Surprise Contours. ISMIR 2021: 105-112 - [c210]Md Mahbub E. Noor, Yen-Ju Lu, Syu-Siang Wang, Supratip Ghose, Chia-Yu Chang, Ryandhimas E. Zezario, Shafique Ahmed, Wei-Ho Chung, Yu Tsao, Hsin-Min Wang:
Investigation of a Single-Channel Frequency-Domain Speech Enhancement Network to Improve End-to-End Bengali Automatic Speech Recognition Under Unseen Noisy Conditions. O-COCOSDA 2021: 7-12 - [c209]Cheng-Chung Fan, Chia-Chih Kuo, Shang-Bao Luo, Pei-Jun Liao, Kuang-Yu Chang, Chiao-Wei Hsu, Meng-Tse Wu, Shih-Hong Tsai, Tzu-Man Wu, Aleksandra Smolka, Chao-Chun Liang, Hsin-Min Wang, Kuan-Yu Chen, Yu Tsao, Keh-Yih Su:
A Flexible and Extensible Framework for Multiple Answer Modes Question Answering. ROCLING 2021: 33-42 - [c208]Shih-hung Tsai, Chao-Chun Liang, Hsin-Min Wang, Keh-Yih Su:
Mining Commonsense and Domain Knowledge from Math Word Problems. ROCLING 2021: 111-117 - [i44]Wen-Chin Huang, Chia-Hua Wu, Shang-Bao Luo, Kuan-Yu Chen, Hsin-Min Wang, Tomoki Toda:
Speech Recognition by Simply Fine-tuning BERT. CoRR abs/2102.00291 (2021) - [i43]Cheng-Hung Hu, Yi-Chiao Wu, Wen-Chin Huang, Yu-Huai Peng, Yu-Wen Chen, Pin-Jui Ku, Tomoki Toda, Yu Tsao, Hsin-Min Wang:
The AS-NU System for the M2VoC Challenge. CoRR abs/2104.03009 (2021) - [i42]Yao-Fei Cheng, Hung-Shin Lee, Hsin-Min Wang:
AlloST: Low-resource Speech Translation without Source Transcription. CoRR abs/2105.00171 (2021) - [i41]Shih-hung Tsai, Chao-Chun Liang, Hsin-Min Wang, Keh-Yih Su:
Sequence to General Tree: Knowledge-Guided Geometry Word Problem Solving. CoRR abs/2106.00990 (2021) - [i40]Wen-Chin Huang, Kazuhiro Kobayashi, Yu-Huai Peng, Ching-Feng Liu, Yu Tsao, Hsin-Min Wang, Tomoki Toda:
A Preliminary Study of a Two-Stage Paradigm for Preserving Speaker Identity in Dysarthric Voice Conversion. CoRR abs/2106.01415 (2021) - [i39]Cheng-Hung Hu, Yu-Huai Peng, Junichi Yamagishi, Yu Tsao, Hsin-Min Wang:
SVSNet: An End-to-end Speaker Voice Similarity Assessment Model. CoRR abs/2107.09392 (2021) - [i38]Yi-Wei Chen, Hung-Shin Lee, Yen-Hsing Chen, Hsin-Min Wang:
SurpriseNet: Melody Harmonization Conditioning on User-controlled Surprise Contours. CoRR abs/2108.00378 (2021) - [i37]Yi-Syuan Liou, Wen-Chin Huang, Ming-Chi Yen, Shu-Wei Tsai, Yu-Huai Peng, Tomoki Toda, Yu Tsao, Hsin-Min Wang:
Time Alignment using Lip Images for Frame-based Electrolaryngeal Voice Conversion. CoRR abs/2109.03551 (2021) - [i36]Yun-Ju Chan, Chiang-Jen Peng, Syu-Siang Wang, Hsin-Min Wang, Yu Tsao, Tai-Shih Chi:
Speech Enhancement-assisted Stargan Voice Conversion in Noisy Environments. CoRR abs/2110.09923 (2021) - [i35]Ryandhimas E. Zezario, Szu-Wei Fu, Fei Chen, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao:
Deep Learning-based Non-Intrusive Multi-Objective Speech Assessment Model with Cross-Domain Features. CoRR abs/2111.02363 (2021) - [i34]Hsin-Tien Chiang, Yi-Chiao Wu, Cheng Yu, Tomoki Toda, Hsin-Min Wang, Yih-Chun Hu, Yu Tsao:
HASA-net: A non-intrusive hearing-aid speech assessment network. CoRR abs/2111.05691 (2021)
- 2020
- [j66]Xin Wang, Junichi Yamagishi, Massimiliano Todisco, Héctor Delgado, Andreas Nautsch, Nicholas W. D. Evans, Md. Sahidullah, Ville Vestman, Tomi Kinnunen, Kong Aik Lee, Lauri Juvela, Paavo Alku, Yu-Huai Peng, Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Sébastien Le Maguer, Markus Becker, Zhen-Hua Ling:
ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech. Comput. Speech Lang. 64: 101114 (2020) - [j65]Tsun-An Hsieh, Hsin-Min Wang, Xugang Lu, Yu Tsao:
WaveCRN: An Efficient Convolutional Recurrent Neural Network for End-to-End Speech Enhancement. IEEE Signal Process. Lett. 27: 2149-2153 (2020) - [j64]Chang-Le Liu, Sze-Wei Fu, You-Jin Li, Jen-Wei Huang, Hsin-Min Wang, Yu Tsao:
Multichannel Speech Enhancement by Raw Waveform-Mapping Using Fully Convolutional Networks. IEEE ACM Trans. Audio Speech Lang. Process. 28: 1888-1900 (2020) - [j63]Cheng Yu, Ryandhimas E. Zezario, Syu-Siang Wang, Jonathan Sherman, Yi-Yen Hsieh, Xugang Lu, Hsin-Min Wang, Yu Tsao:
Speech Enhancement Based on Denoising Autoencoder With Multi-Branched Encoders. IEEE ACM Trans. Audio Speech Lang. Process. 28: 2756-2769 (2020) - [j62]Hung-Shin Lee, Yu Tsao, Shyh-Kang Jeng, Hsin-Min Wang:
Subspace-Based Representation and Learning for Phonotactic Spoken Language Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 28: 3065-3079 (2020) - [j61]Wen-Chin Huang, Hao Luo, Hsin-Te Hwang, Chen-Chou Lo, Yu-Huai Peng, Yu Tsao, Hsin-Min Wang:
Unsupervised Representation Disentanglement Using Cross Domain Features and Adversarial Learning in Variational Autoencoder Based Voice Conversion. IEEE Trans. Emerg. Top. Comput. Intell. 4(4): 468-479 (2020) - [c207]Ryandhimas E. Zezario, Szu-Wei Fu, Chiou-Shann Fuh, Yu Tsao, Hsin-Min Wang:
STOI-Net: A Deep Learning based Non-Intrusive Speech Intelligibility Assessment Model. APSIPA 2020: 482-486 - [c206]Yu-Huai Peng, Cheng-Hung Hu, Alexander Chao-Fu Kang, Hung-Shin Lee, Pin-Yuan Chen, Yu Tsao, Hsin-Min Wang:
The Academia Sinica Systems of Voice Conversion for VCC2020. Blizzard Challenge / Voice Conversion Challenge 2020 - [c205]Hao Yen, Pin-Jui Ku, Ming-Chi Yen, Hung-Shin Lee, Hsin-Min Wang:
Joint Training of Guided Learning and Mean Teacher Models for Sound Event Detection. DCASE 2020: 235-239 - [c204]Ryandhimas E. Zezario, Tassadaq Hussain, Xugang Lu, Hsin-Min Wang, Yu Tsao:
Self-Supervised Denoising Autoencoder with Linear Regression Decoder for Speech Enhancement. ICASSP 2020: 6669-6673 - [c203]Qian-Bei Hong, Chung-Hsien Wu, Hsin-Min Wang, Chien-Lin Huang:
Statistics Pooling Time Delay Neural Network Based on X-Vector for Speaker Verification. ICASSP 2020: 6849-6853 - [c202]Qian-Bei Hong, Chung-Hsien Wu, Hsin-Min Wang, Chien-Lin Huang:
Combining Deep Embeddings of Acoustic and Articulatory Features for Speaker Identification. ICASSP 2020: 7589-7593 - [c201]Shang-Yi Chuang, Yu Tsao, Chen-Chou Lo, Hsin-Min Wang:
Lite Audio-Visual Speech Enhancement. INTERSPEECH 2020: 1131-1135 - [c200]Chi-Chang Lee, Yu-Chen Lin, Hsuan-Tien Lin, Hsin-Min Wang, Yu Tsao:
SERIL: Noise Adaptive Speech Enhancement Using Regularization-Based Incremental Learning. INTERSPEECH 2020: 2432-2436 - [c199]Pin-Yuan Chen, Chia-Hua Wu, Hung-Shin Lee, Shao-Kang Tsao, Ming-Tat Ko, Hsin-Min Wang:
Using Taigi Dramas with Mandarin Chinese Subtitles to Improve Taigi Speech Recognition. O-COCOSDA 2020: 71-76 - [i33]Cheng Yu, Ryandhimas E. Zezario, Jonathan Sherman, Yi-Yen Hsieh, Xugang Lu, Hsin-Min Wang, Yu Tsao:
Speech Enhancement based on Denoising Autoencoder with Multi-branched Encoders. CoRR abs/2001.01538 (2020) - [i32]Wen-Chin Huang, Hao Luo, Hsin-Te Hwang, Chen-Chou Lo, Yu-Huai Peng, Yu Tsao, Hsin-Min Wang:
Unsupervised Representation Disentanglement using Cross Domain Features and Adversarial Learning in Variational Autoencoder based Voice Conversion. CoRR abs/2001.07849 (2020) - [i31]Tsun-An Hsieh, Hsin-Min Wang, Xugang Lu, Yu Tsao:
WaveCRN: An Efficient Convolutional Recurrent Neural Network for End-to-end Speech Enhancement. CoRR abs/2004.04098 (2020) - [i30]Chi-Chang Lee, Yu-Chen Lin, Hsuan-Tien Lin, Hsin-Min Wang, Yu Tsao:
SERIL: Noise Adaptive Speech Enhancement using Regularization-based Incremental Learning. CoRR abs/2005.11760 (2020) - [i29]Shang-Yi Chuang, Yu Tsao, Chen-Chou Lo, Hsin-Min Wang:
Lite Audio-Visual Speech Enhancement. CoRR abs/2005.11769 (2020) - [i28]Shang-Yi Chuang, Hsin-Min Wang, Yu Tsao:
Improved Lite Audio-Visual Speech Enhancement. CoRR abs/2008.13222 (2020) - [i27]Yu-Huai Peng, Cheng-Hung Hu, Alexander Chao-Fu Kang, Hung-Shin Lee, Pin-Yuan Chen, Yu Tsao, Hsin-Min Wang:
The Academia Sinica Systems of Voice Conversion for VCC2020. CoRR abs/2010.02669 (2020) - [i26]Chung-En Sun, Yi-Wei Chen, Hung-Shin Lee, Yen-Hsing Chen, Hsin-Min Wang:
Melody Harmonization Using Orderless NADE, Chord Balancing, and Blocked Gibbs Sampling. CoRR abs/2010.13468 (2020) - [i25]Ryandhimas E. Zezario, Szu-Wei Fu, Chiou-Shann Fuh, Yu Tsao, Hsin-Min Wang:
STOI-Net: A Deep Learning based Non-Intrusive Speech Intelligibility Assessment Model. CoRR abs/2011.04292 (2020) - [i24]Ryandhimas E. Zezario, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao:
Speech Enhancement with Zero-Shot Model Selection. CoRR abs/2012.09359 (2020)
2010 – 2019
- 2019
- [c198]Hsiao-Tzu Hung, Chung-Yang Wang, Yi-Hsuan Yang, Hsin-Min Wang:
Improving Automatic Jazz Melody Generation by Transfer Learning Techniques. APSIPA 2019: 339-346 - [c197]Tassadaq Hussain, Yu Tsao, Hsin-Min Wang, Jia-Ching Wang, Sabato Marco Siniscalchi, Wen-Hung Liao:
Compressed Multimodal Hierarchical Extreme Learning Machine for Speech Enhancement. APSIPA 2019: 678-683 - [c196]Qian-Bei Hong, Chung-Hsien Wu, Ming-Hsiang Su, Hsin-Min Wang:
Sequential Speaker Embedding and Transfer Learning for Text-Independent Speaker Identification. APSIPA 2019: 827-832 - [c195]Yueh-Ting Lee, Xuan-Bo Chen, Hung-Shin Lee, Jyh-Shing Roger Jang, Hsin-Min Wang:
Multi-task Learning for Acoustic Modeling Using Articulatory Attributes. APSIPA 2019: 855-861 - [c194]Wei-Cheng Lin, Yu Tsao, Fei Chen, Hsin-Min Wang:
Investigation of Neural Network Approaches for Unified Spectral and Prosodic Feature Enhancement. APSIPA 2019: 1179-1184 - [c193]Shang-Bao Luo, Hung-Shin Lee, Kuan-Yu Chen, Hsin-Min Wang:
Spoken Multiple-Choice Question Answering Using Multimodal Convolutional Neural Networks. ASRU 2019: 772-778 - [c192]Wen-Chin Huang, Yi-Chiao Wu, Hsin-Te Hwang, Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda, Yu Tsao, Hsin-Min Wang:
Refined WaveNet Vocoder for Variational Autoencoder Based Voice Conversion. EUSIPCO 2019: 1-5 - [c191]Tassadaq Hussain, Yu Tsao, Hsin-Min Wang, Jia-Ching Wang, Sabato Marco Siniscalchi, Wen-Hung Liao:
Audio-Visual Speech Enhancement using Hierarchical Extreme Learning Machine. EUSIPCO 2019: 1-5 - [c190]Yih-Liang Shen, Chao-Yuan Huang, Syu-Siang Wang, Yu Tsao, Hsin-Min Wang, Tai-Shih Chi:
Reinforcement Learning Based Speech Enhancement for Robust Speech Recognition. ICASSP 2019: 6750-6754 - [c189]Wen-Chin Huang, Yi-Chiao Wu, Chen-Chou Lo, Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda, Yu Tsao, Hsin-Min Wang:
Investigation of F0 Conditioning and Fully Convolutional Networks in Variational Autoencoder Based Voice Conversion. INTERSPEECH 2019: 709-713 - [c188]Chen-Chou Lo, Szu-Wei Fu, Wen-Chin Huang, Xin Wang, Junichi Yamagishi, Yu Tsao, Hsin-Min Wang:
MOSNet: Deep Learning-Based Objective Assessment for Voice Conversion. INTERSPEECH 2019: 1541-1545 - [c187]Pin-Tuan Huang, Hung-Shin Lee, Syu-Siang Wang, Kuan-Yu Chen, Yu Tsao, Hsin-Min Wang:
Exploring the Encoder Layers of Discriminative Autoencoders for LVCSR. INTERSPEECH 2019: 1631-1635 - [c186]Chien-Feng Liao, Yu Tsao, Hung-yi Lee, Hsin-Min Wang:
Noise Adaptive Speech Enhancement Using Domain Adversarial Training. INTERSPEECH 2019: 3148-3152 - [c185]Ryandhimas E. Zezario, Szu-Wei Fu, Xugang Lu, Hsin-Min Wang, Yu Tsao:
Specialized Speech Enhancement Model Selection Based on Learned Non-Intrusive Quality Assessment Metric. INTERSPEECH 2019: 3168-3172 - [c184]Tassadaq Hussain, Yu Tsao, Sabato Marco Siniscalchi, Jia-Ching Wang, Hsin-Min Wang, Wen-Hung Liao:
Bone-Conducted Speech Enhancement Using Hierarchical Extreme Learning Machine. IWSDS 2019: 153-162 - [c183]Sin-Horng Chen, Hsin-Min Wang:
Oriental COCOSDA - country report 2019 language resources developed in Taiwan. O-COCOSDA 2019: 1-6 - [c182]Kuan-Yi Kang, Yi-Wen Liu, Hsin-Min Wang:
Influences of Prosodic Feature Replacement on the Perceived Singing Voice Identity. ROCLING 2019: 296-309 - [c181]Wen-Chin Huang, Yi-Chiao Wu, Kazuhiro Kobayashi, Yu-Huai Peng, Hsin-Te Hwang, Patrick Lumban Tobing, Yu Tsao, Hsin-Min Wang, Tomoki Toda:
Generalization of Spectrum Differential based Direct Waveform Modification for Voice Conversion. SSW 2019: 57-62 - [i23]Chen-Chou Lo, Szu-Wei Fu, Wen-Chin Huang, Xin Wang, Junichi Yamagishi, Yu Tsao, Hsin-Min Wang:
MOSNet: Deep Learning based Objective Assessment for Voice Conversion. CoRR abs/1904.08352 (2019) - [i22]Wen-Chin Huang, Yi-Chiao Wu, Chen-Chou Lo, Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda, Yu Tsao, Hsin-Min Wang:
Investigation of F0 conditioning and Fully Convolutional Networks in Variational Autoencoder based Voice Conversion. CoRR abs/1905.00615 (2019) - [i21]Hsiao-Tzu Hung, Chung-Yang Wang, Yi-Hsuan Yang, Hsin-Min Wang:
Improving Automatic Jazz Melody Generation by Transfer Learning Techniques. CoRR abs/1908.09484 (2019) - [i20]Chang-Le Liu, Szu-Wei Fu, You-Jin Lee, Yu Tsao, Jen-Wei Huang, Hsin-Min Wang:
Multichannel Speech Enhancement by Raw Waveform-mapping using Fully Convolutional Networks. CoRR abs/1909.11909 (2019) - [i19]Natalie Yu-Hsien Wang, Hsiao-Lan Sharon Wang, Taowei Wang, Szu-Wei Fu, Xugang Lu, Yu Tsao, Hsin-Min Wang:
Improving the Intelligibility of Electric and Acoustic Stimulation Speech Using Fully Convolutional Networks Based Speech Enhancement. CoRR abs/1909.11912 (2019) - [i18]Xin Wang, Junichi Yamagishi, Massimiliano Todisco, Héctor Delgado, Andreas Nautsch, Nicholas W. D. Evans, Md. Sahidullah, Ville Vestman, Tomi Kinnunen, Kong Aik Lee, Lauri Juvela, Paavo Alku, Yu-Huai Peng, Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Sébastien Le Maguer, Markus Becker, Fergus Henderson, Rob Clark, Yu Zhang, Quan Wang, Ye Jia, Kai Onuma, Koji Mushika, Takashi Kaneda, Yuan Jiang, Li-Juan Liu, Yi-Chiao Wu, Wen-Chin Huang, Tomoki Toda, Kou Tanaka, Hirokazu Kameoka, Ingmar Steiner, Driss Matrouf, Jean-François Bonastre, Avashna Govender, Srikanth Ronanki, Jing-Xuan Zhang, Zhen-Hua Ling:
The ASVspoof 2019 database. CoRR abs/1911.01601 (2019) - [i17]Syu-Siang Wang, Yu-You Liang, Jeih-weih Hung, Yu Tsao, Hsin-Min Wang, Shih-Hau Fang:
Distributed Microphone Speech Enhancement based on Deep Learning. CoRR abs/1911.08153 (2019) - [i16]Yu-Tao Chang, Yuan-Hong Yang, Yu-Huai Peng, Syu-Siang Wang, Tai-Shih Chi, Yu Tsao, Hsin-Min Wang:
MoEVC: A Mixture-of-experts Voice Conversion System with Sparse Gating Mechanism for Accelerating Online Computation. CoRR abs/1912.11984 (2019)
- 2018
- [j60]Hsin-Te Hwang, Yi-Chiao Wu, Syu-Siang Wang, Chin-Cheng Hsu, Yu Tsao, Hsin-Min Wang, Yih-Ru Wang, Sin-Horng Chen:
Locally Linear Embedding Based Post-Filtering for Speech Enhancement. J. Inf. Sci. Eng. 34(6): 1469-1491 (2018) - [j59]Hsin-Te Hwang, Yi-Chiao Wu, Yu-Huai Peng, Chin-Cheng Hsu, Yu Tsao, Hsin-Min Wang, Yih-Ru Wang, Sin-Horng Chen:
Voice Conversion Based on Locally Linear Embedding. J. Inf. Sci. Eng. 34(6): 1493-1516 (2018) - [j58]Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang:
An Information Distillation Framework for Extractive Summarization. IEEE ACM Trans. Audio Speech Lang. Process. 26(1): 161-170 (2018) - [j57]Jen-Cheng Hou, Syu-Siang Wang, Ying-Hui Lai, Yu Tsao, Hsiu-Wen Chang, Hsin-Min Wang:
Audio-Visual Speech Enhancement Using Multimodal Deep Convolutional Neural Networks. IEEE Trans. Emerg. Top. Comput. Intell. 2(2): 117-128 (2018) - [j56]Jen-Chun Lin, Wen-Li Wei, Tyng-Luh Liu, Yi-Hsuan Yang, Hsin-Min Wang, Hsiao-Rong Tyan, Hong-Yuan Mark Liao:
Coherent Deep-Net Fusion To Classify Shots In Concert Videos. IEEE Trans. Multim. 20(11): 3123-3136 (2018) - [c180]Ryandhimas E. Zezario, Jen-Wei Huang, Xugang Lu, Yu Tsao, Hsin-Te Hwang, Hsin-Min Wang:
Deep Denoising Autoencoder Based Post Filtering for Speech Enhancement. APSIPA 2018: 373-377 - [c179]Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang:
Essence Vector-Based Query Modeling for Spoken Document Retrieval. ICASSP 2018: 6274-6278 - [c178]Wen-Li Wei, Jen-Chun Lin, Tyng-Luh Liu, Yi-Hsuan Yang, Hsin-Min Wang, Hsiao-Rong Tyan, Hong-Yuan Mark Liao:
Seethevoice: Learning from Music to Visual Storytelling of Shots. ICME 2018: 1-6 - [c177]Yu-Huai Peng, Hsin-Te Hwang, Yi-Chiao Wu, Yu Tsao, Hsin-Min Wang:
Exemplar-Based Spectral Detail Compensation for Voice Conversion. INTERSPEECH 2018: 486-490 - [c176]Szu-Wei Fu, Yu Tsao, Hsin-Te Hwang, Hsin-Min Wang:
Quality-Net: An End-to-End Non-intrusive Speech Quality Assessment Model Based on BLSTM. INTERSPEECH 2018: 1873-1877 - [c175]Wen-Chin Huang, Hsin-Te Hwang, Yu-Huai Peng, Yu Tsao, Hsin-Min Wang:
Voice Conversion Based on Cross-Domain Features Using Variational Auto Encoders. ISCSLP 2018: 51-55 - [c174]Yi-Ying Kao, Hsiang-Ping Hsu, Chien-Feng Liao, Yu Tsao, Hao-Chun Yang, Jeng-Lin Li, Chi-Chun Lee, Hung-Shin Lee, Hsin-Min Wang:
Automatic Detection of Speech Under Cold Using Discriminative Autoencoders and Strength Modeling with Multiple Sub-Dictionary Generation. IWAENC 2018: 416-420 - [c173]Wen-Chin Huang, Chen-Chou Lo, Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang:
WaveNet 聲碼器及其於語音轉換之應用 (WaveNet Vocoder and its Applications in Voice Conversion) [In Chinese]. ROCLING 2018: 96-110 - [i15]Chien-Feng Liao, Yu Tsao, Hung-yi Lee, Hsin-Min Wang:
Noise Adaptive Speech Enhancement using Domain Adversarial Training. CoRR abs/1807.07501 (2018) - [i14]Szu-Wei Fu, Yu Tsao, Hsin-Te Hwang, Hsin-Min Wang:
Quality-Net: An End-to-End Non-intrusive Speech Quality Assessment Model based on BLSTM. CoRR abs/1808.05344 (2018) - [i13]Wen-Chin Huang, Hsin-Te Hwang, Yu-Huai Peng, Yu Tsao, Hsin-Min Wang:
Voice Conversion Based on Cross-Domain Features Using Variational Auto Encoders. CoRR abs/1808.09634 (2018) - [i12]Yih-Liang Shen, Chao-Yuan Huang, Syu-Siang Wang, Yu Tsao, Hsin-Min Wang, Tai-Shih Chi:
Reinforcement Learning Based Speech Enhancement for Robust Speech Recognition. CoRR abs/1811.04224 (2018) - [i11]Wen-Chin Huang, Yi-Chiao Wu, Hsin-Te Hwang, Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda, Yu Tsao, Hsin-Min Wang:
Refined WaveNet Vocoder for Variational Autoencoder Based Voice Conversion. CoRR abs/1811.11078 (2018)
- 2017
- [j55]Shih-Hung Liu, Kuan-Yu Chen, Kai-Wun Shih, Berlin Chen, Hsin-Min Wang, Wen-Lian Hsu:
An Empirical Comparison of Contemporary Unsupervised Approaches for Extractive Speech Summarization. Int. J. Comput. Linguistics Chin. Lang. Process. 22(1) (2017) - [j54]Tien-Hong Lo, Ying-Wen Chen, Kuan-Yu Chen, Hsin-Min Wang, Berlin Chen:
On the Use of Neural Network Modeling Techniques for Spoken Document Retrieval. Int. J. Comput. Linguistics Chin. Lang. Process. 22(2) (2017) - [j53]Chia-Lung Wu, Hsiang-Ping Hsu, Yu-Ding Lu, Yu Tsao, Hung-Shin Lee, Hsin-Min Wang:
A Replay Spoofing Detection System Based on Discriminative Autoencoders. Int. J. Comput. Linguistics Chin. Lang. Process. 22(2) (2017) - [j52]Shih-Hung Liu, Kuan-Yu Chen, Yu-Lun Hsieh, Berlin Chen, Hsin-Min Wang, Hsu-Chun Yen, Wen-Lian Hsu:
A Position-Aware Language Modeling Framework for Extractive Broadcast News Speech Summarization. ACM Trans. Asian Low Resour. Lang. Inf. Process. 16(4): 27:1-27:13 (2017) - [c172]Yu-Huai Peng, Chin-Cheng Hsu, Yi-Chiao Wu, Hsin-Te Hwang, Yi-Wen Liu, Yu Tsao, Hsin-Min Wang:
Fast locally linear embedding algorithm for exemplar-based voice conversion. APSIPA 2017: 591-595 - [c171]Ming-Hsiang Su, Chung-Hsien Wu, Kun-Yi Huang, Qian-Bei Hong, Hsin-Min Wang:
Personality trait perception from speech signals using multiresolution analysis and convolutional neural networks. APSIPA 2017: 1532-1536 - [c170]Tien-Hong Lo, Ying-Wen Chen, Kuan-Yu Chen, Hsin-Min Wang, Berlin Chen:
Neural relevance-aware query modeling for spoken document retrieval. ASRU 2017: 466-473 - [c169]Wen-Li Wei, Jen-Chun Lin, Tyng-Luh Liu, Yi-Hsuan Yang, Hsin-Min Wang, Hsiao-Rong Tyan, Hong-Yuan Mark Liao:
Deep-net fusion to classify shots in concert videos. ICASSP 2017: 1383-1387 - [c168]Po-Yuan Shih, Chia-Ping Chen, Hsin-Min Wang:
Speech emotion recognition with skew-robust neural networks. ICASSP 2017: 2751-2755 - [c167]Hung-Shin Lee, Yu-Ding Lu, Chin-Cheng Hsu, Yu Tsao, Hsin-Min Wang, Shyh-Kang Jeng:
Discriminative autoencoders for speaker verification. ICASSP 2017: 5375-5379 - [c166]Yi-Chiao Wu, Hsin-Te Hwang, Syu-Siang Wang, Chin-Cheng Hsu, Ying-Hui Lai, Yu Tsao, Hsin-Min Wang:
A locally linear embedding based postfiltering approach for speech enhancement. ICASSP 2017: 5555-5559 - [c165]Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang:
A locality-preserving essence vector modeling framework for spoken document retrieval. ICASSP 2017: 5665-5669 - [c164]Shih-Hung Liu, Kuan-Yu Chen, Berlin Chen, Hsin-Min Wang, Wen-Lian Hsu:
Leveraging manifold learning for extractive broadcast news summarization. ICASSP 2017: 5805-5809 - [c163]Chia-Lung Wu, Hsiang-Ping Hsu, Syu-Siang Wang, Jeih-Weih Hung, Ying-Hui Lai, Hsin-Min Wang, Yu Tsao:
Wavelet Speech Enhancement Based on Robust Principal Component Analysis. INTERSPEECH 2017: 439-443 - [c162]Yi-Chiao Wu, Hsin-Te Hwang, Syu-Siang Wang, Chin-Cheng Hsu, Yu Tsao, Hsin-Min Wang:
A Post-Filtering Approach Based on Locally Linear Embedding Difference Compensation for Speech Enhancement. INTERSPEECH 2017: 1953-1957 - [c161]Ying-Wen Chen, Kuan-Yu Chen, Hsin-Min Wang, Berlin Chen:
Exploring the Use of Significant Words Language Modeling for Spoken Document Retrieval. INTERSPEECH 2017: 2889-2893 - [c160]Chin-Cheng Hsu, Hsin-Te Hwang, Yi-Chiao Wu, Yu Tsao, Hsin-Min Wang:
Voice Conversion from Unaligned Corpora Using Variational Autoencoding Wasserstein Generative Adversarial Networks. INTERSPEECH 2017: 3364-3368 - [c159]Ming-Han Yang, Hung-Shin Lee, Yu-Ding Lu, Kuan-Yu Chen, Yu Tsao, Berlin Chen, Hsin-Min Wang:
Discriminative Autoencoders for Acoustic Modeling. INTERSPEECH 2017: 3557-3561 - [c158]Jen-Chun Lin, Wen-Li Wei, James Yang, Hsin-Min Wang, Hong-Yuan Mark Liao:
Automatic Music Video Generation Based on Simultaneous Soundtrack Recommendation and Video Editing. ACM Multimedia 2017: 519-527 - [c157]Yu-Ding Lu, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang:
基於鑑別式自編碼解碼器之錄音回放攻擊偵測系統 (A Replay Spoofing Detection System Based on Discriminative Autoencoders) [In Chinese]. ROCLING 2017: 114-115 - [c156]Cheng-Jo Ray Chang, Hung-Shin Lee, Hsin-Min Wang, Jyh-Shing Roger Jang:
基於i-vector與PLDA並使用GMM-HMM強制對位之自動語者分段標記系統 (Speaker Diarization based on I-vector PLDA Scoring and using GMM-HMM Forced Alignment) [In Chinese]. ROCLING 2017: 119-135 - [c155]Tien-Hong Lo, Ying-Wen Chen, Berlin Chen, Kuan-Yu Chen, Hsin-Min Wang:
使用查詢意向探索與類神經網路於語音文件檢索之研究 (Exploring Query Intent and Neural Network modeling Techniques for Spoken Document Retrieval) [In Chinese]. ROCLING 2017: 149-151 - [p1]Ju-Chiang Wang, Yi-Hsuan Yang, Hsin-Min Wang:
Affective Music Information Retrieval. Emotions and Personality in Personalized Services 2017: 227-261 - [i10]Jen-Cheng Hou, Syu-Siang Wang, Ying-Hui Lai, Jen-Chun Lin, Yu Tsao, Hsiu-Wen Chang, Hsin-Min Wang:
Audio-Visual Speech Enhancement based on Multimodal Deep Convolutional Neural Network. CoRR abs/1703.10893 (2017) - [i9]Chin-Cheng Hsu, Hsin-Te Hwang, Yi-Chiao Wu, Yu Tsao, Hsin-Min Wang:
Voice Conversion from Unaligned Corpora using Variational Autoencoding Wasserstein Generative Adversarial Networks. CoRR abs/1704.00849 (2017)
- 2016
- [j51]Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang, Hsin-Hsi Chen:
Exploring the use of unsupervised query modeling techniques for speech recognition and summarization. Speech Commun. 80: 49-59 (2016) - [j50]Yu-Ren Chien, Hsin-Min Wang, Shyh-Kang Jeng:
Alignment of Lyrics With Accompanied Singing Audio Based on Acoustic-Phonetic Vowel Likelihood Modeling. IEEE ACM Trans. Audio Speech Lang. Process. 24(11): 1998-2008 (2016) - [c154]Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang:
A novel paragraph embedding method for spoken document summarization. APSIPA 2016: 1-6 - [c153]Jen-Cheng Hou, Syu-Siang Wang, Ying-Hui Lai, Jen-Chun Lin, Yu Tsao, Hsiu-Wen Chang, Hsin-Min Wang:
Audio-visual speech enhancement using deep neural networks. APSIPA 2016: 1-6 - [c152]Chin-Cheng Hsu, Hsin-Te Hwang, Yi-Chiao Wu, Yu Tsao, Hsin-Min Wang:
Voice conversion from non-parallel corpora using variational auto-encoder. APSIPA 2016: 1-6 - [c151]Shih-Hung Liu, Kuan-Yu Chen, Yu-Lun Hsieh, Berlin Chen, Hsin-Min Wang, Hsu-Chun Yen, Wen-Lian Hsu:
Exploiting graph regularized nonnegative matrix factorization for extractive speech summarization. APSIPA 2016: 1-7 - [c150]Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang:
Learning to Distill: The Essence Vector Modeling Framework. COLING 2016: 358-368 - [c149]Jen-Chun Lin, Wen-Li Wei, Hsin-Min Wang:
DEMV-matchmaker: Emotional temporal course representation and deep similarity matching for automatic music video generation. ICASSP 2016: 2772-2776 - [c148]Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang:
Improved spoken document summarization with coverage modeling techniques. ICASSP 2016: 6010-6014 - [c147]Shih-Hung Liu, Kuan-Yu Chen, Yu-Lun Hsieh, Berlin Chen, Hsin-Min Wang, Hsu-Chun Yen, Wen-Lian Hsu:
Exploring Word Mover's Distance and Semantic-Aware Embedding Techniques for Extractive Broadcast News Summarization. INTERSPEECH 2016: 670-674 - [c146]Yi-Chiao Wu, Hsin-Te Hwang, Chin-Cheng Hsu, Yu Tsao, Hsin-Min Wang:
Locally Linear Embedding for Exemplar-Based Spectral Conversion. INTERSPEECH 2016: 1652-1656 - [c145]Hung-Shin Lee, Yu Tsao, Chi-Chun Lee, Hsin-Min Wang, Wei-Cheng Lin, Wei-Chen Chen, Shan-Wen Hsiao, Shyh-Kang Jeng:
Minimization of Regression and Ranking Losses with Shallow Neural Networks on Automatic Sincerity Evaluation. INTERSPEECH 2016: 2031-2035 - [c144]Chin-Cheng Hsu, Hsin-Te Hwang, Yi-Chiao Wu, Yu Tsao, Hsin-Min Wang:
Dictionary update for NMF-based voice conversion using an encoder-decoder network. ISCSLP 2016: 1-5 - [c143]Jen-Chun Lin, Wen-Li Wei, Hsin-Min Wang:
Automatic Music Video Generation Based on Emotion-Oriented Pseudo Song Prediction and Matching. ACM Multimedia 2016: 372-376 - [c142]Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang, Hsin-Hsi Chen:
Novel Word Embedding and Translation-based Language Modeling for Extractive Speech Summarization. ACM Multimedia 2016: 377-381 - [c141]Yu-Lun Hsieh, Shih-Hung Liu, Kuan-Yu Chen, Hsin-Min Wang, Wen-Lian Hsu, Berlin Chen:
運用序列到序列生成架構於重寫式自動摘要 (Exploiting Sequence-to-Sequence Generation Framework for Automatic Abstractive Summarization) [In Chinese]. ROCLING 2016 - [i8]Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang:
Improved Spoken Document Summarization with Coverage Modeling Techniques. CoRR abs/1601.05194 (2016) - [i7]Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang, Hsin-Hsi Chen:
Novel Word Embedding and Translation-based Language Modeling for Extractive Speech Summarization. CoRR abs/1607.06532 (2016) - [i6]Chin-Cheng Hsu, Hsin-Te Hwang, Yi-Chiao Wu, Yu Tsao, Hsin-Min Wang:
Dictionary Update for NMF-based Voice Conversion Using an Encoder-Decoder Network. CoRR abs/1610.03988 (2016) - [i5]Chin-Cheng Hsu, Hsin-Te Hwang, Yi-Chiao Wu, Yu Tsao, Hsin-Min Wang:
Voice Conversion from Non-parallel Corpora Using Variational Auto-encoder. CoRR abs/1610.04019 (2016) - [i4]Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang:
Learning to Distill: The Essence Vector Modeling Framework. CoRR abs/1611.07206 (2016)
- 2015
- [j49]Ting-Hao Chang, Hsiao-Tsung Hung, Kuan-Yu Chen, Hsin-Min Wang, Berlin Chen:
Investigating Modulation Spectrum Factorization Techniques for Robust Speech Recognition. Int. J. Comput. Linguistics Chin. Lang. Process. 20(2) (2015) - [j48]Kai-Wun Shih, Kuan-Yu Chen, Shih-Hung Liu, Hsin-Min Wang, Berlin Chen:
Extractive Spoken Document Summarization with Representation Learning Techniques. Int. J. Comput. Linguistics Chin. Lang. Process. 20(2) (2015) - [j47]Ju-Chiang Wang, Yi-Hsuan Yang, Hsin-Min Wang, Shyh-Kang Jeng:
Modeling the Affective Content of Music with a Gaussian Mixture Model. IEEE Trans. Affect. Comput. 6(1): 56-68 (2015) - [j46]Kuan-Yu Chen, Hsin-Min Wang, Hsin-Hsi Chen:
A Probabilistic Framework for Chinese Spelling Check. ACM Trans. Asian Low Resour. Lang. Inf. Process. 14(4): 15:1-15:17 (2015) - [j45]Shih-Hung Liu, Kuan-Yu Chen, Berlin Chen, Hsin-Min Wang, Hsu-Chun Yen, Wen-Lian Hsu:
Combining Relevance Language Modeling and Clarity Measure for Extractive Speech Summarization. IEEE ACM Trans. Audio Speech Lang. Process. 23(6): 957-969 (2015) - [j44]Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang, Ea-Ee Jan, Wen-Lian Hsu, Hsin-Hsi Chen:
Extractive Broadcast News Summarization Leveraging Recurrent Neural Network Language Modeling Techniques. IEEE ACM Trans. Audio Speech Lang. Process. 23(8): 1322-1334 (2015) - [j43]Yu-Ren Chien, Hsin-Min Wang, Shyh-Kang Jeng:
An Acoustic-Phonetic Model of F0 Likelihood for Vocal Melody Extraction. IEEE ACM Trans. Audio Speech Lang. Process. 23(9): 1457-1468 (2015) - [c140]Syu-Siang Wang, Hsin-Te Hwang, Ying-Hui Lai, Yu Tsao, Xugang Lu, Hsin-Min Wang, Borching Su:
Improving denoising auto-encoder based speech enhancement with the speech parameter generation algorithm. APSIPA 2015: 365-369 - [c139]Shih-Hung Liu, Hung-Shin Lee, Hsiao-Tsung Hung, Kuan-Yu Chen, Berlin Chen, Hsin-Min Wang, Hsu-Chun Yen, Wen-Lian Hsu:
Incorporating proximity information in relevance language modeling for extractive speech summarization. APSIPA 2015: 401-407 - [c138]Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Yih-Ru Wang, Sin-Horng Chen:
A probabilistic interpretation for artificial neural network-based voice conversion. APSIPA 2015: 552-558 - [c137]Kuan-Yu Chen, Kai-Wun Shih, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang:
Incorporating paragraph embeddings and density peaks clustering for spoken document summarization. ASRU 2015: 207-214 - [c136]Ju-Chiang Wang, Hsin-Min Wang, Gert R. G. Lanckriet:
A histogram density modeling approach to music emotion recognition. ICASSP 2015: 698-702 - [c135]Kuan-Yu Chen, Hsin-Min Wang, Berlin Chen, Hsin-Hsi Chen:
I-vector based language modeling for query representation. ICASSP 2015: 5211-5215 - [c134]Kuan-Yu Chen, Shih-Hung Liu, Hsin-Min Wang, Berlin Chen, Hsin-Hsi Chen:
Leveraging word embeddings for spoken document summarization. INTERSPEECH 2015: 1383-1387 - [c133]Shih-Hung Liu, Kuan-Yu Chen, Berlin Chen, Hsin-Min Wang, Hsu-Chun Yen, Wen-Lian Hsu:
Positional language modeling for extractive broadcast news speech summarization. INTERSPEECH 2015: 2729-2733 - [c132]Jen-Chun Lin, Wen-Li Wei, Hsin-Min Wang:
EMV-matchmaker: Emotional Temporal Course Modeling and Matching for Automatic Music Video Generation. ACM Multimedia 2015: 899-902 - [c131]Ting-Hao Chang, Hsiao-Tsung Hung, Kuan-Yu Chen, Hsin-Min Wang, Berlin Chen:
調變頻譜分解之改良於強健性語音辨識 (Several Refinements of Modulation Spectrum Factorization for Robust Speech Recognition) [In Chinese]. ROCLING 2015 - [c130]Kai-Wun Shih, Berlin Chen, Kuan-Yu Chen, Shih-Hung Liu, Hsin-Min Wang:
表示法學習技術於節錄式語音文件摘要之研究 (A Study on Representation Learning Techniques for Extractive Spoken Document Summarization) [In Chinese]. ROCLING 2015 - [e2]Sin-Horng Chen, Hsin-Min Wang, Jen-Tzung Chien, Hung-Yu Kao, Wen-Whei Chang, Yih-Ru Wang, Shih-Hung Wu:
Proceedings of the 27th Conference on Computational Linguistics and Speech Processing, ROCLING 2015, National Chiao Tung University, Hsinchu, Taiwan, October 1-2, 2015. Association for Computational Linguistics and Chinese Language Processing (ACLCLP), Taiwan 2015, ISBN 978-957-30792-8-6 - [i3]Ju-Chiang Wang, Hung-Yan Gu, Hsin-Min Wang:
Mandarin Singing Voice Synthesis Based on Harmonic Plus Noise Model and Singing Expression Analysis. CoRR abs/1502.04300 (2015) - [i2]Ju-Chiang Wang, Yi-Hsuan Yang, Hsin-Min Wang:
Affective Music Information Retrieval. CoRR abs/1502.05131 (2015) - [i1]Kuan-Yu Chen, Shih-Hung Liu, Hsin-Min Wang, Berlin Chen, Hsin-Hsi Chen:
Leveraging Word Embeddings for Spoken Document Summarization. CoRR abs/1506.04365 (2015)
- 2014
- [j42]Berlin Chen, Yi-Wen Chen, Kuan-Yu Chen, Hsin-Min Wang, Kuen-Tyng Yu:
Enhancing Query Formulation for Spoken Document Retrieval. J. Inf. Sci. Eng. 30(3): 553-569 (2014) - [j41]Hung-Yi Lo, Shou-De Lin, Hsin-Min Wang:
Generalized k-Labelsets Ensemble for Multi-Label and Cost-Sensitive Classification. IEEE Trans. Knowl. Data Eng. 26(7): 1679-1691 (2014) - [c129]Jen-Chun Lin, Wen-Li Wei, Chung-Hsien Wu, Hsin-Min Wang:
Emotion recognition of conversational affective speech using temporal course modeling-based error weighted cross-correlation model. APSIPA 2014: 1-7 - [c128]Shih-Hung Liu, Kuan-Yu Chen, Berlin Chen, Ea-Ee Jan, Hsin-Min Wang, Hsu-Chun Yen, Wen-Lian Hsu:
A margin-based discriminative modeling approach for extractive speech summarization. APSIPA 2014: 1-6 - [c127]Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Ea-Ee Jan, Hsin-Min Wang, Wen-Lian Hsu, Hsin-Hsi Chen:
Leveraging Effective Query Modeling Techniques for Speech Recognition and Summarization. EMNLP 2014: 1474-1480 - [c126]Hung-Shin Lee, Yu Tsao, Yun-Fan Chang, Hsin-Min Wang, Shyh-Kang Jeng:
Speaker verification using kernel-based binary classifiers with binary operation derived features. ICASSP 2014: 1660-1664 - [c125]Chin-Chia Michael Yeh, Ju-Chiang Wang, Yi-Hsuan Yang, Hsin-Min Wang:
Improving music auto-tagging by intra-song instance bagging. ICASSP 2014: 2139-2143 - [c124]Shih-Hung Liu, Kuan-Yu Chen, Yu-Lun Hsieh, Berlin Chen, Hsin-Min Wang, Hsu-Chun Yen, Wen-Lian Hsu:
Effective pseudo-relevance feedback for language modeling in extractive speech summarization. ICASSP 2014: 3226-3230 - [c123]Kuan-Yu Chen, Hung-Shin Lee, Hsin-Min Wang, Berlin Chen, Hsin-Hsi Chen:
I-vector based language modeling for spoken document retrieval. ICASSP 2014: 7083-7088 - [c122]Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang, Wen-Lian Hsu, Hsin-Hsi Chen:
A recurrent neural network language modeling framework for extractive speech summarization. ICME 2014: 1-6 - [c121]Shuo-Yang Wang, Ju-Chiang Wang, Yi-Hsuan Yang, Hsin-Min Wang:
Towards time-varying music auto-tagging based on CAL500 expansion. ICME 2014: 1-6 - [c120]How Jing, Ting-Yao Hu, Hung-Shin Lee, Wei-Chen Chen, Chi-Chun Lee, Yu Tsao, Hsin-Min Wang:
Ensemble of machine learning algorithms for cognitive and physical speaker load detection. INTERSPEECH 2014: 447-451 - [c119]Hung-Shin Lee, Yu Tsao, Hsin-Min Wang, Shyh-Kang Jeng:
Clustering-based i-vector formulation for speaker recognition. INTERSPEECH 2014: 1101-1105 - [c118]Shih-Hung Liu, Kuan-Yu Chen, Yu-Lun Hsieh, Berlin Chen, Hsin-Min Wang, Hsu-Chun Yen, Wen-Lian Hsu:
Enhanced language modeling for extractive speech summarization with sentence relatedness information. INTERSPEECH 2014: 1865-1869 - [c117]Ju-Chiang Wang, Ming-Chi Yen, Yi-Hsuan Yang, Hsin-Min Wang:
Automatic Set List Identification and Song Segmentation for Full-Length Concert Videos. ISMIR 2014: 239-244 - [c116]Shih-Hung Liu, Kuan-Yu Chen, Yu-Lun Hsieh, Berlin Chen, Hsin-Min Wang, Wen-Lian Hsu:
探究新穎語句模型化技術於節錄式語音摘要 (Investigating Novel Sentence Modeling Techniques for Extractive Speech Summarization) [In Chinese]. ROCLING 2014 - [e1]Hsin-Min Wang, Yi-Hsuan Yang, Jin Ha Lee:
Proceedings of the 15th International Society for Music Information Retrieval Conference, ISMIR 2014, Taipei, Taiwan, October 27-31, 2014. 2014
- 2013
- [c115]Kuan-Yu Chen, Hung-Shin Lee, Chung-Han Lee, Hsin-Min Wang, Hsin-Hsi Chen:
A Study of Language Modeling for Chinese Spelling Check. SIGHAN@IJCNLP 2013: 79-83 - [c114]Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Yih-Ru Wang, Sin-Horng Chen:
Incorporating global variance in the training phase of GMM-based voice conversion. APSIPA 2013: 1-6 - [c113]Hung-Shin Lee, Yu-Chin Shih, Hsin-Min Wang, Shyh-Kang Jeng:
Subspace-based phonotactic language recognition using multivariate dynamic linear models. ICASSP 2013: 6870-6874 - [c112]Kuan-Yu Chen, Hsin-Min Wang, Berlin Chen, Hsin-Hsi Chen:
Weighted matrix factorization for spoken document retrieval. ICASSP 2013: 8530-8534 - [c111]Yi-Wen Chen, Kuan-Yu Chen, Hsin-Min Wang, Berlin Chen:
Effective pseudo-relevance feedback for spoken document retrieval. ICASSP 2013: 8535-8539 - [c110]How Jing, Yu Tsao, Kuan-Yu Chen, Hsin-Min Wang:
Semantic Naïve Bayes Classifier for Document Classification. IJCNLP 2013: 1117-1123 - [c109]Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Yih-Ru Wang, Sin-Horng Chen:
Alleviating the over-smoothing problem in GMM-based voice conversion with discriminative training. INTERSPEECH 2013: 3062-3066 - [c108]Zhonghua Li, Ju-Chiang Wang, Jingli Cai, Zhiyan Duan, Hsin-Min Wang, Ye Wang:
Non-reference audio quality assessment for online live music recordings. ACM Multimedia 2013: 63-72 - [c107]Meng-Sung Wu, Chia-Ping Chen, Hsin-Min Wang:
Query-Document Relevance Topic Models. PAKDD (2) 2013: 209-220 - [c106]Shih-Hung Liu, Kuan-Yu Chen, Hsin-Min Wang, Wen-Lian Hsu, Berlin Chen:
改良語句模型技術於節錄式語音摘要之研究 (Improved Sentence Modeling Techniques for Extractive Speech Summarization) [In Chinese]. ROCLING 2013
- 2012
- [j40]Kuan-Yu Chen, Hsin-Min Wang, Berlin Chen:
Spoken Document Retrieval Leveraging Unsupervised and Supervised Topic Modeling Techniques. IEICE Trans. Inf. Syst. 95-D(5): 1195-1205 (2012) - [c105]Ju-Chiang Wang, Yi-Hsuan Yang, Hsin-Min Wang, Shyh-Kang Jeng:
Personalized music emotion recognition via model adaptation. APSIPA 2012: 1-7 - [c104]Ju-Chiang Wang, Hsin-Min Wang, Shyh-Kang Jeng:
Playing with tagging: A real-time tagging music player. ICASSP 2012: 77-80 - [c103]Hung-Yi Lo, Shou-De Lin, Hsin-Min Wang:
Generalized k-labelset ensemble for multi-label classification. ICASSP 2012: 2061-2064 - [c102]Meng-Sung Wu, Hsin-Min Wang:
Term relevance dependency model for text classification. ICPR 2012: 1064-1067 - [c101]Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Yih-Ru Wang, Sin-Horng Chen:
A Study of Mutual Information for GMM-Based Spectral Conversion. INTERSPEECH 2012: 78-81 - [c100]Kuan-Yu Chen, Hao-Chin Chang, Berlin Chen, Hsin-Min Wang:
Word Relevance Modeling for Speech Recognition. INTERSPEECH 2012: 999-1002 - [c99]Yu-Chin Shih, Hung-Shin Lee, Hsin-Min Wang, Shyh-Kang Jeng:
Subspace-Based Feature Representation and Learning for Language Recognition. INTERSPEECH 2012: 2061-2064 - [c98]Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Yih-Ru Wang, Sin-Horng Chen:
Exploring mutual information for GMM-based spectral conversion. ISCSLP 2012: 50-54 - [c97]Ju-Chiang Wang, Yi-Hsuan Yang, Kaichun Chang, Hsin-Min Wang, Shyh-Kang Jeng:
Exploring the relationship between categorical and dimensional emotion semantics of music. MIRUM 2012: 63-68 - [c96]Ju-Chiang Wang, Yi-Hsuan Yang, Hsin-Min Wang, Shyh-Kang Jeng:
The acoustic emotion gaussians model for emotion-based music annotation and retrieval. ACM Multimedia 2012: 89-98 - [c95]Ju-Chiang Wang, Yi-Hsuan Yang, I-Hong Jhuo, Yen-Yu Lin, Hsin-Min Wang:
The acousticvisual emotion gaussians model for automatic generation of music video. ACM Multimedia 2012: 1379-1380 - [c94]Meng-Sung Wu, Hsin-Min Wang:
A Term Association Translation Model for Naive Bayes Text Classification. PAKDD (1) 2012: 243-253
- 2011
- [j39]Hung-Yi Lo, Ju-Chiang Wang, Hsin-Min Wang, Shou-De Lin:
Cost-Sensitive Multi-Label Learning for Audio Tag Annotation and Retrieval. IEEE Trans. Multim. 13(3): 518-529 (2011) - [c93]Hung-Yi Lo, Ju-Chiang Wang, Hsin-Min Wang, Shou-De Lin:
Cost-sensitive stacking for audio tag annotation and retrieval. ICASSP 2011: 2308-2311 - [c92]Shih-Wei Sun, Yu-Chiang Frank Wang, Yao-Ling Hung, Chia-Ling Chang, Kuan-Chieh Chen, Shih-Sian Cheng, Hsin-Min Wang, Hong-Yuan Mark Liao:
Automatic annotation of Web videos. ICME 2011: 1-6 - [c91]Ju-Chiang Wang, Meng-Sung Wu, Hsin-Min Wang, Shyh-Kang Jeng:
Query by multi-tags with multi-level preferences for content-based music retrieval. ICME 2011: 1-6 - [c90]Yu-Ren Chien, Hsin-Min Wang, Shyh-Kang Jeng:
An Acoustic-Phonetic Approach to Vocal Melody Extraction. ISMIR 2011: 25-30 - [c89]Ju-Chiang Wang, Hung-Shin Lee, Hsin-Min Wang, Shyh-Kang Jeng:
Learning the Similarity of Audio Music in Bag-of-frames Representation from Tagged Music Data. ISMIR 2011: 85-90 - [c88]Ju-Chiang Wang, Yu-Chin Shih, Meng-Sung Wu, Hsin-Min Wang, Shyh-Kang Jeng:
Colorizing tags in tag cloud: a novel query-by-tag music search system. ACM Multimedia 2011: 293-302 - [c87]Hung-Yi Lo, Shou-De Lin, Hsin-Min Wang:
Audio Tag Annotation and Retrieval Using Tag Count Information. MMM (1) 2011: 339-349
- 2010
- [j38]Shih-Sian Cheng, Hsin-Min Wang, Hsin-Chia Fu:
BIC-Based Speaker Segmentation Using Divide-and-Conquer Strategies With Application to Speaker Diarization. IEEE Trans. Speech Audio Process. 18(1): 141-157 (2010) - [j37]Chih-Yi Chiu, Hsin-Min Wang:
Time-Series Linear Search for Video Copies Based on Compact Signature Manipulation and Containment Relation Modeling. IEEE Trans. Circuits Syst. Video Technol. 20(11): 1603-1613 (2010) - [j36]Chih-Yi Chiu, Hsin-Min Wang, Chu-Song Chen:
Fast min-hashing indexing and robust spatio-temporal matching for detecting video copies. ACM Trans. Multim. Comput. Commun. Appl. 6(2): 10:1-10:23 (2010) - [c86]Chih-Yi Chiu, Dimitrios Bountouridis, Ju-Chiang Wang, Hsin-Min Wang:
Background music identification through content filtering and min-hash matching. ICASSP 2010: 2414-2417 - [c85]Chih-Yi Chiu, Po-Chih Lin, Wei-Ming Chang, Hsin-Min Wang, Shi-Nine Yang:
Detecting pitching frames in baseball game video using Markov random walk. ICIP 2010: 1493-1496 - [c84]Hung-Yi Lo, Ju-Chiang Wang, Hsin-Min Wang:
Homogeneous segmentation and classifier ensemble for audio tag annotation and retrieval. ICME 2010: 304-309 - [c83]Hung-Shin Lee, Hsin-Min Wang, Berlin Chen:
A Discriminative and Heteroscedastic Linear Feature Transformation for Multiclass Classification. ICPR 2010: 690-693 - [c82]I-Fan Chen, Shih-Sian Cheng, Hsin-Min Wang:
Phonetic subspace mixture model for speaker diarization. INTERSPEECH 2010: 2298-2301 - [c81]Shih-Sian Cheng, I-Fan Chen, Hsin-Min Wang:
Bayesian speaker recognition using Gaussian mixture model and laplace approximation. INTERSPEECH 2010: 2730-2733 - [c80]Yi-Hsiang Chao, Wei-Ho Tsai, Hsin-Min Wang:
Speaker verification using support vector machine with LLR-based sequence kernels. ISCSLP 2010: 182-185 - [c79]Hung-Yi Lo, Hsin-Min Wang:
Phone boundary refinement using ranking methods. ISCSLP 2010: 488-492 - [c78]Meng-Sung Wu, Hung-Shin Lee, Hsin-Min Wang:
Exploiting semantic associative information in topic modeling. SLT 2010: 384-388
2000 – 2009
- 2009
- [j35]Wei-Ho Tsai, Hsin-Min Wang:
Evolutionary minimization of the Rand index for speaker clustering. Comput. Speech Lang. 23(2): 165-175 (2009) - [j34]Yi-Hsiang Chao, Wei-Ho Tsai, Hsin-Min Wang:
Improving GMM-UBM speaker verification using discriminative feedback adaptation. Comput. Speech Lang. 23(3): 376-388 (2009) - [j33]Hsin-Min Wang, Hidenori Taga:
Raman-Based 10.66 Gb/s Bidirectional TDM over Long-Reach WDM Hybrid PON. IEICE Trans. Commun. 92-B(12): 3911-3914 (2009) - [j32]Yi-Hsiang Chao, Wei-Ho Tsai, Hsin-Min Wang, Ruei-Chuan Chang:
Improving the characterization of the alternative hypothesis via minimum verification error training with applications to speaker verification. Pattern Recognit. 42(7): 1351-1360 (2009) - [j31]Shih-Hsiang Lin, Berlin Chen, Hsin-Min Wang:
A Comparative Study of Probabilistic Ranking Models for Chinese Spoken Document Summarization. ACM Trans. Asian Lang. Inf. Process. 8(1): 3:1-3:23 (2009) - [j30]Yi-Ting Chen, Berlin Chen, Hsin-Min Wang:
A Probabilistic Generative Framework for Extractive Broadcast News Speech Summarization. IEEE Trans. Speech Audio Process. 17(1): 95-106 (2009) - [j29]Shih-Sian Cheng, Hsin-Chia Fu, Hsin-Min Wang:
Model-Based Clustering by Probabilistic Self-Organizing Maps. IEEE Trans. Neural Networks 20(5): 805-826 (2009) - [c77]Jen-Wei Kuo, Pu-Jen Cheng, Hsin-Min Wang:
Learning to rank from Bayesian decision inference. CIKM 2009: 827-836 - [c76]Shih-Sian Cheng, Chun-Han Tseng, Chia-Ping Chen, Hsin-Min Wang:
Speaker diarization using divide-and-conquer. INTERSPEECH 2009: 1055-1058 - [c75]I-Fan Chen, Hsin-Min Wang:
Articulatory feature asynchrony analysis and compensation in detection-based ASR. INTERSPEECH 2009: 3059-3062 - [c74]Yow-Bang Wang, Hsin-Min Wang, Lin-Shan Lee:
Virtual Chinese tutor (VCT) - a Chinese language pronunciation learning software. SLaTE 2009
- 2008
- [j28]Wei-Ho Tsai, Hung-Ming Yu, Hsin-Min Wang:
Using the Similarity of Main Melodies to Identify Cover Versions of Popular Songs for Music Document Retrieval. J. Inf. Sci. Eng. 24(6): 1669-1687 (2008) - [j27]Yi-Hsiang Chao, Wei-Ho Tsai, Hsin-Min Wang, Ruei-Chuan Chang:
Using Kernel Discriminant Analysis to Improve the Characterization of the Alternative Hypothesis for Speaker Verification. IEEE Trans. Speech Audio Process. 16(8): 1675-1684 (2008) - [j26]Hung-Ming Yu, Wei-Ho Tsai, Hsin-Min Wang:
A Query-by-Singing System for Retrieving Karaoke Music. IEEE Trans. Multim. 10(8): 1626-1637 (2008) - [c73]Shih-Sian Cheng, Hsin-Min Wang, Hsin-Chia Fu:
BIC-based audio segmentation by divide-and-conquer. ICASSP 2008: 4841-4844 - [c72]Shih-Hsiang Lin, Yi-Ting Chen, Hsin-Min Wang, Berlin Chen:
A comparative study of probabilistic ranking models for spoken document summarization. ICASSP 2008: 5025-5028 - [c71]I-Fan Chen, Hsin-Min Wang:
An Investigation of Phonological Feature Systems Used in Detection-Based ASR. ISCSLP 2008: 105-108 - [c70]Yi-Hsiang Chao, Wei-Ho Tsai, Hsin-Min Wang:
Discriminative Feedback Adaptation for GMM-UBM Speaker Verification. ISCSLP 2008: 169-172
- 2007
- [j25]Yi-Hsiang Chao, Hsin-Min Wang, Ruei-Chuan Chang:
A Novel Characterization of the Alternative Hypothesis Using Kernel Discriminant Analysis for LLR-Based Speaker Verification. Int. J. Comput. Linguistics Chin. Lang. Process. 12(3) (2007) - [j24]Hwai-Tsu Hu, Hsin-Min Wang:
Integrating coding techniques into LP-based Mandarin text-to-speech synthesis. Int. J. Speech Technol. 10(1): 31-44 (2007) - [j23]Wei-Ho Tsai, Shih-Sian Cheng, Hsin-Min Wang:
Automatic Speaker Clustering Using a Voice Characteristic Reference Space and Maximum Purity Estimation. IEEE Trans. Speech Audio Process. 15(4): 1461-1474 (2007) - [c69]Yi-Ting Chen, Shih-Hsiang Lin, Hsin-Min Wang, Berlin Chen:
Spoken document summarization using relevant information. ASRU 2007: 189-194 - [c68]Yi-Hsiang Chao, Wei-Ho Tsai, Hsin-Min Wang, Ruei-Chuan Chang:
Improved Methods for Characterizing the Alternative Hypothesis using Minimum Verification Error Training for LLR-Based Speaker Verification. ICASSP (4) 2007: 65-68 - [c67]Wei-Ho Tsai, Hsin-Min Wang:
Speaker Clustering Based on Minimum Rand Index. ICASSP (4) 2007: 485-488 - [c66]Hung-Yi Lo, Hsin-Min Wang:
Phonetic Boundary Refinement using Support Vector Machine. ICASSP (4) 2007: 933-936 - [c65]Ping-Han Lee, Lu-Jong Chu, Yi-Ping Hung, Sheng-Wen Shih, Chu-Song Chen, Hsin-Min Wang:
Cascading Multimodal Verification using Face, Voice and Iris Information. ICME 2007: 847-850 - [c64]Yi-Hsiang Chao, Wei-Ho Tsai, Shih-Sian Cheng, Hsin-Min Wang, Ruei-Chuan Chang:
Evolutionary minimum verification error learning of the alternative hypothesis model for LLR-based speaker verification. INTERSPEECH 2007: 2001-2004 - [c63]Jen-Wei Kuo, Hung-Yi Lo, Hsin-Min Wang:
Improved HMM/SVM methods for automatic phoneme segmentation. INTERSPEECH 2007: 2057-2060 - [c62]Yi-Ting Chen, Hsuan-Sheng Chiu, Hsin-Min Wang, Berlin Chen:
A unified probabilistic generative framework for extractive spoken document summarization. INTERSPEECH 2007: 2805-2808
- 2006
- [j22]Chuang-Hua Chueh, Hsin-Min Wang, Jen-Tzung Chien:
A Maximum Entropy Approach for Semantic Language Modeling. Int. J. Comput. Linguistics Chin. Lang. Process. 11(1) (2006) - [j21]Jen-Wei Kuo, Shih-Hung Liu, Hsin-Min Wang, Berlin Chen:
An Empirical Study of Word Error Minimization Approaches for Mandarin Large Vocabulary Continuous Speech Recognition. Int. J. Comput. Linguistics Chin. Lang. Process. 11(3) (2006) - [j20]Wei-Ho Tsai, Hsin-Min Wang:
Automatic singer recognition of popular music recordings via estimation and modeling of solo vocal signals. IEEE Trans. Speech Audio Process. 14(1): 330-341 (2006) - [c61]Hung-Ming Yu, Wei-Ho Tsai, Hsin-Min Wang:
A Music Retrieval System Based on Query-by-Singing for Karaoke Jukebox. AIRS 2006: 445-459 - [c60]Wei-Ho Tsai, Hsin-Min Wang:
On Maximizing the Within-Cluster Homogeneity of Speaker Voice Characteristics For Speech Utterance Clustering. ICASSP (1) 2006: 905-908 - [c59]Yi-Hsiang Chao, Wei-Ho Tsai, Hsin-Min Wang, Ruei-Chuan Chang:
A Kernel-based Discrimination Framework for Solving Hypothesis Testing Problems with Application to Speaker Verification. ICPR (4) 2006: 229-232 - [c58]Shih-Sian Cheng, Yi-Hsiang Chao, Hsin-Min Wang, Hsin-Chia Fu:
A Prototypes-Embedded Genetic K-means Algorithm. ICPR (2) 2006: 724-727 - [c57]Yi-Hsiang Chao, Wei-Ho Tsai, Hsin-Min Wang, Ruei-Chuan Chang:
Improving the characterization of the alternative hypothesis via kernel discriminant analysis for likelihood ratio-based speaker verification. INTERSPEECH 2006 - [c56]Jen-Wei Kuo, Hsin-Min Wang:
Minimum boundary error training for automatic phonetic segmentation. INTERSPEECH 2006 - [c55]Shih-Sian Cheng, Yeong-Yuh Xu, Hsin-Min Wang, Hsin-Chia Fu:
Automatic Construction of Regression Class Tree for MLLR Via Model-Based Hierarchical Clustering. ISCSLP (Selected Papers) 2006: 390-398 - [c54]Jen-Wei Kuo, Hsin-Min Wang:
A Minimum Boundary Error Framework for Automatic Phonetic Segmentation. ISCSLP (Selected Papers) 2006: 399-409 - [c53]Tzan-Hwei Chen, Berlin Chen, Hsin-Min Wang:
On Using Entropy Information to Improve Posterior Probability-Based Confidence Measures. ISCSLP (Selected Papers) 2006: 454-463 - [c52]Yi-Hsiang Chao, Hsin-Min Wang, Ruei-Chuan Chang:
A Novel Alternative Hypothesis Characterization Using Kernel Classifiers for LLR-Based Speaker Verification. ISCSLP (Selected Papers) 2006: 506-517 - [c51]Yi-Ting Chen, Suhan Yu, Hsin-Min Wang, Berlin Chen:
Extractive Chinese Spoken Document Summarization Using Probabilistic Ranking Models. ISCSLP (Selected Papers) 2006: 660-671 - 2005
- [j19]Hsin-Min Wang, Berlin Chen, Jen-Wei Kuo, Shih-Sian Cheng:
MATBN: A Mandarin Chinese Broadcast News Corpus. Int. J. Comput. Linguistics Chin. Lang. Process. 10(2) (2005) - [j18]Chiu-yu Tseng, Shao-huang Pin, Yehlin Lee, Hsin-Min Wang, Yong-cheng Chen:
Fluent speech prosody: Framework and modeling. Speech Commun. 46(3-4): 284-309 (2005) - [c50]Hung-Ming Yu, Wei-Ho Tsai, Hsin-Min Wang:
A Query-by-Singing Technique for Retrieving Polyphonic Objects of Popular Music. AIRS 2005: 439-453 - [c49]Yi-Hsiang Chao, Hsin-Min Wang, Ruei-Chuan Chang:
GMM-Based Bhattacharyya Kernel Fisher Discriminant Analysis For Speaker Recognition. ICASSP (1) 2005: 649-652 - [c48]Wei-Ho Tsai, Shih-Sian Cheng, Yi-Hsiang Chao, Hsin-Min Wang:
Clustering Speech Utterances by Speaker Using Eigenvoice-Motivated Vector Space Models. ICASSP (1) 2005: 725-728 - [c47]Hsin-Min Wang, Wei-Ho Tsai, Hung-Ming Yu:
Prototype Systems for Retrieving Polyphonic Objects of Popular Music Based on Query-by-singing/example. ICDAT 2005: 265-266 - [c46]Hsin-Min Wang, Shih-Sian Cheng, Yong-cheng Chen:
SoVideo - A Mandarin Chinese Broadcast Retrieval System. ICDAT 2005: 267-268 - [c45]Hsien-Ting Cheng, Yi-Hsiang Chao, Shih-Liang Yeh, Chu-Song Chen, Hsin-Min Wang, Yi-Ping Hung:
An Efficient Approach to Multimodal Person Identity Verification by Fusing Face and Voice Information. ICME 2005: 542-545 - [c44]Wei-Ho Tsai, Hsin-Min Wang:
Speaker clustering of unknown utterances based on maximum purity estimation. INTERSPEECH 2005: 3069-3072 - [c43]Wei-Ho Tsai, Hung-Ming Yu, Hsin-Min Wang:
Query-By-Example Technique for Retrieving Cover Versions of Popular Songs with Similar Melodies. ISMIR 2005: 183-190 - [c42]Wei-Ho Tsai, Hsin-Min Wang:
On the extraction of vocal-related information to facilitate the management of popular music collections. JCDL 2005: 197-206 - 2004
- [j17]Wei-Ho Tsai, Dwight Rodgers, Hsin-Min Wang:
Blind Clustering of Popular Music Recordings Based on Singer Voice Characteristics. Comput. Music. J. 28(3): 68-78 (2004) - [j16]Helen M. Meng, Berlin Chen, Sanjeev Khudanpur, Gina-Anne Levow, Wai Kit Lo, Douglas W. Oard, Patrick Schone, Karen Tang, Hsin-Min Wang, Jianqiang Wang:
Mandarin-English Information (MEI): investigating translingual speech retrieval. Comput. Speech Lang. 18(2): 163-179 (2004) - [j15]Shih-Sian Cheng, Hsin-Min Wang, Hsin-Chia Fu:
A Model-Selection-Based Self-Splitting Gaussian Mixture Learning with Application to Speaker Identification. EURASIP J. Adv. Signal Process. 2004(17): 2626-2639 (2004) - [j14]Hsin-Min Wang, Shih-Sian Cheng, Yong-cheng Chen:
The SoVideo Mandarin Chinese Broadcast News Retrieval System. Int. J. Speech Technol. 7(2-3): 189-202 (2004) - [j13]Berlin Chen, Hsin-Min Wang, Lin-Shan Lee:
A discriminative HMM/N-gram-based retrieval approach for mandarin spoken documents. ACM Trans. Asian Lang. Inf. Process. 3(2): 128-145 (2004) - [c41]Wei-Ho Tsai, Hsin-Min Wang:
Automatic detection and tracking of target singer in multi-singer music recordings. ICASSP (4) 2004: 221-224 - [c40]Wei-Ho Tsai, Hsin-Min Wang:
A query-by-example framework to retrieve music documents by singer. ICME 2004: 1863-1866 - [c39]Hsin-Min Wang, Shih-Sian Cheng:
METRIC-SEQDAC: a hybrid approach for audio segmentation. INTERSPEECH 2004: 1617-1620 - [c38]Jen-Wei Kuo, Yao-Min Huang, Berlin Chen, Hsin-Min Wang:
Statistical Chinese spoken document retrieval using latent topical information. INTERSPEECH 2004: 1621-1624 - [c37]Wei-Ho Tsai, Shih-Sian Cheng, Hsin-Min Wang:
Speaker clustering of speech utterances using a voice characteristic reference space. INTERSPEECH 2004: 2937-2940 - [c36]Chih-Hsien Huang, Jen-Tzung Chien, Hsin-Min Wang:
A new eigenvoice approach to speaker adaptation. ISCSLP 2004: 109-112 - [c35]Shao-huang Pin, Yehlin Lee, Yong-cheng Chen, Hsin-Min Wang, Chiu-yu Tseng:
A Mandarin TTS system with an integrated prosodic model. ISCSLP 2004: 169-172 - [c34]Chuang-Hua Chueh, Jen-Tzung Chien, Hsin-Min Wang:
A maximum entropy approach for integrating semantic information in statistical language models. ISCSLP 2004: 309-312 - [c33]Wei-Ho Tsai, Hsin-Min Wang:
Towards Automatic Identification Of Singing Language In Popular Music Recordings. ISMIR 2004 - [c32]Yin-Cheng Chen, Tan-Hsu Tan, Hsin-Min Wang, Wei-Ho Tsai:
藍芽無線環境下中文語音辨識效能之評估與分析 (Performance Evaluation and Analysis of Mandarin Speech Recognition over Bluetooth Communication Environments) [In Chinese]. ROCLING 2004 - 2003
- [c31]Shih-Sian Cheng, Hsin-Min Wang:
A sequential metric-based audio segmentation method via the Bayesian information criterion. INTERSPEECH 2003: 945-948 - [c30]Wai Kit Lo, Yuk-Chi Li, Gina-Anne Levow, Hsin-Min Wang, Helen M. Meng:
Multi-scale document expansion in English-Mandarin cross-language spoken document retrieval. INTERSPEECH 2003: 2337-2340 - [c29]Wei-Ho Tsai, Hsin-Min Wang, Dwight Rodgers:
Automatic singer identification of popular music recordings via estimation and modeling of solo vocal signal. INTERSPEECH 2003: 2993-2996 - [c28]Wei-Ho Tsai, Hsin-Min Wang, Dwight Rodgers, Shih-Sian Cheng, Hung-Ming Yu:
Blind clustering of popular music recordings based on singer voice characteristics. ISMIR 2003 - 2002
- [j12]Bor-Shen Lin, Berlin Chen, Hsin-Min Wang, Lin-Shan Lee:
A hierarchical tag-graph search scheme with layered grammar rules for spontaneous speech understanding. Pattern Recognit. Lett. 23(7): 819-83 (2002) - [j11]Berlin Chen, Hsin-Min Wang, Lin-Shan Lee:
Discriminating capabilities of syllable-based features and approaches of utilizing them for voice retrieval of speech information in Mandarin Chinese. IEEE Trans. Speech Audio Process. 10(5): 303-314 (2002) - [c27]Mei-Fang Huang, Kuan-Ting Chen, Hsin-Min Wang:
Towards retrieval of video archives based on the speech content. ISCSLP 2002 - 2001
- [j10]Hsin-Min Wang, Berlin Chen:
Content-based Language Models for Spoken Document Retrieval. Int. J. Comput. Process. Orient. Lang. 14(2): 193-209 (2001) - [c26]Kuan-Ting Chen, Hsin-Min Wang:
Eigenspace-based maximum a posteriori linear regression for rapid speaker adaptation. ICASSP 2001: 317-320 - [c25]Hsin-Min Wang, Helen M. Meng, Patrick Schone, Berlin Chen, Wai-Kit Lo:
Multi-scale-audio indexing for translingual spoken document retrieval. ICASSP 2001: 605-608 - [c24]Berlin Chen, Hsin-Min Wang, Lin-Shan Lee:
Improved spoken document retrieval by exploring extra acoustic and linguistic cues. INTERSPEECH 2001: 299-302 - [c23]Berlin Chen, Hsin-Min Wang, Lin-Shan Lee:
An HMM/n-gram-based linguistic processing approach for Mandarin spoken document retrieval. INTERSPEECH 2001: 1045-1048 - [c22]Jeih-Weih Hung, Hsin-Min Wang, Lin-Shan Lee:
Comparative analysis for data-driven temporal filters obtained via principal component analysis (PCA) and linear discriminant analysis (LDA) in speech recognition. INTERSPEECH 2001: 1959-1962 - [c21]Helen M. Meng, Berlin Chen, Sanjeev Khudanpur, Gina-Anne Levow, Wai-Kit Lo, Douglas W. Oard, Patrick Schone, Karen Tang, Hsin-Min Wang, Jianqiang Wang:
Mandarin-English Information: Investigating Translingual Speech Retrieval. HLT 2001 - [c20]Hsin-Min Wang, Berlin Chen:
Comparison of Word and Subword Indexing Techniques for Mandarin Chinese Spoken Document Retrieval. IEEE Pacific Rim Conference on Multimedia 2001: 606-613 - 2000
- [j9]Hsin-Min Wang, Yu-Hsueh Chou, Berlin Chen:
Browsing the Chinese Web Pages Using Mandarin Speech. Int. J. Comput. Process. Orient. Lang. 13(1): 35-51 (2000) - [j8]Bo-Ren Bai, Berlin Chen, Hsin-Min Wang:
Syllable-Based Chinese Text/Spoken Document Retrieval Using Text/Speech Queries. Int. J. Pattern Recognit. Artif. Intell. 14(5): 603-616 (2000) - [j7]Lee-Feng Chien, Hsin-Min Wang, Bo-Ren Bai, Sun-Chien Lin:
A spoken-access approach for chinese text and speech information retrieval. J. Am. Soc. Inf. Sci. 51(4): 313-323 (2000) - [j6]Hsin-Min Wang:
Mandarin spoken document retrieval based on syllable lattice matching. Pattern Recognit. Lett. 21(6-7): 615-624 (2000) - [j5]Hsin-Min Wang:
Experiments in syllable-based retrieval of broadcast news speech in Mandarin Chinese. Speech Commun. 32(1-2): 49-60 (2000) - [c19]Berlin Chen, Hsin-Min Wang, Lin-Shan Lee:
Retrieval of broadcast news speech in Mandarin Chinese collected in Taiwan using syllable-level statistical characteristics. ICASSP 2000: 1771-1774 - [c18]Jeih-Weih Hung, Hsin-Min Wang, Lin-Shan Lee:
Automatic metric-based speech segmentation for broadcast news via principal component analysis. INTERSPEECH 2000: 121-124 - [c17]Berlin Chen, Hsin-Min Wang, Lin-Shan Lee:
Retrieval of mandarin broadcast news using spoken queries. INTERSPEECH 2000: 520-523 - [c16]Kuan-Ting Chen, Wen-Wei Liau, Hsin-Min Wang, Lin-Shan Lee:
Fast speaker adaptation using eigenspace-based maximum likelihood linear regression. INTERSPEECH 2000: 742-745 - [c15]Hsin-Min Wang, Berlin Chen:
Content-based language models for spoken document retrieval. IRAL 2000: 149-155 - [c14]Wei-Ping Hsieh, Berlin Chen, Kuan-Ting Chen, Hsin-Min Wang:
Initial Experiments On Recognition of Internet-Accessible Compressed Mandarin Speech. ISCSLP 2000
1990 – 1999
- 1999
- [j4]Jia-Lin Shen, Hsin-Min Wang, Ren-Yuan Lyu, Lin-Shan Lee:
Automatic selection of phonetically distributed sentence sets for speaker adaptation with application to large vocabulary Mandarin speech recognition. Comput. Speech Lang. 13(1): 79-97 (1999) - [c13]Bor-Shen Lin, Hsin-Min Wang, Lin-Shan Lee:
Consistent dialogue across concurrent topics based on an expert system model. EUROSPEECH 1999: 1427-1430 - [c12]Hsin-Min Wang:
A New Syllable-based Approach for Retrieving Mandarin Spoken Documents Using Short Speech Queries. ROCLING (1) 1999: 187-202 - 1998
- [j3]Hsin-Min Wang:
Statistical Analysis of Mandarin Acoustic Units and Automatic Extraction of Phonetically Rich Sentences Based Upon a very Large Chinese Text Corpus. Int. J. Comput. Linguistics Chin. Lang. Process. 3(2) (1998) - [c11]Berlin Chen, Hsin-Min Wang, Lee-Feng Chien, Lin-Shan Lee:
A*-admissible key-phrase spotting with sub-syllable level utterance verification. ICSLP 1998 - [c10]Bor-Shen Lin, Berlin Chen, Hsin-Min Wang, Lin-Shan Lee:
Hierarchical tag-graph search for spontaneous speech understanding in spoken dialog systems. ICSLP 1998 - [c9]Hsin-Min Wang, Bor-Shen Lin, Berlin Chen, Bo-Ren Bai:
Towards a Mandarin voice memo system. ICSLP 1998 - [c8]Bo-Ren Bai, Berlin Chen, Hsin-Min Wang, Lee-Feng Chien, Lin-Shan Lee:
Large-Vocabulary Chinese Text/Speech Information Retrieval Using Mandarin Speech Queries. ISCSLP 1998 - 1997
- [j2]Hsin-Min Wang, Tai-Hsuan Ho, Rung-Chiung Yang, Jia-Lin Shen, Bo-Ren Bai, Jenn-Chau Hong, Wei-Peng Chen, Tong-Lo Yu, Lin-Shan Lee:
Complete recognition of continuous Mandarin speech for Chinese language with very large vocabulary using limited training data. IEEE Trans. Speech Audio Process. 5(2): 195-200 (1997) - [c7]Lee-Feng Chien, Sung-Chien Lin, Jenn-Chau Hong, Ming-Chiuan Chen, Hsin-Min Wang, Jia-Lin Shen, Keh-Jiann Chen, Lin-Shan Lee:
Internet Chinese information retrieval using unconstrained Mandarin speech queries based on a client-server architecture and a PAT-tree-based language model. ICASSP 1997: 1155-1158 - 1996
- [j1]Chih-Heng Lin, Chien-Hsing Wu, Pei-Yih Ting, Hsin-Min Wang:
Frameworks for recognition of Mandarin syllables with tones using sub-syllabic units. Speech Commun. 18(2): 175-190 (1996) - 1995
- [c6]Hsin-Min Wang, Jia-Lin Shen, Yen-Ju Yang, Chiu-yu Tseng, Lin-Shan Lee:
Complete recognition of continuous Mandarin speech for Chinese language with very large vocabulary but limited training data. ICASSP 1995: 61-64 - [c5]Tai-Hsuan Ho, Hsin-Min Wang, Lee-Feng Chien, Keh-Jiann Chen, Lin-Shan Lee:
Fast and accurate continuous speech recognition for Chinese language with very large vocabulary. EUROSPEECH 1995: 211-214 - 1994
- [c4]Jia-Lin Shen, Hsin-Min Wang, Bo-Ren Bai, Lin-Shan Lee:
An initial study on a segmental probability model approach to large-vocabulary continuous Mandarin speech recognition. ICASSP (2) 1994: 133-136 - [c3]Jia-Lin Shen, Hsin-Min Wang, Ren-Yuan Lyu, Lin-Shan Lee:
Incremental speaker adaptation using phonetically balanced training sentences for Mandarin syllable recognition based on segmental probability models. ICSLP 1994: 443-446 - 1993
- [c2]Lin-Shan Lee, Chiu-yu Tseng, Keh-Jiann Chen, I-Jung Hung, Ming-Yu Lee, Lee-Feng Chien, Yumin Lee, Ren-Yuan Lyu, Hsin-Min Wang, Yung-Chuan Wu, Tung-Sheng Lin, Hung-Yan Gu, Chi-ping Nee, Chun-Yi Liao, Yeng-Ju Yang, Yuan-Cheng Chang, Rung-Chiung Yang:
Golden Mandarin (II)-an improved single-chip real-time Mandarin dictation machine for Chinese language with very large vocabulary. ICASSP (2) 1993: 503-506 - [c1]Hsin-Min Wang, Yuan-Cheng Chang, Lin-Shan Lee:
從中文語料庫中自動選取連續國語語音特性平衡句的方法 (Automatic Selection of Phonetically Rich Sentences from A Chinese Text Corpus) [In Chinese]. ROCLING 1993: 195-206
last updated on 2024-10-22 20:15 CEST by the dblp team
all metadata released as open data under CC0 1.0 license