Hao Tang 0002
Person information
- affiliation: University of Edinburgh, School of Informatics, UK
- affiliation: Massachusetts Institute of Technology, Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, USA
- affiliation (PhD 2017): Toyota Technological Institute at Chicago, IL, USA
- affiliation (former): National Taiwan University, Taipei, Taiwan
Other persons with the same name
- Hao Tang — disambiguation page
- Hao Tang 0001 — Hewlett-Packard Labs., Multimedia Interaction & Understanding Laboratory, Palo Alto, CA, USA (and 3 more)
- Hao Tang 0003 — Tsinghua University, Department of Philosophy, Beijing, China (and 4 more)
- Hao Tang 0004 — Hainan University, School of Information and Communication Engineering, Haikou, China (and 1 more)
- Hao Tang 0005 — ETH Zurich, Switzerland (and 2 more)
- Hao Tang 0006 — Central South University, Second Xiangya Hospital, Changsha, China
- Hao Tang 0007 — Nanjing University of Science and Technology, China
- Hao Tang 0008 — Cornell University, Ithaca, NY, USA (and 3 more)
- Hao Tang 0009 — Hefei University of Technology, School of Electrical Engineering and Automation, China
- Hao Tang 0010 — University of California, Irvine, Department of Computer Science, CA, USA
- Hao Tang 0011 — City University of New York, CUNY, City College, Department of Computer Science, Borough of Manhattan Community College, NY, USA
- Hao Tang 0012 — Wuhan University, Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, China
2020 – today
- 2024
- [i30] Tzu-Quan Lin, Hung-yi Lee, Hao Tang: DAISY: Data Adaptive Self-Supervised Early Exit for Speech Representation Models. CoRR abs/2406.05464 (2024)
- [i29] Tzu-Quan Lin, Guan-Ting Lin, Hung-yi Lee, Hao Tang: Property Neurons in Self-Supervised Speech Transformers. CoRR abs/2409.05910 (2024)
- [i28] Sung-Lin Yeh, Hao Tang: Estimating the Completeness of Discrete Speech Units. CoRR abs/2409.06109 (2024)
- [i27] Gene-Ping Yang, Hao Tang: A Simple HMM with Self-Supervised Representations for Phone Segmentation. CoRR abs/2409.09646 (2024)
- 2023
- [j6] Siqi Sun, Korin Richmond, Hao Tang: Improving Seq2Seq TTS Frontends With Transcribed Speech Audio. IEEE ACM Trans. Audio Speech Lang. Process. 31: 1940-1952 (2023)
- [c29] Tzu-Quan Lin, Hung-yi Lee, Hao Tang: MelHuBERT: A Simplified Hubert on Mel Spectrograms. ASRU 2023: 1-8
- [c28] Gene-Ping Yang, Hao Tang: Towards Matching Phones and Speech Representations. ASRU 2023: 1-8
- [c27] Sung-Lin Yeh, Hao Tang: Learning Dependencies of Discrete Speech Representations with Neural Hidden Markov Models. ICASSP 2023: 1-5
- [c26] Chin-Yun Yu, Sung-Lin Yeh, György Fazekas, Hao Tang: Conditioning and Sampling in Variational Diffusion Models for Speech Super-Resolution. ICASSP 2023: 1-5
- [i26] Gene-Ping Yang, Hao Tang: Towards Matching Phones and Speech Representations. CoRR abs/2310.17558 (2023)
- 2022
- [j5] Gene-Ping Yang, Sung-Lin Yeh, Yu-An Chung, James R. Glass, Hao Tang: Autoregressive Predictive Coding: A Comprehensive Study. IEEE J. Sel. Top. Signal Process. 16(6): 1380-1390 (2022)
- [c25] Gene-Ping Yang, Hao Tang: Supervised Attention in Sequence-to-Sequence Models for Speech Recognition. ICASSP 2022: 7222-7226
- [c24] Dan Wells, Hao Tang, Korin Richmond: Phonetic Analysis of Self-supervised Representations of English Speech. INTERSPEECH 2022: 3583-3587
- [c23] Sung-Lin Yeh, Hao Tang: Autoregressive Co-Training for Learning Discrete Speech Representation. INTERSPEECH 2022: 5000-5004
- [c22] Yen Meng, Hsuan-Jui Chen, Jiatong Shi, Shinji Watanabe, Paola García, Hung-yi Lee, Hao Tang: On Compressing Sequences for Self-Supervised Speech Models. SLT 2022: 1128-1135
- [i25] Sung-Lin Yeh, Hao Tang: Autoregressive Co-Training for Learning Discrete Speech Representations. CoRR abs/2203.15840 (2022)
- [i24] Gene-Ping Yang, Hao Tang: Supervised Attention in Sequence-to-Sequence Models for Speech Recognition. CoRR abs/2204.12308 (2022)
- [i23] Yen Meng, Hsuan-Jui Chen, Jiatong Shi, Shinji Watanabe, Paola García, Hung-yi Lee, Hao Tang: On Compressing Sequences for Self-Supervised Speech Models. CoRR abs/2210.07189 (2022)
- [i22] Chin-Yun Yu, Sung-Lin Yeh, György Fazekas, Hao Tang: Conditioning and Sampling in Variational Diffusion Models for Speech Super-resolution. CoRR abs/2210.15793 (2022)
- [i21] Sung-Lin Yeh, Hao Tang: Learning Dependencies of Discrete Speech Representations with Neural Hidden Markov Models. CoRR abs/2210.16659 (2022)
- [i20] Tzu-Quan Lin, Hung-yi Lee, Hao Tang: MelHuBERT: A simplified HuBERT on Mel spectrogram. CoRR abs/2211.09944 (2022)
- [i19] Tzu-Quan Lin, Tsung-Huan Yang, Chun-Yao Chang, Kuang-Ming Chen, Tzu-hsun Feng, Hung-yi Lee, Hao Tang: Compressing Transformer-based self-supervised models for speech processing. CoRR abs/2211.09949 (2022)
- 2020
- [c21] François Grondin, Hao Tang, James R. Glass: Audio-Visual Calibration with Polynomial Regression for 2-D Projection Using SVD-PHAT. ICASSP 2020: 4856-4860
- [c20] Yu-An Chung, Hao Tang, James R. Glass: Vector-Quantized Autoregressive Predictive Coding. INTERSPEECH 2020: 3760-3764
- [i18] François Grondin, Hao Tang, James R. Glass: Audio-Visual Calibration with Polynomial Regression for 2-D Projection Using SVD-PHAT. CoRR abs/2002.01440 (2020)
- [i17] Yu-An Chung, Hao Tang, James R. Glass: Vector-Quantized Autoregressive Predictive Coding. CoRR abs/2005.08392 (2020)
2010 – 2019
- 2019
- [j4] Achintya Kumar Sarkar, Zheng-Hua Tan, Hao Tang, Suwon Shon, James R. Glass: Time-Contrastive Learning Based Deep Bottleneck Features for Text-Dependent Speaker Verification. IEEE ACM Trans. Audio Speech Lang. Process. 27(8): 1267-1279 (2019)
- [c19] Yu-An Chung, Wei-Ning Hsu, Hao Tang, James R. Glass: An Unsupervised Autoregressive Model for Speech Representation Learning. INTERSPEECH 2019: 146-150
- [c18] Logan Ford, Hao Tang, François Grondin, James R. Glass: A Deep Residual Network for Large-Scale Acoustic Scene Analysis. INTERSPEECH 2019: 2568-2572
- [c17] Suwon Shon, Hao Tang, James R. Glass: VoiceID Loss: Speech Enhancement for Speaker Verification. INTERSPEECH 2019: 2888-2892
- [i16] Yu-An Chung, Wei-Ning Hsu, Hao Tang, James R. Glass: An Unsupervised Autoregressive Model for Speech Representation Learning. CoRR abs/1904.03240 (2019)
- [i15] Suwon Shon, Hao Tang, James R. Glass: VoiceID Loss: Speech Enhancement for Speaker Verification. CoRR abs/1904.03601 (2019)
- [i14] Achintya Kumar Sarkar, Zheng-Hua Tan, Hao Tang, Suwon Shon, James R. Glass: Time-Contrastive Learning Based Deep Bottleneck Features for Text-Dependent Speaker Verification. CoRR abs/1905.04554 (2019)
- 2018
- [c16] Wei-Ning Hsu, Hao Tang, James R. Glass: Unsupervised Adaptation with Interpretable Disentangled Representations for Distant Conversational Speech Recognition. INTERSPEECH 2018: 1576-1580
- [c15] Hao Tang, Wei-Ning Hsu, François Grondin, James R. Glass: A Study of Enhancement, Augmentation and Autoencoder Methods for Domain Adaptation in Distant Speech Recognition. INTERSPEECH 2018: 2928-2932
- [c14] Hao Tang, James R. Glass: On Training Recurrent Networks with Truncated Backpropagation Through time in Speech Recognition. SLT 2018: 48-55
- [c13] Suwon Shon, Hao Tang, James R. Glass: Frame-Level Speaker Embeddings for Text-Independent Speaker Recognition and Analysis of End-to-End Model. SLT 2018: 1007-1013
- [i13] Hao Tang, Wei-Ning Hsu, François Grondin, James R. Glass: A Study of Enhancement, Augmentation, and Autoencoder Methods for Domain Adaptation in Distant Speech Recognition. CoRR abs/1806.04841 (2018)
- [i12] Wei-Ning Hsu, Hao Tang, James R. Glass: Unsupervised Adaptation with Interpretable Disentangled Representations for Distant Conversational Speech Recognition. CoRR abs/1806.04872 (2018)
- [i11] Hao Tang, James R. Glass: On Training Recurrent Networks with Truncated Backpropagation Through Time in Speech Recognition. CoRR abs/1807.03396 (2018)
- [i10] Suwon Shon, Hao Tang, James R. Glass: Frame-level speaker embeddings for text-independent speaker recognition and analysis of end-to-end model. CoRR abs/1809.04437 (2018)
- [i9] Hao Tang, James R. Glass: On The Inductive Bias of Words in Acoustics-to-Word Models. CoRR abs/1810.13407 (2018)
- 2017
- [j3] Taehwan Kim, Jonathan Keane, Weiran Wang, Hao Tang, Jason Riggle, Gregory Shakhnarovich, Diane Brentari, Karen Livescu: Lexicon-free fingerspelling recognition from video: Data, models, and signer adaptation. Comput. Speech Lang. 46: 209-232 (2017)
- [j2] Hao Tang, Liang Lu, Lingpeng Kong, Kevin Gimpel, Karen Livescu, Chris Dyer, Noah A. Smith, Steve Renals: End-to-End Neural Segmental Models for Speech Recognition. IEEE J. Sel. Top. Signal Process. 11(8): 1254-1264 (2017)
- [j1] Mark A. Hasegawa-Johnson, Preethi Jyothi, Daniel McCloy, Majid Mirbagheri, Giovanni M. Di Liberto, Amit Das, Bradley Ekin, Chunxi Liu, Vimal Manohar, Hao Tang, Edmund C. Lalor, Nancy F. Chen, Paul Hager, Tyler Kekona, Rose Sloan, Adrian K. C. Lee: ASR for Under-Resourced Languages From Probabilistic Transcription. IEEE ACM Trans. Audio Speech Lang. Process. 25(1): 46-59 (2017)
- [c12] Shubham Toshniwal, Hao Tang, Liang Lu, Karen Livescu: Multitask Learning with Low-Level Auxiliary Tasks for Encoder-Decoder Based Speech Recognition. INTERSPEECH 2017: 3532-3536
- [i8] Shubham Toshniwal, Hao Tang, Liang Lu, Karen Livescu: Multitask Learning with Low-Level Auxiliary Tasks for Encoder-Decoder Based Speech Recognition. CoRR abs/1704.01631 (2017)
- [i7] Hao Tang, Liang Lu, Lingpeng Kong, Kevin Gimpel, Karen Livescu, Chris Dyer, Noah A. Smith, Steve Renals: End-to-End Neural Segmental Models for Speech Recognition. CoRR abs/1708.00531 (2017)
- [i6] Hao Tang: Sequence Prediction with Neural Segmental Models. CoRR abs/1709.01572 (2017)
- 2016
- [c11] Chunxi Liu, Preethi Jyothi, Hao Tang, Vimal Manohar, Rose Sloan, Tyler Kekona, Mark Hasegawa-Johnson, Sanjeev Khudanpur: Adapting ASR for under-resourced languages using mismatched transcriptions. ICASSP 2016: 5840-5844
- [c10] Taehwan Kim, Weiran Wang, Hao Tang, Karen Livescu: Signer-independent fingerspelling recognition with deep neural network adaptation. ICASSP 2016: 6160-6164
- [c9] Hao Tang, Weiran Wang, Kevin Gimpel, Karen Livescu: Efficient Segmental Cascades for Speech Recognition. INTERSPEECH 2016: 1903-1907
- [c8] Weiran Wang, Hao Tang, Karen Livescu: Triphone State-Tying via Deep Canonical Correlation Analysis. INTERSPEECH 2016: 3444-3448
- [c7] Hao Tang, Weiran Wang, Kevin Gimpel, Karen Livescu: End-to-end training approaches for discriminative segmental models. SLT 2016: 496-502
- [i5] Taehwan Kim, Weiran Wang, Hao Tang, Karen Livescu: Signer-independent Fingerspelling Recognition with Deep Neural Network Adaptation. CoRR abs/1602.04278 (2016)
- [i4] Hao Tang, Weiran Wang, Kevin Gimpel, Karen Livescu: Efficient Segmental Cascades for Speech Recognition. CoRR abs/1608.00929 (2016)
- [i3] Taehwan Kim, Jonathan Keane, Weiran Wang, Hao Tang, Jason Riggle, Gregory Shakhnarovich, Diane Brentari, Karen Livescu: Lexicon-Free Fingerspelling Recognition from Video: Data, Models, and Signer Adaptation. CoRR abs/1609.07876 (2016)
- [i2] Hao Tang, Weiran Wang, Kevin Gimpel, Karen Livescu: End-to-End Training Approaches for Discriminative Segmental Models. CoRR abs/1610.06700 (2016)
- 2015
- [c6] Hao Tang, Weiran Wang, Kevin Gimpel, Karen Livescu: Discriminative segmental cascades for feature-rich phone recognition. ASRU 2015: 561-568
- [i1] Hao Tang, Weiran Wang, Kevin Gimpel, Karen Livescu: Discriminative Segmental Cascades for Feature-Rich Phone Recognition. CoRR abs/1507.06073 (2015)
- 2014
- [c5] Hao Tang, Shinji Watanabe, Tim K. Marks, John R. Hershey: Log-linear dialog manager. ICASSP 2014: 4092-4096
- [c4] Hao Tang, Kevin Gimpel, Karen Livescu: A comparison of training approaches for discriminative segmental models. INTERSPEECH 2014: 1219-1223
- 2012
- [c3] Hao Tang, Joseph Keshet, Karen Livescu: Discriminative Pronunciation Modeling: A Large-Margin, Feature-Rich Approach. ACL (1) 2012: 194-203
- 2010
- [c2] Hao Tang, Chao-Hong Meng, Lin-Shan Lee: An initial attempt for phoneme recognition using Structured Support Vector Machine (SVM). ICASSP 2010: 4926-4929
2000 – 2009
- 2009
- [c1] Hung-yi Lee, Yueh-Lien Tang, Hao Tang, Lin-Shan Lee: Spoken term detection from bilingual spontaneous speech using code-switched lattice-based structures for words and subword units. ASRU 2009: 410-415
last updated on 2024-10-28 21:16 CET by the dblp team
all metadata released as open data under CC0 1.0 license