default search action
Yuzhe Liang
Person information
Refine list
refinements active!
zoomed in on ?? of ?? records
view refined list in
export refined list as
2020 – today
- 2026
- [i12]Yushen Chen, Junzhe Liu, Yujie Tu, Zhikang Niu, Yuzhe Liang, Kai Yu, Chunyu Qiang, Chen Zhang, Xie Chen:
Habibi: Laying the Open-Source Foundation of Unified-Dialectal Arabic Speech Synthesis. CoRR abs/2601.13802 (2026) - [i11]Nolan Chan, Timmy Gang, Yongqian Wang, Yuzhe Liang, Dingdong Wang:
V2A-DPO: Omni-Preference Optimization for Video-to-Audio Generation. CoRR abs/2603.11089 (2026) - 2025
- [c6]Wenxi Chen, Ziyang Ma, Ruiqi Yan, Yuzhe Liang, Xiquan Li, Ruiyang Xu, Zhikang Niu, Yanqiao Zhu, Yifan Yang, Zhanxun Liu, Kai Yu, Yuxuan Hu, Jinyu Li, Yan Lu, Shujie Liu, Xie Chen:
SLAM-Omni: Timbre-Controllable Voice Interaction System with Single-Stage Training. ACL (Findings) 2025: 2262-2282 - [c5]Wenxi Chen, Ziyang Ma, Xiquan Li, Xuenan Xu, Yuzhe Liang, Zhisheng Zheng, Kai Yu, Xie Chen:
SLAM-AAC: Enhancing Audio Captioning with Paraphrasing Augmentation and CLAP-Refine through LLMs. ICASSP 2025: 1-5 - [c4]Xiquan Li, Wenxi Chen, Ziyang Ma, Xuenan Xu, Yuzhe Liang, Zhisheng Zheng, Qiuqiang Kong, Xie Chen:
DRCap: Decoding CLAP Latents with Retrieval-Augmented Generation for Zero-shot Audio Captioning. ICASSP 2025: 1-5 - [i10]Ziyang Ma, Yinghao Ma, Yanqiao Zhu, Chen Yang, Yi-Wen Chao, Ruiyang Xu, Wenxi Chen, Yuanzhe Chen, Zhuo Chen, Jian Cong, Kai Li, Keliang Li, Siyou Li, Xinfeng Li, Xiquan Li, Zheng Lian, Yuzhe Liang, Minghao Liu, Zhikang Niu, Tianrui Wang, Yuping Wang, Yuxuan Wang, Yihao Wu, Guanrou Yang, Jianwei Yu, Ruibin Yuan, Zhisheng Zheng, Ziya Zhou, Haina Zhu, Wei Xue, Emmanouil Benetos, Kai Yu, Chng Eng Siong, Xie Chen:
MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix. CoRR abs/2505.13032 (2025) - [i9]Jun Wang, Xijuan Zeng, Chunyu Qiang, Ruilong Chen, Shiyao Wang, Le Wang, Wangjing Zhou, Pengfei Cai, Jiahui Zhao, Nan Li, Zihan Li, Yuzhe Liang, Xiaopeng Wang, Haorui Zheng, Ming Wen, Kang Yin, Yiran Wang, Nan Li, Feng Deng, Liang Dong, Chen Zhang, Di Zhang, Kun Gai:
Kling-Foley: Multimodal Diffusion Transformer for High-Quality Video-to-Audio Generation. CoRR abs/2506.19774 (2025) - [i8]Xiquan Li, Junxi Liu, Yuzhe Liang, Zhikang Niu, Wenxi Chen, Xie Chen:
MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows. CoRR abs/2508.06098 (2025) - [i7]Wenxi Chen, Xinsheng Wang, Ruiqi Yan, Yushen Chen, Zhikang Niu, Ziyang Ma, Xiquan Li, Yuzhe Liang, Hanlin Wen, Shunshun Yin, Ming Tao, Xie Chen:
SAC: Neural Speech Codec with Semantic-Acoustic Dual-Stream Quantization. CoRR abs/2510.16841 (2025) - [i6]Chunyu Qiang, Kang Yin, Xiaopeng Wang, Yuzhe Liang, Jiahui Zhao, Ruibo Fu, Tianrui Wang, Cheng Gong, Chen Zhang, Longbiao Wang, Jianwu Dang:
InstructAudio: Unified speech and music generation with natural language instruction. CoRR abs/2511.18487 (2025) - [i5]Xiaopeng Wang, Chunyu Qiang, Ruibo Fu, Zhengqi Wen, Xuefei Liu, Yukun Liu, Yuzhe Liang, Kang Yin, Yuankun Xie, Heng Xie, Chenxing Li, Chen Zhang, Changsheng Li:
M3-TTS: Multi-modal DiT Alignment & Mel-latent for Zero-shot High-fidelity Speech Synthesis. CoRR abs/2512.04720 (2025) - [i4]Kang Yin, Chunyu Qiang, Sirui Zhao, Xiaopeng Wang, Yuzhe Liang, Pengfei Cai, Tong Xu
, Chen Zhang, Enhong Chen:
DMP-TTS: Disentangled multi-modal Prompting for Controllable Text-to-Speech with Chained Guidance. CoRR abs/2512.09504 (2025) - 2024
- [j1]Yuzhe Liang:
Application of multi-scale dynamic enhancement based on deep neural network and CT urinary tract secretory phase image fusion in the diagnosis of urinary system diseases. BMC Bioinform. 25(1) (2024) - [c3]Wen Huang
, Anbai Jiang, Bing Han, Xinhu Zheng, Yihong Qiu, Wenxi Chen, Yuzhe Liang, Pingyi Fan
, Wei-Qiang Zhang, Cheng Lu, Xie Chen, Jia Liu, Yanmin Qian:
Semi-Supervised Acoustic Scene Classification with Test-Time Adaptation. ICME Workshops 2024: 1-5 - [c2]Yuzhe Liang, Wenxi Chen, Anbai Jiang, Yihong Qiu, Xinhu Zheng, Wen Huang
, Bing Han, Yanmin Qian, Pingyi Fan
, Wei-Qiang Zhang, L. Cheng, Jia Liu, Xie Chen:
Improving Acoustic Scene Classification via Self-Supervised and Semi-Supervised Learning with Efficient Audio Transformer. ICME Workshops 2024: 1-6 - [c1]Wenxi Chen, Yuzhe Liang, Ziyang Ma, Zhisheng Zheng, Xie Chen:
EAT: Self-Supervised Pre-Training with Efficient Audio Transformer. IJCAI 2024: 3807-3815 - [i3]Wenxi Chen, Yuzhe Liang, Ziyang Ma, Zhisheng Zheng, Xie Chen:
EAT: Self-Supervised Pre-Training with Efficient Audio Transformer. CoRR abs/2401.03497 (2024) - [i2]Xiquan Li, Wenxi Chen, Ziyang Ma, Xuenan Xu, Yuzhe Liang, Zhisheng Zheng, Qiuqiang Kong, Xie Chen:
DRCap: Decoding CLAP Latents with Retrieval-augmented Generation for Zero-shot Audio Captioning. CoRR abs/2410.09472 (2024) - [i1]Wenxi Chen, Ziyang Ma, Xiquan Li, Xuenan Xu, Yuzhe Liang, Zhisheng Zheng, Kai Yu, Xie Chen:
SLAM-AAC: Enhancing Audio Captioning with Paraphrasing Augmentation and CLAP-Refine through LLMs. CoRR abs/2410.09503 (2024)
Coauthor Index
manage site settings
To protect your privacy, all features that rely on external API calls from your browser are turned off by default. You need to opt-in for them to become active. All settings here will be stored as cookies with your web browser. For more information see our F.A.Q.
Unpaywalled article links
Add open access links from to the list of external document links (if available).
Privacy notice: By enabling the option above, your browser will contact the API of unpaywall.org to load hyperlinks to open access articles. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Unpaywall privacy policy.
Archived links via Wayback Machine
For web page which are no longer available, try to retrieve content from the of the Internet Archive (if available).
Privacy notice: By enabling the option above, your browser will contact the API of archive.org to check for archived content of web pages that are no longer available. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Internet Archive privacy policy.
Reference lists
Add a list of references from ,
, and
to record detail pages.
load references from crossref.org and opencitations.net
Privacy notice: By enabling the option above, your browser will contact the APIs of crossref.org, opencitations.net, and semanticscholar.org to load article reference information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Crossref privacy policy and the OpenCitations privacy policy, as well as the AI2 Privacy Policy covering Semantic Scholar.
Citation data
Add a list of citing articles from and
to record detail pages.
load citations from opencitations.net
Privacy notice: By enabling the option above, your browser will contact the API of opencitations.net and semanticscholar.org to load citation information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the OpenCitations privacy policy as well as the AI2 Privacy Policy covering Semantic Scholar.
OpenAlex data
Load additional information about publications from .
Privacy notice: By enabling the option above, your browser will contact the API of openalex.org to load additional information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the information given by OpenAlex.
last updated on 2026-04-11 00:28 CEST by the dblp team
all metadata released as open data under CC0 1.0 license
see also: Terms of Use | Privacy Policy | Imprint