Zexu Pan
2020 – today
- 2024
  - [j7] Wupeng Wang, Zexu Pan, Xinke Li, Shuai Wang, Haizhou Li:
    Speech Separation With Pretrained Frontend to Minimize Domain Mismatch. IEEE ACM Trans. Audio Speech Lang. Process. 32: 4184-4198 (2024)
  - [j6] Zexu Pan, Marvin Borsdorf, Siqi Cai, Tanja Schultz, Haizhou Li:
    NeuroHeed: Neuro-Steered Speaker Extraction Using EEG Signals. IEEE ACM Trans. Audio Speech Lang. Process. 32: 4456-4470 (2024)
  - [j5] Shuo Zhang, Zexu Pan, Yichang Lv, Youfang Lin:
    Hierarchical Edge Refinement Network for Guided Depth Map Super-Resolution. IEEE Trans. Computational Imaging 10: 469-478 (2024)
  - [c20] Jiadong Wang, Zexu Pan, Malu Zhang, Robby T. Tan, Haizhou Li:
    Restoring Speaking Lips from Occlusion for Audio-Visual Speech Recognition. AAAI 2024: 19144-19152
  - [c19] Zexu Pan, Gordon Wichern, François G. Germain, Aswin Shanmugam Subramanian, Jonathan Le Roux:
    Late Audio-Visual Fusion for in-the-Wild Speaker Diarization. ICASSP Workshops 2024: 174-178
  - [c18] Yoshiki Masuyama, Gordon Wichern, François G. Germain, Zexu Pan, Sameer Khurana, Chiori Hori, Jonathan Le Roux:
    NIIRF: Neural IIR Filter Field for HRTF Upsampling and Personalization. ICASSP 2024: 1016-1020
  - [c17] Dimitrios Bralios, Gordon Wichern, François G. Germain, Zexu Pan, Sameer Khurana, Chiori Hori, Jonathan Le Roux:
    Generation or Replication: Auscultating Audio Latent Diffusion Models. ICASSP 2024: 1156-1160
  - [c16] Xinyuan Qian, Zexu Pan, Qiquan Zhang, Kainan Chen, Shoufeng Lin:
    GLMB 3D Speaker Tracking with Video-Assisted Multi-Channel Audio Optimization Functions. ICASSP 2024: 8100-8104
  - [c15] Yu Chen, Xinyuan Qian, Zexu Pan, Kainan Chen, Haizhou Li:
    LOCSELECT: Target Speaker Localization with an Auditory Selective Hearing Mechanism. ICASSP 2024: 8696-8700
  - [c14] Junjie Li, Ruijie Tao, Zexu Pan, Meng Ge, Shuai Wang, Haizhou Li:
    Audio-Visual Active Speaker Extraction for Sparsely Overlapped Multi-Talker Speech. ICASSP 2024: 10666-10670
  - [c13] Zexu Pan, Gordon Wichern, François G. Germain, Sameer Khurana, Jonathan Le Roux:
    NeuroHeed+: Improving Neuro-Steered Speaker Extraction with Joint Auditory Attention Detection. ICASSP 2024: 11456-11460
  - [c12] Kohei Saijo, Gordon Wichern, François G. Germain, Zexu Pan, Jonathan Le Roux:
    TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement. IWAENC 2024: 205-209
  - [i21] Yoshiki Masuyama, Gordon Wichern, François G. Germain, Zexu Pan, Sameer Khurana, Chiori Hori, Jonathan Le Roux:
    NIIRF: Neural IIR Filter Field for HRTF Upsampling and Personalization. CoRR abs/2402.17907 (2024)
  - [i20] Kohei Saijo, Gordon Wichern, François G. Germain, Zexu Pan, Jonathan Le Roux:
    Enhanced Reverberation as Supervision for Unsupervised Speech Separation. CoRR abs/2408.03438 (2024)
  - [i19] Kohei Saijo, Gordon Wichern, François G. Germain, Zexu Pan, Jonathan Le Roux:
    TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement. CoRR abs/2408.03440 (2024)
  - [i18] Kun Zhou, You Zhang, Shengkui Zhao, Hao Wang, Zexu Pan, Dianwen Ng, Chong Zhang, Chongjia Ni, Yukun Ma, Trung Hieu Nguyen, Jia Qi Yip, Bin Ma:
    Emotional Dimension Control in Language Model-Based Text-to-Speech: Spanning a Broad Spectrum of Human Emotions. CoRR abs/2409.16681 (2024)
- 2023
  - [j4] Tingting Wang, Zexu Pan, Meng Ge, Zhen Yang, Haizhou Li:
    Time-Domain Speech Separation Networks With Graph Encoding Auxiliary. IEEE Signal Process. Lett. 30: 110-114 (2023)
  - [c11] Zexu Pan, Gordon Wichern, Yoshiki Masuyama, François G. Germain, Sameer Khurana, Chiori Hori, Jonathan Le Roux:
    Scenario-Aware Audio-Visual TF-Gridnet for Target Speech Extraction. ASRU 2023: 1-8
  - [c10] Zexu Pan, Wupeng Wang, Marvin Borsdorf, Haizhou Li:
    ImagineNet: Target Speaker Extraction with Intermittent Visual Cue Through Embedding Inpainting. ICASSP 2023: 1-5
  - [c9] Yidi Jiang, Ruijie Tao, Zexu Pan, Haizhou Li:
    Target Active Speaker Detection with Audio-visual Cues. INTERSPEECH 2023: 3152-3156
  - [c8] Ke Zhang, Marvin Borsdorf, Zexu Pan, Haizhou Li, Yangjie Wei, Yi Wang:
    Speaker Extraction with Detection of Presence and Absence of Target Speakers. INTERSPEECH 2023: 3714-3718
  - [c7] Junjie Li, Meng Ge, Zexu Pan, Rui Cao, Longbiao Wang, Jianwu Dang, Shiliang Zhang:
    Rethinking the Visual Cues in Audio-Visual Speaker Extraction. INTERSPEECH 2023: 3754-3758
  - [i17] Yidi Jiang, Ruijie Tao, Zexu Pan, Haizhou Li:
    Target Active Speaker Detection with Audio-visual Cues. CoRR abs/2305.12831 (2023)
  - [i16] Junjie Li, Meng Ge, Zexu Pan, Rui Cao, Longbiao Wang, Jianwu Dang, Shiliang Zhang:
    Rethinking the visual cues in audio-visual speaker extraction. CoRR abs/2306.02625 (2023)
  - [i15] Junjie Li, Ruijie Tao, Zexu Pan, Meng Ge, Shuai Wang, Haizhou Li:
    Audio-Visual Active Speaker Extraction for Sparsely Overlapped Multi-talker Speech. CoRR abs/2309.08408 (2023)
  - [i14] Yu Chen, Xinyuan Qian, Zexu Pan, Kainan Chen, Haizhou Li:
    LocSelect: Target Speaker Localization with an Auditory Selective Hearing Mechanism. CoRR abs/2310.10497 (2023)
  - [i13] Dimitrios Bralios, Gordon Wichern, François G. Germain, Zexu Pan, Sameer Khurana, Chiori Hori, Jonathan Le Roux:
    Generation or Replication: Auscultating Audio Latent Diffusion Models. CoRR abs/2310.10604 (2023)
  - [i12] Zexu Pan, Gordon Wichern, Yoshiki Masuyama, François G. Germain, Sameer Khurana, Chiori Hori, Jonathan Le Roux:
    Scenario-Aware Audio-Visual TF-GridNet for Target Speech Extraction. CoRR abs/2310.19644 (2023)
  - [i11] Zexu Pan, Gordon Wichern, François G. Germain, Sameer Khurana, Jonathan Le Roux:
    NeuroHeed+: Improving Neuro-steered Speaker Extraction with Joint Auditory Attention Detection. CoRR abs/2312.07513 (2023)
- 2022
  - [j3] Zexu Pan, Xinyuan Qian, Haizhou Li:
    Speaker Extraction With Co-Speech Gestures Cue. IEEE Signal Process. Lett. 29: 1467-1471 (2022)
  - [j2] Zexu Pan, Ruijie Tao, Chenglin Xu, Haizhou Li:
    Selective Listening by Synchronizing Speech With Lips. IEEE ACM Trans. Audio Speech Lang. Process. 30: 1650-1664 (2022)
  - [j1] Zexu Pan, Meng Ge, Haizhou Li:
    USEV: Universal Speaker Extraction With Visual Cue. IEEE ACM Trans. Audio Speech Lang. Process. 30: 3032-3045 (2022)
  - [c6] Junjie Li, Meng Ge, Zexu Pan, Longbiao Wang, Jianwu Dang:
    VCSE: Time-Domain Visual-Contextual Speaker Extraction Network. INTERSPEECH 2022: 906-910
  - [c5] Zexu Pan, Meng Ge, Haizhou Li:
    A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction. INTERSPEECH 2022: 1786-1790
  - [i10] Zexu Pan, Xinyuan Qian, Haizhou Li:
    Speaker Extraction with Co-Speech Gestures Cue. CoRR abs/2203.16840 (2022)
  - [i9] Zexu Pan, Meng Ge, Haizhou Li:
    A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction. CoRR abs/2203.16843 (2022)
  - [i8] Junjie Li, Meng Ge, Zexu Pan, Longbiao Wang, Jianwu Dang:
    VCSE: Time-Domain Visual-Contextual Speaker Extraction Network. CoRR abs/2210.06177 (2022)
  - [i7] Zexu Pan, Wupeng Wang, Marvin Borsdorf, Haizhou Li:
    ImagineNET: Target Speaker Extraction with Intermittent Visual Cue through Embedding Inpainting. CoRR abs/2211.00109 (2022)
  - [i6] Zexu Pan, Gordon Wichern, François G. Germain, Aswin Shanmugam Subramanian, Jonathan Le Roux:
    Towards End-to-end Speaker Diarization in the Wild. CoRR abs/2211.01299 (2022)
- 2021
  - [c4] Xinyuan Qian, Maulik C. Madhavi, Zexu Pan, Jiadong Wang, Haizhou Li:
    Multi-Target DoA Estimation with an Audio-Visual Fusion Mechanism. ICASSP 2021: 4280-4284
  - [c3] Zexu Pan, Ruijie Tao, Chenglin Xu, Haizhou Li:
    Muse: Multi-Modal Target Speaker Extraction with Visual Cues. ICASSP 2021: 6678-6682
  - [c2] Ruijie Tao, Zexu Pan, Rohan Kumar Das, Xinyuan Qian, Mike Zheng Shou, Haizhou Li:
    Is Someone Speaking?: Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection. ACM Multimedia 2021: 3927-3935
  - [i5] Xinyuan Qian, Maulik C. Madhavi, Zexu Pan, Jiadong Wang, Haizhou Li:
    Multi-target DoA Estimation with an Audio-visual Fusion Mechanism. CoRR abs/2105.06107 (2021)
  - [i4] Ruijie Tao, Zexu Pan, Rohan Kumar Das, Xinyuan Qian, Mike Zheng Shou, Haizhou Li:
    Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection. CoRR abs/2107.06592 (2021)
  - [i3] Zexu Pan, Meng Ge, Haizhou Li:
    USEV: Universal Speaker Extraction with Visual Cue. CoRR abs/2109.14831 (2021)
- 2020
  - [c1] Zexu Pan, Zhaojie Luo, Jichen Yang, Haizhou Li:
    Multi-Modal Attention for Speech Emotion Recognition. INTERSPEECH 2020: 364-368
  - [i2] Zexu Pan, Zhaojie Luo, Jichen Yang, Haizhou Li:
    Multi-modal Attention for Speech Emotion Recognition. CoRR abs/2009.04107 (2020)
  - [i1] Zexu Pan, Ruijie Tao, Chenglin Xu, Haizhou Li:
    Muse: Multi-modal target speaker extraction with visual cues. CoRR abs/2010.07775 (2020)
last updated on 2024-10-30 21:33 CET by the dblp team
all metadata released as open data under CC0 1.0 license