Unsupervised expert alignment and importance-guided layer chunking merge multiple fine-tuned experts into one generalist model.
Dengming Zhang
Hello there, I'm Dengming Zhang, and I received my master's degree from Zhejiang University in March 2026.
My primary research focuses on Multimodal Large Models under low-data and low-compute constraints. On the low-data side, I study how to merge multiple domain-specialized, fine-tuned expert LLMs into one generalist model using only 1–5 samples while retaining SOTA-level performance (ICLR 2026). On the low-compute side, I explore how to equip vision foundation models with audio using a single RTX 4090 and how to bring audio-visual affective understanding to a SOTA level. I am also interested in Generative AI (Image/Music), Affective Computing, Meta-learning, and HCI.
I also enjoy combining scientific research with engineering implementation, and I have extensive experience in front-end development, back-end development, and cluster DevOps. Some of the open-source projects that I lead or contribute to can be found on my GitHub.
Seeking PhD opportunities for Fall 2026
Research Highlights
Grouped by research direction.
Multimodal Large Models
3 Papers
Audio-visual emotion understanding by teaching vision-language models to align sight and sound for artistic emotion.
Buffering-based spatial sparsity improves centrifugal token pruning efficiency in vision-language models under aggressive pruning rates.
Affective Computing & Music Emotion
1 Paper
Dual-scale attention meta-learning for personalized, dynamic music emotion recognition.
Meta-learning & Diagnosis
1 Paper
Discriminant space optimization improves few-shot bearing fault diagnosis with meta-learning.
Controllable Generation & Creative AI
4 Papers
Style-strength control and evaluation to improve style alignment in image creation.
Controllable text rendering with typography and style controls.
Generates music from video with multiple time-varying conditioning signals.
Decomposes spatial and temporal cues to improve controllable video-to-music generation.
Experience
Huawei Noah's Ark Lab
2025.06 - 2025.12
Research internship on Model Merging (Expert Merging)[1].
Tencent
2025.04 - 2025.06
Work on Game Character Material Generation with animation.
Awards
First Zhaoyuan Chengen Technology Innovation Scholarship (Top 1)
2021.12, University-wide Unique Award
First-Class Academic Scholarship (Top 5%)
2022.12, 2021-2022 Academic Year
First-Class Academic Scholarship (Top 5%)
2021.12, 2020-2021 Academic Year
First-Class Academic Scholarship (Top 5%)
2020.12, 2019-2020 Academic Year
First Prize in Chongqing, National Electronic Design Contest
2022.01, Chongqing Municipal Education Commission
First Prize, TI Cup Electronic Design Contest
2020.11, Chongqing Municipal Education Commission
Chongqing Excellent Undergraduate Graduation Thesis
2023.06, Chongqing Municipal Education Commission