xmu-xiaoma666

Hi there, I'm Yiwei Ma (马祎炜) 👋

Algorithm Engineer @ dots, Xiaohongshu (RED) · Ph.D. from MAC Lab, Xiamen University
Multimodal Large Language Models 🤖 · Text-to-Image Pretraining 🎨

👨‍💻 About Me

🔬 I'm an Algorithm Engineer on the dots team at Xiaohongshu (RED), working on Multimodal Large Language Models and Text-to-Image Pretraining.
🎓 I received my Ph.D. from the Department of Artificial Intelligence, Xiamen University (MAC Lab), advised by Prof. Rongrong Ji and Prof. Xiaoshuai Sun.
📚 27 papers in CCF-A/B venues (17 as first/co-first author, 3 Orals), with 1500+ Google Scholar citations.
⭐ Core developer of External-Attention-pytorch (12k+ stars).
📫 Reach me at mayiwei1998@163.com — feel free to chat!

🔥 Latest News

2026 — Two papers accepted by IJCV; one by ACL 2026 (Findings); one by Pattern Recognition.
2025 — One paper accepted by IEEE TPAMI; one by ACM MM 2025.

🏆 Selected Honors

🥇 2026 Top-Talent Program Offers (9): Xiaohongshu Red Star · Tencent Qingyun · Tongyi Alibaba Star · ByteDance Jindouyun · Ant Star · Huawei Genius Youth · Meituan Beidou · Xiaomi Top Talent · JD TGT
🧪 NSFC Youth Student Basic Research Project — Principal Investigator (国自然青基), 2024
🚀 CAST Young Talent Support Project for Ph.D. Students (青托), 2025
🎖️ Baidu Scholarship — Global Top 40, 2024
🏅 National Scholarship ×3 (2019 · 2022 · 2024)

📝 Selected Publications

Full list on my homepage →

An Extensive Benchmark for Single-Round and Multi-Round Instruction-Based Image Editing — IJCV 2026 [Code]
CoP: Chain of Perception for Referring 3D Instance Segmentation — IJCV 2026 [Code]
Boosting Multi-Modal Large Language Model with Enhanced Visual Features — TPAMI 2025 [Code]
I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing — NeurIPS 2024 [Code]
X-Oscar: A Progressive Framework for High-quality Text-guided 3D Animatable Avatar Generation — ICML 2024 [Project]
X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance — ICCV 2023 [Project]
Towards Local Visual Modeling for Image Captioning — Pattern Recognition 2023 🏆 ESI Highly Cited [Code]
X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval — ACM MM 2022 🔥 500+ citations [Code]

🚀 Open-Source Projects

🤖 dots.vlm1.inst — Instruction-tuned multimodal LLM from the dots series (Xiaohongshu · dots)
📄 dots.mocr — Multilingual document layout parsing & OCR model (Xiaohongshu · dots)
⭐ External-Attention-pytorch — PyTorch implementations of Attention / MLP / Re-param / Conv modules (12k+ stars)

✍️ Writing & Community

I share paper reading notes and tutorials on 知乎 (Zhihu) and my WeChat public account FightingCV.

📖 Selected articles
