xmu-xiaoma666/README.md

Hi there, I'm Yiwei Ma (马祎炜) 👋

Algorithm Engineer @ dots, Xiaohongshu (RED)  ·  Ph.D. from MAC Lab, Xiamen University
Multimodal Large Language Models 🤖  ·  Text-to-Image Pretraining 🎨

Homepage · Google Scholar · Email · Zhihu


👨‍💻 About Me

🔥 Latest News

  • 2026 — Two papers accepted by IJCV; one by ACL 2026 (Findings); one by Pattern Recognition.
  • 2025 — One paper accepted by IEEE TPAMI; one by ACM MM 2025.

🏆 Selected Honors

  • 🥇 2026 Top-Talent Program Offers (9): Xiaohongshu Red Star · Tencent Qingyun · Tongyi Alibaba Star · ByteDance Jindouyun · Ant Star · Huawei Genius Youth · Meituan Beidou · Xiaomi Top Talent · JD TGT
  • 🧪 NSFC Youth Student Basic Research Project — Principal Investigator (国自然青基), 2024
  • 🚀 CAST Young Talent Support Project for Ph.D. Students (青托), 2025
  • 🎖️ Baidu Scholarship — Global Top 40, 2024
  • 🏅 National Scholarship ×3 (2019 · 2022 · 2024)

📝 Selected Publications

Full list on my homepage →

  • An Extensive Benchmark for Single-Round and Multi-Round Instruction-Based Image Editing · IJCV 2026 [Code]
  • CoP: Chain of Perception for Referring 3D Instance Segmentation · IJCV 2026 [Code]
  • Boosting Multi-Modal Large Language Model with Enhanced Visual Features · TPAMI 2025 [Code]
  • I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing · NeurIPS 2024 [Code]
  • X-Oscar: A Progressive Framework for High-quality Text-guided 3D Animatable Avatar Generation · ICML 2024 [Project]
  • X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance · ICCV 2023 [Project]
  • Towards Local Visual Modeling for Image Captioning · Pattern Recognition 2023 🏆 ESI Highly Cited [Code]
  • X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval · ACM MM 2022 🔥 500+ citations [Code]

🚀 Open-Source Projects

  • 🤖 dots.vlm1.inst — Instruction-tuned multimodal LLM from the dots series (Xiaohongshu · dots)
  • 📄 dots.mocr — Multilingual document layout parsing & OCR model (Xiaohongshu · dots)
  • External-Attention-pytorch — PyTorch implementations of Attention / MLP / Re-param / Conv modules (12k+ stars)

✍️ Writing & Community

I share paper reading notes and tutorials on 知乎 (Zhihu) and my WeChat public account FightingCV.

📖 Selected articles

Popular repositories

  1. External-Attention-pytorch (Public)

    🍀 PyTorch implementations of various attention mechanisms, MLP, re-parameterization, and convolution modules, helpful for understanding the papers. ⭐⭐⭐

    Python · 12.2k stars · 1.9k forks

  2. FightingCV-Paper-Reading (Public)

    ⭐⭐⭐ FightingCV Paper Reading, which helps you understand the most advanced research work in an easier way 🍀 🍀 🍀

    Shell · 820 stars · 89 forks

  3. X-Dreamer (Public)

    A PyTorch implementation of “X-Dreamer: Creating High-quality 3D Content by Bridging the Domain Gap Between Text-to-2D and Text-to-3D Generation”

    Python · 75 stars · 3 forks

  4. xmu-xiaoma666 (Public)

    35 stars · 4 forks

  5. RepMLP-pytorch (Public)

    PyTorch implementation of RepMLP

    Python · 30 stars · 8 forks

  6. LSTNet (Public)

    Towards Local Visual Modeling for Image Captioning

    Python · 30 stars · 7 forks