yuhangzang/README.md

Yuhang Zang (臧宇航)

Researcher @ Shanghai AI Laboratory

Homepage · Google Scholar · Email · Hugging Face


🔬 Research Interests

Multimodal LLMs

  • Reinforcement Fine-tuning
  • Post-training Optimization
  • Vision-Language Pre-training

Large Language Models

  • Long-Context LLMs
  • Efficient Tool-Use
  • Reward Models

Evaluation

  • Benchmarks for Long-Context Understanding, 3D Spatial Understanding, etc.

🏆 Selected Highlights

| Year | Achievement |
|------|-------------|
| 2025 | #5 in Most Influential ArXiv CV Papers |
| 2025 | 2nd Place, LSVOS Challenge, Complex Video Object Segmentation Track, ICCV 2025 |
| 2024 | #10 in Most Influential ArXiv CV Papers |
| 2024 | #9 in Most Influential NeurIPS Papers |
| 2020 | 3rd Place, LVIS Challenge, ECCV 2020 |
| 2019 | 1st Place, Google Open Images Challenge, Object Detection Track, ICCV 2019 |

"Exploring the frontiers of multimodal intelligence"

Pinned

  1. InternLM/InternLM-XComposer (Public)

    InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

    Python · 2.9k stars · 177 forks

  2. OV-DETR (Public)

    [Under preparation] Code repo for "Open-Vocabulary DETR with Conditional Matching" (ECCV 2022)

    Python · 236 stars · 26 forks

  3. mayubo2333/MMLongBench-Doc (Public)

    Official Repository of MMLONGBENCH-DOC: Benchmarking Long-context Document Understanding with Visualizations

    Python · 113 stars · 5 forks

  4. ContextDET (Public)

    Contextual Object Detection with Multimodal Large Language Models

    Python · 255 stars · 13 forks

  5. InternLM/SIM-CoT (Public)

    An official implementation of "SIM-CoT: Supervised Implicit Chain-of-Thought"

    Python · 113 stars · 6 forks

  6. InternLM/CapRL (Public)

    An official implementation of "CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning"

    Python · 157 stars · 7 forks