- Xiamen University
- Xiamen, China
- https://xmu-xiaoma666.github.io/
Starred repositories
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflows.
Code for replicating Roboflow 100 benchmark results and programmatically downloading benchmark datasets
A high-throughput and memory-efficient inference and serving engine for LLMs
Official repo of the Griffon series, including v1 (ECCV 2024), v2 (ICCV 2025), G, and R, as well as the RL tool Vision-R1.
Code for ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
Multilingual Document Layout Parsing in a Single Vision-Language Model
The official repository of the dots.vlm1 instruct models proposed by rednote-hilab.
Code and dataset link for "DenseWorld-1M: Towards Detailed Dense Grounded Caption in the Real World"
[IGARSS 2025 Oral] A Simple Aerial Detection Baseline of Multimodal Language Models.
OpenMMLab Detection Toolbox and Benchmark
[NeurIPS 2025] Official code implementation of Perception R1: Pioneering Perception Policy with Reinforcement Learning
An open-source implementation for fine-tuning the Qwen-VL series by Alibaba Cloud.
The official repository for ERNIE 4.5 and ERNIEKit – its industrial-grade development toolkit based on PaddlePaddle.
Evaluation code for Ref-L4, a new REC benchmark in the LMM era
VisionReasoner: Unified Reasoning-Integrated Visual Perception via Reinforcement Learning
Includes the code for training and testing the CountGD model from the paper CountGD: Multi-Modal Open-World Counting.
[TPAMI 2025] Towards Visual Grounding: A Survey
The official repository of the dots.llm1 base and instruct models proposed by rednote-hilab.
All-in-One Development Tool based on PaddlePaddle
Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.
Solve Visual Understanding with Reinforced VLMs
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
Kimi-VL: Mixture-of-Experts Vision-Language Model for Multimodal Reasoning, Long-Context Understanding, and Strong Agent Capabilities
LLM knowledge-sharing that anyone can understand; a must-read before spring/autumn recruitment LLM interviews, so you can converse confidently with interviewers.