This repository provides a valuable reference for researchers in the field of multimodality. Start exploring RL-based reasoning MLLMs here!

1,251 58 Updated Oct 18, 2025

TsinghuaC3I / Awesome-RL-for-LRMs

A Survey of Reinforcement Learning for Large Reasoning Models

1,983 111 Updated Nov 5, 2025

modelscope / awesome-deep-reasoning

Collect every awesome work about r1!

Python 422 15 Updated May 2, 2025

yaotingwangofficial / Awesome-MCoT

Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

875 25 Updated Aug 26, 2025

volcengine / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 15,111 2,424 Updated Nov 5, 2025

ByteDance-Seed / Bagel

Open-source unified multimodal model

Python 5,249 455 Updated Oct 27, 2025

OpenBMB / AgentCPM-GUI

AgentCPM-GUI: An on-device GUI agent for operating Android apps, enhancing reasoning ability with reinforcement fine-tuning for efficient task execution.

Python 1,093 103 Updated Jun 14, 2025

ByteDance-Seed / SAIL

Implementation for "The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer"

Python 68 3 Updated Oct 29, 2025

bytedance / UI-TARS

Python 8,123 571 Updated Nov 5, 2025

apple / ml-cross-entropy

Python 545 52 Updated Sep 23, 2025

huggingface / open-r1

Fully open reproduction of DeepSeek-R1

Python 25,609 2,400 Updated Sep 8, 2025

QwenLM / Qwen3-VL

Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 15,960 1,255 Updated Oct 27, 2025

QwenLM / Qwen-Agent

Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.

Python 12,215 1,117 Updated Sep 26, 2025

EleutherAI / lm-evaluation-harness

A framework for few-shot evaluation of language models.

Python 10,522 2,826 Updated Oct 29, 2025

leobeeson / llm_benchmarks

A collection of benchmarks and datasets for evaluating LLMs.

522 30 Updated Jul 13, 2024

modelscope / ms-swift

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…

Python 10,878 944 Updated Nov 5, 2025

showlab / ROICtrl

Code for [CVPR 2025] ROICtrl: Boosting Instance Control for Visual Generation

Python 109 Updated Apr 16, 2025

showlab / ShowUI

[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.

Python 1,535 108 Updated May 29, 2025

NexaAI / Awesome-LLMs-on-device

Awesome LLMs on Device: A Comprehensive Survey

1,243 109 Updated Jan 12, 2025

alibaba / Pai-Megatron-Patch

The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.

Python 1,413 201 Updated Oct 31, 2025

showlab / computer_use_ootb

Out-of-the-box (OOTB) GUI Agent for Windows and macOS

Python 1,816 190 Updated May 21, 2025

Stan Lei StanLei52

Stars

lupantech / AgentFlow

zhaochenyang20 / Awesome-ML-SYS-Tutorial

bytedance / SandboxFusion

dvlab-research / ARPO

rllm-org / rllm

hiyouga / EasyR1

huggingface / trl

Alibaba-NLP / DeepResearch

OSU-NLP-Group / GUI-Agents-Paper-List

Sun-Haoyuan23 / Awesome-RL-based-Reasoning-MLLMs