Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.5, DeepSeek-R1, GLM-5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, Phi4, ...)…

Python 13,497 1,318 Updated Apr 2, 2026

OpenRLHF / OpenRLHF

An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & TIS & vLLM & Ray & Async RL)

Python 9,296 910 Updated Mar 30, 2026

ByteDance-Seed / Bagel

Open-source unified multimodal model

Python 5,782 512 Updated Oct 27, 2025

hiyouga / EasyR1

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 4,799 365 Updated Mar 26, 2026

StarsfieldAI / R1-V

Witness the aha moment of VLM with less than $3.

Python 4,045 286 Updated May 19, 2025

jd-opensource / OxyGent

Multi-agent collaboration framework

Python 1,901 275 Updated Apr 2, 2026

NVlabs / GDPO

Official implementation of GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Python 432 29 Updated Feb 17, 2026

RUC-NLPIR / Tool-Star

🔧 Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning

Python 335 22 Updated Jan 3, 2026

UMass-Embodied-AGI / Mirage

[CVPR 2026] Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens

Python 260 18 Updated Aug 2, 2025

NOVAglow646 / Monet

[CVPR 2026] Official code for "Monet: Reasoning in Latent Visual Space Beyond Image and Language"

Python 159 2 Updated Mar 19, 2026

shijian2001 / Video-Thinker

Sparking "Thinking with Videos" via Reinforcement Learning

Python 152 6 Updated Oct 30, 2025

Hui-design / TSPO

[AAAI 2026] ✨ TSPO: Temporal Sampling Policy Optimization for Long-form Video Language Understanding

Python 122 11 Updated Nov 12, 2025

xinyan-cxy / MINT-CoT

[NeurIPS 2025] MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning

Python 103 5 Updated Sep 19, 2025

genvidbench / GenVidBench

[AAAI 2026] GenVidBench: A 6-Million Benchmark for AI-Generated Video Detection

Python 76 2 Updated Mar 13, 2026

shiwk24 / MathCanvas

This is the official repository for the paper "MathCanvas: Intrinsic Visual Chain-of-Thought for Multimodal Mathematical Reasoning"

Python 67 3 Updated Dec 29, 2025

QiWang98 / VideoRFT

[NeurIPS 2025] VideoRFT: Incentivizing Video Reasoning Capability in MLLMs via Reinforced Fine-Tuning

Python 66 2 Updated Jan 6, 2026

hwanyu112 / Latent-Sketchpad

Python 66 2 Updated Feb 1, 2026

ICTMCG / FakingRecipe

Official Repository for "FakingRecipe: Detecting Fake News on Short Video Platforms from the Perspective of Creative Process", ACM MM 2024

Python 61 6 Updated Oct 5, 2025

zfr00 / Fact-R1

The official code for Fact-R1: Towards Explainable Video Misinformation Detection with Deep Reasoning

Python 41 Updated Nov 26, 2025

ekonwang / GeometryZero

GeometryZero: Improving Geometry Solving for LLM with Group Contrastive Policy Optimization

Python 9 Updated Sep 1, 2025

Ethylyikes / DAE

[Information Fusion] Official Implementation of DAE (Bridging Cognition and Emotion: Empathy-Driven Multimodal Misinformation Detection)

Python 6 Updated Feb 14, 2026

Ethylyikes / LatentGeo

Official Implementation of LatentGeo: Learnable Auxiliary Constructions in Latent Space for Multimodal Geometric Reasoning

Python 2 Updated Mar 13, 2026

Ethylyikes

Stars

hiyouga / LlamaFactory

unslothai / unsloth

agno-agi / agno

recommenders-team / recommenders

verl-project / verl

modelscope / ms-swift