Stars
Extrapolating RLVR to General Domains without Verifiers
An Open-source RL System from ByteDance Seed and Tsinghua AIR
Official Repo for Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning
[CVPR'25 highlight] RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness
A Pocket-Sized MLLM for Ultra-Efficient Image and Video Understanding on Your Phone
React Native fonts SimSun <宋体>, SimHei <黑体>, and KaiTi <楷体>, with support for both iOS and Android.
[CVPR'24] RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback