🔭 Seeing with You: Perception-Reasoning Co-evolution for Multimodal Reasoning

Official implementation of Seeing with You: Perception-Reasoning Co-evolution for Multimodal Reasoning

If you find our project helpful, please consider giving us a star ⭐ on GitHub!

Overview

PRCO is a dual-role reinforcement learning with verifiable rewards (RLVR) framework for multimodal reasoning. A single shared policy plays two roles:

  • Observer: extracts question-relevant visual facts from the image and produces a question-conditioned evidence caption.
  • Solver: predicts the final answer from the caption, optionally consulting the image when needed.

Figure: Overview of PRCO.
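
The Observer/Solver split can be pictured as two calls to one shared model. The sketch below only illustrates that data flow and is not the repository's inference code: `Policy`, the two prompt templates, `run_prco`, and `stub_policy` are hypothetical names, and the stub stands in for a real vision-language model.

```python
from typing import Callable, Optional

# Hypothetical interface: a policy maps (prompt, optional image bytes) to text.
Policy = Callable[[str, Optional[bytes]], str]

# Illustrative role prompts; the actual PRCO prompts may differ.
OBSERVER_PROMPT = "Describe only the visual facts needed to answer: {question}"
SOLVER_PROMPT = "Evidence: {caption}\nQuestion: {question}\nAnswer:"


def run_prco(policy: Policy, image: bytes, question: str,
             solver_sees_image: bool = False) -> tuple[str, str]:
    """Run the Observer, then the Solver, with the same shared policy."""
    # Observer: build a question-conditioned evidence caption from the image.
    caption = policy(OBSERVER_PROMPT.format(question=question), image)
    # Solver: answer from the caption; the image is passed only if requested.
    solver_image = image if solver_sees_image else None
    answer = policy(
        SOLVER_PROMPT.format(caption=caption, question=question), solver_image
    )
    return caption, answer


def stub_policy(prompt: str, image: Optional[bytes]) -> str:
    """Toy stand-in for a VLM so the sketch runs without any model."""
    if prompt.startswith("Describe"):
        return "A right triangle with legs 3 and 4."
    return "5"


if __name__ == "__main__":
    caption, answer = run_prco(
        stub_policy, b"<image bytes>", "How long is the hypotenuse?"
    )
    print(caption)
    print(answer)
```

Passing the same callable for both roles mirrors the shared-policy design; only the prompt and the optional image differ between the two calls.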


🚀 News

  • [2026/03/25] We've released the model checkpoints and evaluation code for PRCO.

TODO List

  • [x] Release PRCO checkpoints (3B / 7B / 8B)
  • [x] Release evaluation code
  • [x] Release paper
  • [ ] Release training code

Highlights

  • Dual-role RLVR framework for multimodal reasoning with a shared policy.
  • Observer/Solver decomposition for explicit separation of perception and reasoning.
  • Reliable role-specific rewards for better gradient-level credit assignment.
  • Consistent gains across model scales, including strong improvements on both 3B and 7B backbones.
  • Broad benchmark coverage across visual math, geometry, logic, and multidisciplinary reasoning.

Benchmark Results

Main Results on 8 Benchmarks (7B)

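| Model | MathVerse | MathVision | MathVista | WeMath | DynaMath | LogicVista | MMMU-Pro | MMStar | Avg. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Qwen2.5-VL-7B | 43.02 | 25.46 | 70.20 | 35.43 | 20.35 | 45.41 | 35.49 | 64.26 | 42.45 |
| DAPO | 48.73 | 29.30 | 74.80 | 45.62 | 26.14 | 47.87 | 41.38 | 65.40 | 47.41 |
| PRCO-7B | 49.49 | 30.86 | 77.10 | 50.29 | 29.74 | 49.66 | 42.08 | 67.80 | 49.63 |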
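
As a quick consistency check, the Avg. column appears to be the unweighted mean of the eight benchmark scores, rounded half-up to two decimals. The sketch below recomputes it from the per-benchmark values in the table; the rounding rule is an assumption inferred from the numbers, and `SCORES` and `mean_2dp` are illustrative names.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-benchmark scores copied from the table, in column order
# (MathVerse, MathVision, MathVista, WeMath, DynaMath, LogicVista, MMMU-Pro, MMStar).
SCORES = {
    "Qwen2.5-VL-7B": ["43.02", "25.46", "70.20", "35.43", "20.35", "45.41", "35.49", "64.26"],
    "DAPO": ["48.73", "29.30", "74.80", "45.62", "26.14", "47.87", "41.38", "65.40"],
    "PRCO-7B": ["49.49", "30.86", "77.10", "50.29", "29.74", "49.66", "42.08", "67.80"],
}


def mean_2dp(values):
    """Unweighted mean, rounded half-up to two decimals (assumed rule)."""
    total = sum(Decimal(v) for v in values)
    mean = total / len(values)
    return mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


averages = {name: mean_2dp(vals) for name, vals in SCORES.items()}

if __name__ == "__main__":
    for name, avg in averages.items():
        print(f"{name}: {avg}")
    # Gaps between the rounded averages, in points.
    print("PRCO-7B vs. Qwen2.5-VL-7B:", averages["PRCO-7B"] - averages["Qwen2.5-VL-7B"])
    print("PRCO-7B vs. DAPO:", averages["PRCO-7B"] - averages["DAPO"])
```

Under this rule the recomputed means match the reported averages, which puts PRCO-7B 7.18 points above the Qwen2.5-VL-7B baseline and 2.22 points above DAPO.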

Model Zoo

| Model | Backbone | Status | Link |
| --- | --- | --- | --- |
| PRCO-3B | Qwen2.5-VL-3B | Released | Checkpoint |
| PRCO-7B | Qwen2.5-VL-7B | Released | Checkpoint |
| PRCO-8B | Qwen3-VL-8B | Released | Checkpoint |

Usage

1. Installation

git clone https://github.com/Dtc7w3PQ/PRCO.git
cd PRCO
conda create -n prco python=3.12 -y
conda activate prco
pip install -r requirements.txt

2. Evaluation

PRCO-3B / PRCO-7B / PRCO-8B use the same evaluation workflow.

  1. Fill environment variables in VLMEvalKit/.env:
LMUData="<PATH_TO_LMUDATA>"
OPENAI_API_KEY="<YOUR_OPENAI_API_KEY>"
OPENAI_API_BASE="<YOUR_OPENAI_API_BASE>"  # optional

Then load them in your shell:

set -a
source VLMEvalKit/.env
set +a
  2. Set the correct local checkpoint path in each model config (both observer.model_path and solver.model_path):
  • VLMEvalKit/scripts/prco_3b/config.json
  • VLMEvalKit/scripts/prco_7b/config.json
  • VLMEvalKit/scripts/prco_8b/config.json
  3. Run the inference and evaluation scripts.

Example for one model (prco_7b):

cd VLMEvalKit
bash scripts/prco_7b/infer.sh
bash scripts/prco_7b/eval.sh

Run all three models with the same pipeline:

cd VLMEvalKit
for m in prco_3b prco_7b prco_8b; do
  bash scripts/$m/infer.sh
  bash scripts/$m/eval.sh
done

Logs are written to:

  • VLMEvalKit/scripts/<model_name>/infer.log
  • VLMEvalKit/scripts/<model_name>/eval.log

Predictions and evaluation outputs are written under:

  • VLMEvalKit/outputs/<model_name>/
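
Before launching a run, steps 1 and 2 can be sanity-checked with a small script. This is a sketch rather than part of the repository: `preflight` is a hypothetical helper, it treats `LMUData` and `OPENAI_API_KEY` as required (an assumption based on only `OPENAI_API_BASE` being marked optional), and it assumes each config.json is a JSON object with nested `observer` and `solver` entries that each hold a `model_path`. That layout is inferred from the key names in step 2 and may not match the actual files.

```python
import json
import os
from pathlib import Path

REQUIRED_ENV = ("LMUData", "OPENAI_API_KEY")  # OPENAI_API_BASE is optional
ROLES = ("observer", "solver")


def preflight(config_path, env=None):
    """Return a list of human-readable problems; an empty list means OK."""
    env = os.environ if env is None else env
    problems = [
        f"environment variable {name} is not set"
        for name in REQUIRED_ENV
        if not env.get(name)
    ]
    path = Path(config_path)
    if not path.is_file():
        problems.append(f"config not found: {path}")
        return problems
    config = json.loads(path.read_text(encoding="utf-8"))
    for role in ROLES:
        # Assumed layout: {"observer": {"model_path": ...}, "solver": {...}}
        model_path = (config.get(role) or {}).get("model_path")
        if not model_path:
            problems.append(f"{role}.model_path is missing in {path}")
        elif not Path(model_path).exists():
            problems.append(f"{role}.model_path does not exist: {model_path}")
    return problems


if __name__ == "__main__":
    # Run from the repository root, after loading VLMEvalKit/.env.
    for model in ("prco_3b", "prco_7b", "prco_8b"):
        cfg = Path("VLMEvalKit/scripts") / model / "config.json"
        for problem in preflight(cfg):
            print(f"[{model}] {problem}")
```

The check is read-only: it reports missing variables or paths and leaves the config files untouched.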

3. Training

Training code for the dual-role Observer/Solver framework is coming soon.

Citation

If you find this project helpful, please cite our paper:

@article{miao2026seeing,
  title={Seeing with You: Perception-Reasoning Coevolution for Multimodal Reasoning},
  author={Miao, Ziqi and Jia, Haonan and Li, Lijun and Qian, Chen and Xiong, Yuan and Yan, Wenting and Shao, Jing},
  journal={arXiv preprint arXiv:2603.28618},
  year={2026}
}

Acknowledgement

This project builds on open multimodal reasoning research. We especially thank the open-source communities behind vLLM, EasyR1, and verl, which made this work possible.
