This is the repository for our arXiv paper [2504.00762](https://arxiv.org/abs/2504.00762), *Do We Truly Need So Many Samples? Multi-LLM Repeated Sampling Efficiently Scales Test-Time Compute*.
Some of the data and code are still being organized and will be available soon.
conda create -n ModelSwitch python=3.10
conda activate ModelSwitch
pip install -r requirements.txt
python src/Model_switch.py \
--dataset_name "GSM8K" \
--num_workers 250 \
--Sampling True \
--Sampling_Numbers 250 \
--results_sampling 5 \
--modellist "gpt-4o-mini|gemini-1.5-flash-latest" \
--ConsistencyThreshold 1
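For intuition, the command above reflects the core idea of ModelSwitch: query the models in `--modellist` one at a time, and switch to the next model only when the current one's sampled answers are not consistent enough. Below is a minimal sketch of that idea, assuming a hypothetical `sample(model, prompt)` helper; it is an illustration, not the repository's implementation, and the mapping from the CLI flags to these arguments is an assumption.

```python
from collections import Counter

def sample(model: str, prompt: str) -> str:
    """Hypothetical helper returning one sampled answer from `model`
    (an assumption for illustration, not part of this repository)."""
    raise NotImplementedError

def model_switch(prompt, model_list, samples_per_model=5, consistency_threshold=1.0):
    """Sketch of consistency-based model switching: try each model in order
    and stop early once its sampled answers agree strongly enough."""
    all_answers = []
    for model in model_list:
        answers = [sample(model, prompt) for _ in range(samples_per_model)]
        all_answers.extend(answers)
        top_answer, top_count = Counter(answers).most_common(1)[0]
        # Consistency = fraction of samples that agree on the majority answer.
        if top_count / samples_per_model >= consistency_threshold:
            return top_answer  # confident enough; skip the remaining models
    # No single model was confident on its own: majority vote over all samples.
    return Counter(all_answers).most_common(1)[0][0]
```

With a threshold of 1, a model's answer is accepted only when all of its samples agree, so switching happens exactly when the model contradicts itself.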
python src/Evaluation.py \
--Evaluation "MS_SC" \
--dataset "GSM8K" \
--budget 16
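Here `MS_SC` presumably compares ModelSwitch against the self-consistency (SC) baseline at a matched sample budget, with `--budget 16` capping the number of samples per question. Self-consistency at a given budget is simply a majority vote over that many sampled answers; a minimal sketch reusing the hypothetical `sample` helper from above (not the repository's code):

```python
from collections import Counter

def self_consistency(prompt, model, budget=16):
    """Sketch of the self-consistency baseline: draw `budget` samples from a
    single model and return the majority answer."""
    answers = [sample(model, prompt) for _ in range(budget)]  # hypothetical helper
    return Counter(answers).most_common(1)[0][0]
```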
python src/Evaluation.py \
--Evaluation "MS_MAD" \
--dataset "GSM8K"
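Assuming `MS_MAD` pits ModelSwitch against a multi-agent debate (MAD) baseline, the debate loop roughly follows the sketch below, again built on the hypothetical `sample` helper; the round structure and prompt wording here are assumptions, not the repository's code.

```python
from collections import Counter

def multi_agent_debate(prompt, models, rounds=2):
    """Sketch of a multi-agent debate baseline: each model answers, then
    revises its answer after seeing the other models' answers."""
    answers = {m: sample(m, prompt) for m in models}
    for _ in range(rounds):
        for m in models:
            others = [a for other, a in answers.items() if other != m]
            revised = f"{prompt}\nOther agents answered: {others}\nRevise your answer."
            answers[m] = sample(m, revised)
    # Final answer: majority vote across the agents' last responses.
    return Counter(answers.values()).most_common(1)[0][0]
```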
python src/Evaluation.py \
--Evaluation "RM" \
--dataset "MathBench"
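If `RM` stands for reward-model reranking (an assumption based on the flag name), the evaluation reduces to best-of-N selection: sample N answers and keep the one a reward model scores highest. A minimal sketch with a hypothetical `reward_model(prompt, answer) -> float` scorer:

```python
def best_of_n(prompt, model, reward_model, n=16):
    """Sketch of reward-model (best-of-N) selection, assuming the hypothetical
    `sample` helper above and a scoring function `reward_model`."""
    answers = [sample(model, prompt) for _ in range(n)]
    return max(answers, key=lambda a: reward_model(prompt, a))
```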
@article{chen2025we,
  title={Do We Truly Need So Many Samples? Multi-LLM Repeated Sampling Efficiently Scales Test-Time Compute},
  author={Chen, Jianhao and Xun, Zishuo and Zhou, Bocheng and Qi, Han and Zhang, Qiaosheng and Chen, Yang and Hu, Wei and Qu, Yuzhong and Ouyang, Wanli and Hu, Shuyue},
  journal={arXiv preprint arXiv:2504.00762},
  year={2025}
}