This repository contains the official implementation of Selective Entropy Regularization (SIREN), introduced in our paper: Rethinking Entropy Regularization in Large Reasoning Models.
SIREN addresses entropy collapse that arises in Reinforcement Learning with Verifiable Rewards (RLVR) when naive entropy regularization is applied to large reasoning models. Built on the veRL framework, our implementation introduces targeted modifications to entropy computation, entropy aggregation, and the overall training objective.
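To illustrate the general idea of *selective* entropy regularization (as opposed to applying an entropy bonus uniformly to every token), here is a minimal, hypothetical sketch. The quantile-based token mask and the `coeff` hyperparameter below are illustrative assumptions, not the paper's exact method; see the implementation and the paper for the actual selection rule and objective.

```python
import torch

def selective_entropy_bonus(logits, threshold_quantile=0.8, coeff=1e-3):
    """Illustrative sketch: entropy bonus restricted to high-entropy tokens.

    The quantile-based selection rule and `coeff` are hypothetical defaults,
    not SIREN's actual configuration.
    """
    # Per-token entropy of the policy distribution: H = -sum_v p(v) log p(v).
    logp = torch.log_softmax(logits, dim=-1)
    entropy = -(logp.exp() * logp).sum(dim=-1)  # shape: (batch, seq_len)

    # Select only the highest-entropy tokens (assumed selection rule).
    cutoff = torch.quantile(entropy, threshold_quantile)
    mask = (entropy >= cutoff).float()

    # Aggregate over the selected tokens only, then scale by the coefficient.
    # This bonus would be *added* to the policy objective (i.e., subtracted
    # from the loss) to counteract entropy collapse.
    return coeff * (entropy * mask).sum() / mask.sum().clamp(min=1.0)
```

The intuition is that regularizing only the tokens where the policy is already uncertain avoids indiscriminately inflating entropy everywhere, which is what makes naive entropy regularization unstable in RLVR.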
We recommend creating a clean conda environment to avoid dependency conflicts.
```bash
conda create -n siren python=3.10
conda activate siren
pip install -r requirements.txt

# install veRL
cd verl
pip install -e .
```

Download the training data:

```bash
huggingface-cli download --repo-type dataset --resume-download Elliott/Openr1-Math-46k-8192 --local-dir data
```

We provide example scripts for both training and evaluation.
```bash
# training
bash exp_scripts/siren.sh

# evaluation
bash exp_scripts/eval.sh
```
- The training script (`exp_scripts/siren.sh`) contains default hyperparameters and can be customized to your experimental setup.
We thank the open-source communities behind the following projects for their valuable contributions:
- Frameworks: veRL, vLLM, Math-Verify
- Datasets: MATH, NuminaMath, OpenR1-Math-220k
- Backbones: Qwen2.5-Math, Llama-3.1
If you find our work useful in your research, please consider citing:
```bibtex
@misc{jiang2025rethinkingentropyregularizationlarge,
  title={Rethinking Entropy Regularization in Large Reasoning Models},
  author={Yuxian Jiang and Yafu Li and Guanxu Chen and Dongrui Liu and Yu Cheng and Jing Shao},
  year={2025},
  eprint={2509.25133},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2509.25133},
}
```