PRIME-RL

P1 Public

P1: Mastering Physics Olympiads with Reinforcement Learning

SimpleVLA-RL Public

[ICLR 2026] SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning

Python 1.7k 112

Entropy-Mechanism-of-RL Public

The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning.

Python 442 15

RL-Compositionality Public

From $f(x)$ and $g(x)$ to $f(g(x))$: LLMs Learn New Skills in RL by Composing Old Ones

Python 68 7

TTRL Public

[NeurIPS 2025] TTRL: Test-Time Reinforcement Learning

Python 1.1k 83

PRIME Public

Scalable RL solution for advanced reasoning of language models

Python 1.9k 112
