RePo: Language Models with Context Re-Positioning

A lightweight module that allows LLMs to re-structure the context adaptively.
arXiv · Hugging Face · Live Demo


Table of Contents
  1. News
  2. Abstract
  3. Installation
  4. Usage
  5. Training
  6. Citation
  7. Acknowledgments

🔥 News

  • [2025.12] Our demo is now running on huggingface/spaces.
  • [2025.12] We added an interactive demo to visualize the assigned positions. Please find it in ./visual!
  • [2025.12] We have released the training code and evaluation scripts!
  • [2025.12] Pre-trained models (based on OLMo-2 1B) are now available on Hugging Face.
  • [2025.12] The paper "RePo: Language Models with Context Re-Positioning" is released on arXiv.

🧩 Abstract

In-context learning is fundamental to modern Large Language Models (LLMs); however, prevailing architectures impose a rigid and fixed contextual structure by assigning linear or constant positional indices. Drawing on Cognitive Load Theory (CLT), we argue that this uninformative structure increases extraneous cognitive load, consuming finite working memory capacity that should be allocated to deep reasoning and attention allocation. To address this, we propose RePo, a novel mechanism that reduces extraneous load via context re-positioning. Unlike standard approaches, RePo utilizes a differentiable module, $f_\phi$, to assign token positions that capture contextual dependencies, rather than relying on a pre-defined integer range. By continually pre-training on the OLMo-2 1B backbone, we demonstrate that RePo significantly enhances performance on tasks involving noisy contexts, structured data, and longer context lengths, while maintaining competitive performance on general short-context tasks. Detailed analysis reveals that RePo successfully allocates higher attention to distant but relevant information, assigns positions in a dense and non-linear space, and captures the intrinsic structure of the input context.

This is the initial repository for the research project RePo. Please feel free to open issues if you have any questions or find any mistakes.
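To make the core idea concrete, below is a minimal, illustrative PyTorch sketch of a differentiable re-positioning module. It is not the architecture used in the paper: the module name ContextRePositioner, the softplus-and-cumsum parameterization, and the way the float positions would be handed to RoPE are assumptions made purely for illustration.

import torch
import torch.nn as nn

# Illustrative f_phi (NOT the paper's implementation): read per-token hidden
# states and predict a continuous position for every token, replacing the
# usual fixed 0..N-1 integer indices that would be fed to rotary embeddings.
class ContextRePositioner(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        self.scorer = nn.Linear(hidden_size, 1)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_size)
        # Predict a non-negative per-token step and accumulate it, so positions
        # remain ordered while their spacing is learned rather than fixed to 1.
        step = torch.nn.functional.softplus(self.scorer(hidden_states)).squeeze(-1)
        return torch.cumsum(step, dim=-1)  # (batch, seq_len), float positions

# Toy usage: these float positions could be passed to RoPE in place of
# torch.arange(seq_len), since rotary embeddings accept real-valued positions.
hidden = torch.randn(2, 16, 512)
positions = ContextRePositioner(512)(hidden)
print(positions.shape)  # torch.Size([2, 16])

The actual $f_\phi$ is learned jointly with the backbone during the continual pre-training described above; please refer to the paper for its real formulation.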

🛠️ Installation

  1. Clone the repository

    git clone https://github.com/SakanaAI/repo
    cd repo
  2. Setup for Evaluation (a quick sanity check for this environment is shown after the list)

    # We tested this setup on H100 and 6000Ada
    # in ./repo
    conda create -n olmes python=3.11
    conda activate olmes
    
    ### Important: uncomment the two lines below only if your CUDA version is newer than 12.4; this is critical for compiling vLLM
    # conda install -c nvidia/label/cuda-12.4.0 cuda-toolkit
    # pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu124
    
    ### install torch
    pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0
    
    ### install vLLM with RePo
    cd vllm
    python use_existing_torch.py
    pip install -r requirements/build.txt
    mkdir -p vllm/vllm_flash_attn
    pip install -e . --no-build-isolation
    
    ### install transformers with RePo
    cd ../transformers
    pip install -e '.[torch]' --no-build-isolation
    
    ### install test suites
    cd ../olmes
    pip install -e . --no-build-isolation
  3. Setup for Training

    # We tested this setup on H100
    # in ./repo
    cd OLMo
    
    ### install OLMo with RePo
    conda env create -f environment.yml
    conda activate olmo
    pip install flash-attn==2.7.4.post1
    pip install -e .[all]
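
After step 2, a quick, generic sanity check (not part of this repository) can be run inside the olmes environment to confirm that the PyTorch build and the vLLM/transformers forks are importable:

# sanity_check.py — generic check (not from the repo) for the olmes environment.
import torch
import transformers
import vllm

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("transformers:", transformers.__version__)
print("vllm:", vllm.__version__)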

💻 Usage

Quick Inference

Please download the checkpoints from Hugging Face in advance:

cd olmes
bash eval_ruler.sh
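
If you only want a quick generation with the transformers fork installed above (outside the OLMES evaluation harness), a minimal sketch follows. The model id below is a placeholder, not the actual checkpoint name; use the id from the Hugging Face release.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SakanaAI/repo-olmo2-1b"  # placeholder: replace with the released checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" requires accelerate; drop it to load on the default device.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

prompt = "In-context learning is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))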

🏋️ Training

Please take a look at the script OLMo/batch_run_stage2_1b.sh; you need to replace the stage-2 data placeholder with your real data path, following the OLMo instructions.

cd OLMo
SLURM_ARRAY_TASK_ID=2 bash batch_run_stage2_1b.sh -d $YOUR_DATA_DIR

📜 Citation

If you find this project useful, please cite our paper:

@article{sakana2025repo,
  title={RePo: Language Models with Context Re-Positioning},
  author={Huayang Li and Tianyu Zhao and Richard Sproat},
  year={2025},
  eprint={2512.14391},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2512.14391},
}

🙏 Acknowledgments
