Implementation for the paper "StraTA: Incentivizing Agentic Reinforcement Learning with Strategic Trajectory Abstraction".
In this work, we present Strategic Trajectory Abstraction (StraTA), a simple framework that introduces an explicit trajectory-level strategy into agentic reinforcement learning (RL). StraTA samples a compact strategy from the initial task state, conditions subsequent actions on that strategy, and trains strategy generation and action execution jointly with a hierarchical GRPO-style rollout design, further enhanced by diverse strategy rollout and critical self-judgment.
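As a rough illustration of the hierarchical GRPO-style rollout described above, the sketch below computes group-relative advantages at two levels: across strategies sampled for the same task, and across action rollouts that share one strategy. It is a minimal sketch under stated assumptions, not the code from this repository. The function names, the `returns[k][m]` layout, and the choice to score a strategy by the mean return of its rollouts are all hypothetical; diverse strategy rollout and critical self-judgment are not modeled here.

```python
from statistics import mean, pstdev


def group_normalize(values, eps=1e-8):
    """GRPO-style normalization: subtract the group mean, divide by the group std."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / (sigma + eps) for v in values]


def hierarchical_advantages(returns):
    """Compute two levels of group-relative advantages.

    returns[k][m] is the episode return of action rollout m under strategy k
    (hypothetical layout, chosen only for this illustration).
    """
    # Assumption: a strategy is scored by the mean return of its rollouts.
    strategy_scores = [mean(group) for group in returns]
    # Strategy level: compare strategies sampled for the same task.
    strategy_adv = group_normalize(strategy_scores)
    # Action level: compare rollouts that were conditioned on the same strategy.
    action_adv = [group_normalize(group) for group in returns]
    return strategy_adv, action_adv


# Three strategies for one task, two action rollouts each (made-up returns).
returns = [[1.0, 0.0], [1.0, 1.0], [0.0, 0.0]]
strategy_adv, action_adv = hierarchical_advantages(returns)
```

In this toy input, the second strategy succeeds in both rollouts and so receives a positive strategy-level advantage, while its two rollouts get zero action-level advantage because they are indistinguishable within their group.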
Our implementation is based on the rLLM framework. Follow the tutorial to set up the framework.
Our implementation supports three environments based on the AgentGym codebase. First set up the environment and prepare the dataset. Refer to the following tutorials for usage:
- ALFWorld: a text-based embodied household environment.
- WebShop: a web-based online shopping environment.
- SciWorld: a text-based scientific experimentation environment.
The model checkpoints are available on HuggingFace. You can download them for evaluation.
Please consider citing our paper if you find it helpful:
@article{xue2026strata,
  title={StraTA: Incentivizing Agentic Reinforcement Learning with Strategic Trajectory Abstraction},
  author={Xue, Xiangyuan and Zhou, Yifan and Wang, Zidong and Tang, Shengji and Torr, Philip and Ouyang, Wanli and Bai, Lei and Yin, Zhenfei},
  journal={arXiv preprint arXiv:2605.06642},
  year={2026}
}