cvoelcker/rsl_rl

 
 

RSL-RL ❤️ REPPO

Contains a robotics-friendly implementation of the REPPO algorithm.


A fast and simple implementation of learning algorithms for robotics. For an overview of the library please have a look at https://arxiv.org/pdf/2509.10771.

Environment repositories using the framework:

The library currently supports PPO and Student-Teacher Distillation with additional features from our research. These include:

We welcome contributions from the community. Please check our contribution guidelines for more information.

Maintainers: Mayank Mittal and Clemens Schwarke; REPPO maintainer: Claas Voelcker
Affiliation: Robotic Systems Lab, ETH Zurich & NVIDIA
Contact: cschwarke@ethz.ch

News

[2026/02/17] Bugs should be cleared and we are releasing some training tips for REPPO + Isaac.

[2026/01/30] We found two major bugs in the implementation, which are currently being fixed.

Setup

The package can be installed via PyPI with:

pip install rsl-rl-lib

or by cloning this repository and installing it with:

git clone https://github.com/leggedrobotics/rsl_rl
cd rsl_rl
pip install -e .

The package supports the following logging frameworks which can be configured through logger:

For a demo configuration of PPO, please check the example_config.yaml file.

REPPO + Isaac Training Tips

Maximum-entropy RL algorithms struggle on some Isaac tasks because the action space is not bounded to [-1, 1]. We provide simple utilities for scaling the tanh Gaussian between arbitrary lower and upper action bounds per dimension. To use them, set the lower and upper action bounds in the actor. We have found that scaling the actions in the actor, rather than in the environment wrapper, leads to more stable training.
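To illustrate the idea of a tanh Gaussian rescaled to per-dimension bounds, here is a minimal standard-library sketch. It is not the utility shipped in this repository: the function names (`squash_to_bounds`, `log_prob_scaled_tanh`, `sample_scaled_tanh_gaussian`) are hypothetical and chosen only for this example. It shows the affine map from the tanh output to `[low, high]` and the matching change-of-variables correction to the log-probability, which is what an entropy term would be computed from.

```python
import math
import random


def _softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def squash_to_bounds(u, lo, hi):
    # Map an unbounded value u to [lo, hi] via tanh and an affine rescale.
    return lo + 0.5 * (hi - lo) * (math.tanh(u) + 1.0)


def log_prob_scaled_tanh(u, mu, sigma, lo, hi):
    # Log-density of the bounded action a = squash_to_bounds(u, lo, hi),
    # where u ~ Normal(mu, sigma). Hypothetical helper for illustration.
    log_p_u = (
        -0.5 * ((u - mu) / sigma) ** 2
        - math.log(sigma)
        - 0.5 * math.log(2.0 * math.pi)
    )
    # da/du = 0.5 * (hi - lo) * (1 - tanh(u)^2), with
    # log(1 - tanh(u)^2) = 2 * (log 2 - u - softplus(-2u)).
    log_det = math.log(0.5 * (hi - lo)) + 2.0 * (
        math.log(2.0) - u - _softplus(-2.0 * u)
    )
    return log_p_u - log_det


def sample_scaled_tanh_gaussian(mean, std, low, high, rng=random):
    # Draw one action with independent dimensions; return (action, log_prob).
    action, log_prob = [], 0.0
    for mu, sigma, lo, hi in zip(mean, std, low, high):
        u = rng.gauss(mu, sigma)  # pre-squash Gaussian sample
        action.append(squash_to_bounds(u, lo, hi))
        log_prob += log_prob_scaled_tanh(u, mu, sigma, lo, hi)
    return action, log_prob
```

Because the rescaling happens inside the distribution, the log-probability already accounts for the width of each action dimension, which is one plausible reason why scaling in the actor behaves differently from scaling in an environment wrapper.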

It is also possible to use a non-squashed Normal distribution. In this case, the initial temperature alpha needs to be set lower to prevent the entropy bonus term from diverging. In addition, the reward is tuned for significantly smaller step sizes, as the standard RSL-RL PPO implementation constrains the KL divergence per update iteration to 0.01. Setting the desired_kl parameter for REPPO to this value can stabilize learning, especially when using non-squashed distributions.
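To make the 0.01 KL target concrete, the following sketch computes the KL divergence between an old and a new diagonal Gaussian policy and compares it to a target. It is an illustration only: the function names are hypothetical, the learning-rate rule is a generic adaptive scheme with illustrative constants, and it is not claimed to be how REPPO itself consumes `desired_kl`.

```python
import math


def diag_gaussian_kl(mu_old, std_old, mu_new, std_new):
    # KL(old || new) for Gaussians with independent dimensions.
    return sum(
        math.log(s_n / s_o) + (s_o**2 + (m_o - m_n) ** 2) / (2.0 * s_n**2) - 0.5
        for m_o, s_o, m_n, s_n in zip(mu_old, std_old, mu_new, std_new)
    )


def adapt_learning_rate(lr, kl, desired_kl=0.01, factor=1.5,
                        lr_min=1e-5, lr_max=1e-2):
    # Shrink the step when the policy moved too far, grow it when it
    # barely moved. All constants here are assumptions for illustration.
    if kl > 2.0 * desired_kl:
        return max(lr_min, lr / factor)
    if 0.0 < kl < 0.5 * desired_kl:
        return min(lr_max, lr * factor)
    return lr


# A mean shift of 0.2 standard deviations in one dimension already gives a
# KL of about 0.02, twice the 0.01 target.
kl = diag_gaussian_kl([0.0], [1.0], [0.2], [1.0])
```

The example shows how small a policy change a 0.01 KL budget permits, which is the scale that the text above says the rewards were tuned for.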

Finally, we observed that the curricula used in some Isaac environments have a strong influence on REPPO performance. In general, REPPO learns a strong policy much faster than PPO, but struggles to adapt to the distribution shift introduced by the curriculum.

Contribution Guidelines

For documentation, we adopt the Google Style Guide for docstrings. Please make sure that your code is well-documented and follows the guidelines.

We use the following tools for maintaining code quality:

  • pre-commit: Runs a list of formatters and linters over the codebase.
  • ruff: An extremely fast Python linter and code formatter, written in Rust.

Please check here for instructions to set these up. To run over the entire repository, please execute the following command in the terminal:

# for installation (only once)
pre-commit install
# for running
pre-commit run --all-files

Citing

If you use this library for your research, please cite the following work:

@article{schwarke2025rslrl,
  title={RSL-RL: A Learning Library for Robotics Research},
  author={Schwarke, Clemens and Mittal, Mayank and Rudin, Nikita and Hoeller, David and Hutter, Marco},
  journal={arXiv preprint arXiv:2509.10771},
  year={2025}
}

If you use the library with curiosity-driven exploration (random network distillation), please cite:

@InProceedings{schwarke2023curiosity,
  title = 	 {Curiosity-Driven Learning of Joint Locomotion and Manipulation Tasks},
  author =       {Schwarke, Clemens and Klemm, Victor and Boon, Matthijs van der and Bjelonic, Marko and Hutter, Marco},
  booktitle = 	 {Proceedings of The 7th Conference on Robot Learning},
  pages = 	 {2594--2610},
  year = 	 {2023},
  volume = 	 {229},
  series = 	 {Proceedings of Machine Learning Research},
  publisher =    {PMLR},
  url = 	 {https://proceedings.mlr.press/v229/schwarke23a.html},
}

If you use the library with symmetry augmentation, please cite:

@InProceedings{mittal2024symmetry,
  author={Mittal, Mayank and Rudin, Nikita and Klemm, Victor and Allshire, Arthur and Hutter, Marco},
  booktitle={2024 IEEE International Conference on Robotics and Automation (ICRA)},
  title={Symmetry Considerations for Learning Task Symmetric Robot Policies},
  year={2024},
  pages={7433-7439},
  doi={10.1109/ICRA57147.2024.10611493}
}
