safe-control-gym

Physics-based CartPole and Quadrotor Gym environments (using PyBullet) with symbolic a priori dynamics (using CasADi) for learning-based control, and model-free and model-based reinforcement learning (RL).

These environments include (and evaluate) symbolic safety constraints and implement input, parameter, and dynamics disturbances to test the robustness and generalizability of control approaches. [PDF]

@article{brunke2021safe,
         title={Safe Learning in Robotics: From Learning-Based Control to Safe Reinforcement Learning},
         author={Lukas Brunke and Melissa Greeff and Adam W. Hall and Zhaocong Yuan and Siqi Zhou and Jacopo Panerati and Angela P. Schoellig},
         journal = {Annual Review of Control, Robotics, and Autonomous Systems},
         year={2021},
         url = {https://arxiv.org/abs/2108.06266}}

To reproduce the results in the article, see branch ar.

@article{yuan2021safecontrolgym,
  author={Yuan, Zhaocong and Hall, Adam W. and Zhou, Siqi and Brunke, Lukas and Greeff, Melissa and Panerati, Jacopo and Schoellig, Angela P.},
  journal={IEEE Robotics and Automation Letters},
  title={Safe-Control-Gym: A Unified Benchmark Suite for Safe Learning-Based Control and Reinforcement Learning in Robotics},
  year={2022},
  volume={7},
  number={4},
  pages={11142-11149},
  doi={10.1109/LRA.2022.3196132}}

To reproduce the results in the article, see branch submission.

Install on Ubuntu/macOS

Clone repo

git clone https://github.com/utiasDSL/safe-control-gym.git
cd safe-control-gym

(optional) Create a `conda` environment

Create and access a Python 3.10 environment using conda

conda create -n safe python=3.10
conda activate safe

Install

Install the safe-control-gym repository

python -m pip install --upgrade pip
python -m pip install -e .

Note

You may need to separately install gmp, a dependency of pycddlib:

conda install -c anaconda gmp

or

sudo apt-get install libgmp-dev

(optional) Additional requirements for MPC

You may need to separately install acados for fast MPC implementations.

To build and install acados, see their installation guide.
To set up the acados Python interface, check out these installation steps.

Architecture

Overview of safe-control-gym's API:

Configuration

Getting Started

Familiarize yourself with the APIs and environments using the scripts in examples/.

3D Quadrotor Lemniscate Trajectory Tracking with PID

cd ./examples/   # Navigate to the examples folder
python3 pid/pid_experiment.py \
    --algo pid \
    --task quadrotor \
    --overrides \
        ./pid/config_overrides/quadrotor_3D/quadrotor_3D_tracking.yaml
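For readers unfamiliar with the term, a lemniscate is a figure-eight curve. The sketch below samples one period of a planar figure-eight (the lemniscate of Gerono) as reference waypoints. It is only an illustration: the function name `lemniscate_waypoints`, the `scale` parameter, and the choice of parametrization are assumptions, and the trajectory generated by the example above may be defined differently.

```python
import math


def lemniscate_waypoints(num_points, scale=1.0):
    """Sample one period of a planar figure-eight (lemniscate of Gerono).

    Hypothetical helper for illustration; not part of safe-control-gym.
    """
    waypoints = []
    for k in range(num_points):
        t = 2.0 * math.pi * k / num_points  # curve parameter in [0, 2*pi)
        x = scale * math.sin(t)
        y = scale * math.sin(t) * math.cos(t)
        waypoints.append((x, y))
    return waypoints


points = lemniscate_waypoints(num_points=8, scale=1.0)
for x, y in points:
    print(f"{x:+.3f} {y:+.3f}")
```

The curve starts at the origin, reaches its rightmost point a quarter of the way through the period, and crosses the origin again at the halfway point.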

Cartpole Stabilization with LQR

cd ./examples/   # Navigate to the examples folder
python3 lqr/lqr_experiment.py \
    --algo lqr \
    --task cartpole \
    --overrides \
        ./lqr/config_overrides/cartpole/cartpole_stabilization.yaml \
        ./lqr/config_overrides/cartpole/lqr_cartpole_stabilization.yaml

2D Quadrotor Trajectory Tracking with PPO

cd ./examples/rl/   # Navigate to the RL examples folder
python3 rl_experiment.py \
    --algo ppo \
    --task quadrotor \
    --overrides \
        ./config_overrides/quadrotor_2D/quadrotor_2D_track.yaml \
        ./config_overrides/quadrotor_2D/ppo_quadrotor_2D.yaml \
    --kv_overrides \
        algo_config.training=False
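The `--kv_overrides` argument above passes a dotted key and a value (`algo_config.training=False`). As a rough mental model, such an assignment can be read as setting one entry in a nested configuration dictionary. The sketch below shows that idea in plain Python; `apply_kv_override` is a hypothetical helper written for this illustration, not the parsing code used by safe-control-gym, whose behavior may differ.

```python
import ast


def apply_kv_override(config, assignment):
    """Set a nested entry from a string such as 'algo_config.training=False'.

    Hypothetical helper for illustration; not the library's implementation.
    """
    dotted_key, raw_value = assignment.split("=", 1)
    keys = dotted_key.split(".")
    node = config
    for key in keys[:-1]:
        # Walk down the nested dicts, creating levels that do not exist yet.
        node = node.setdefault(key, {})
    try:
        # Interpret Python literals such as False, 3, or 0.5.
        value = ast.literal_eval(raw_value)
    except (ValueError, SyntaxError):
        # Anything else is kept as a plain string.
        value = raw_value
    node[keys[-1]] = value
    return config


config = {"algo": "ppo", "algo_config": {"training": True}}
apply_kv_override(config, "algo_config.training=False")
print(config)
```

Under this reading, the command above loads the two YAML override files and then switches the `training` entry of `algo_config` off.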

Verbose API Example

cd ./examples/   # Navigate to the examples folder
python3 no_controller/verbose_api.py \
    --task cartpole \
    --overrides no_controller/verbose_api.yaml

List of Implemented Controllers

List of Implemented Safety Filters

Performance

We compare the sample efficiency of safe-control-gym with the original OpenAI Cartpole and PyBullet Gym's Inverted Pendulum, as well as gym-pybullet-drones. We use the default physics simulation integration step of each project. We report performance results for open-loop, random action inputs. Note that the Bullet engine frequency reported for safe-control-gym is typically much finer grained for improved fidelity. The safe-control-gym quadrotor environment is not as lightweight as gym-pybullet-drones but provides the same order of magnitude speed-up and several more safety features/symbolic models.

Environment	GUI	Control Freq.	PyBullet Freq.	Constraints & Disturbances^	Speed-Up^^
Gym cartpole	True	50Hz	N/A	No	1.16x
InvPenPyBulletEnv	False	60Hz	60Hz	No	158.29x
cartpole	True	50Hz	50Hz	No	0.85x
cartpole	False	50Hz	1000Hz	No	24.73x
cartpole	False	50Hz	1000Hz	Yes	22.39x

gym-pyb-drones	True	48Hz	240Hz	No	2.43x
gym-pyb-drones	False	50Hz	1000Hz	No	21.50x
quadrotor	True	60Hz	240Hz	No	0.74x
quadrotor	False	50Hz	1000Hz	No	9.28x
quadrotor	False	50Hz	1000Hz	Yes	7.62x

^ Whether the environment includes a default set of constraints and disturbances

^^ Speed-up = Elapsed Simulation Time / Elapsed Wall Clock Time; on a 2.30GHz Quad-Core i7-1068NG7 with 32GB 3733MHz LPDDR4X; no GPU
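The two quantities in the table can be related with a few lines of arithmetic. The sketch below computes how many PyBullet steps fit in one control step and evaluates the speed-up definition given above. The function names and the step count and wall-clock time in the usage lines are hypothetical values chosen for illustration, not measurements from the table.

```python
def physics_steps_per_control_step(ctrl_freq_hz, pyb_freq_hz):
    """Number of physics steps simulated for each control input."""
    if pyb_freq_hz % ctrl_freq_hz != 0:
        raise ValueError("PyBullet frequency must be a multiple of the control frequency")
    return pyb_freq_hz // ctrl_freq_hz


def speed_up(num_ctrl_steps, ctrl_freq_hz, wall_clock_s):
    """Speed-up = elapsed simulation time / elapsed wall-clock time."""
    simulated_s = num_ctrl_steps / ctrl_freq_hz
    return simulated_s / wall_clock_s


# 50Hz control with 1000Hz PyBullet, as in the headless rows of the table.
print(physics_steps_per_control_step(50, 1000))  # 20

# Hypothetical run: 5000 control steps at 50Hz (100 s simulated) in 4 s of wall-clock time.
print(speed_up(num_ctrl_steps=5000, ctrl_freq_hz=50, wall_clock_s=4.0))  # 25.0
```

A speed-up below 1x, as in the GUI rows, means the simulation runs slower than real time.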

Run Tests and Linting

Tests can be run locally by executing:

python3 -m pytest ./tests/  # Run all tests

Linting can be run locally with:

pre-commit install  # Install the pre-commit hooks
pre-commit autoupdate  # Auto-update the version of the hooks
pre-commit run --all  # Run the hooks on all files

References

Brunke, L., Greeff, M., Hall, A. W., Yuan, Z., Zhou, S., Panerati, J., & Schoellig, A. P. (2022). Safe learning in robotics: From learning-based control to safe reinforcement learning. Annual Review of Control, Robotics, and Autonomous Systems, 5, 411-444.
Yuan, Z., Hall, A. W., Zhou, S., Brunke, L., Greeff, M., Panerati, J., & Schoellig, A. P. (2022). safe-control-gym: A unified benchmark suite for safe learning-based control and reinforcement learning in robotics. IEEE Robotics and Automation Letters, 7(4), 11142-11149.

Related Open-source Projects

gym-pybullet-drones: single and multi-quadrotor environments
stable-baselines3: PyTorch reinforcement learning algorithms
bullet3: multi-physics simulation engine
gym: OpenAI reinforcement learning toolkit
casadi: symbolic framework for numeric optimization
safety-gym: environments for safe exploration in RL
realworldrl_suite: real-world RL challenge framework
gym-marl-reconnaissance: multi-agent heterogeneous (UAV/UGV) environments

University of Toronto's Dynamic Systems Lab / Vector Institute for Artificial Intelligence
