feat(env): add mujoco-warp support by rainstormstudio · Pull Request #1096 · RLinf/RLinf

rainstormstudio · 2026-04-24T21:16:12Z

This PR adds MuJoCo-Warp environments to RLinf, providing GPU-accelerated physics simulation through mujoco-warp.

Description

Two environments are currently included:

CartPole (mujoco_warp_cartpole_env.py): classic cart-pole balancing with GPU-accelerated simulation
CubePick (mujoco_warp_cubepick_env.py): Franka Emika Panda robot arm picking a cube

Note: This is a work-in-progress PR. The current environments still need refactoring into a unified mujoco_warp_env base, and more tasks need to be added for OpenVLA.

Motivation and Context

Support GPU-accelerated physics simulation within RLinf with mujoco-warp.

How has this been tested?

Environment episodes run successfully with random actions and trained policies
PPO+MLP training converges on both CartPole and CubePick tasks

Roadmap

CartPole PPO MLP successfully trains
CubePick PPO MLP successfully trains
Refactor into a unified mujoco_warp_env base for general use
CubePick OpenVLA successfully trains

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Documentation update (Document-only update)
Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist:

My code follows the code style of this project.
My change requires a change to the documentation.
I have updated the documentation accordingly.
I have added tests to cover my changes.
All new and existing tests passed.

Iron-Wph · 2026-04-26T06:24:12Z

@rainstormstudio
Hello, thank you for your contribution. Below are some suggestions for improvement:

Your work introduces a new feature. Please add corresponding documentation in both Chinese and English, including environment setup, configuration details, usage instructions, and experimental results.
Please remove the assets from this repository and require users to download them manually instead of including them in the PR. Correspondingly, provide clear instructions in the documentation on how to download these assets.
Please refer to the following workflow to configure end-to-end tests:
https://github.com/RLinf/RLinf/blob/main/.github/workflows/embodied-e2e-tests.yml
Also, add a corresponding test YAML file under tests/e2e_tests/embodied in this repository for minimal validation.
If there are training or evaluation results, please include the corresponding result curves or visualizations in the PR description.
Before submitting new commits, please run pre-commit to perform code checks and ensure code quality.

Extract shared GPU simulation lifecycle, CUDA graph management, rendering, metrics, and auto-reset into a new `MuJoCoWarpEnv` base class. Move CartPole and CubePick task-specific logic into `tasks/cartpole.py` and `tasks/cubepick.py` respectively. Update `__init__.py` registration and entry points to reference the renamed classes.
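The split described in this commit can be pictured with a minimal, CPU-only sketch. It is not the PR's code: `BaseEnvSketch`, `ToyCartTask`, `reset_one`, and `step_one` are hypothetical names, and the GPU simulation, CUDA graph handling, rendering, and metrics are left out entirely. It only shows the idea of a base class owning stepping, time limits, and auto-reset while a task supplies its own dynamics.

```python
from abc import ABC, abstractmethod


class BaseEnvSketch(ABC):
    """Hypothetical stand-in for a shared base: owns stepping, time limits, auto-reset."""

    def __init__(self, num_envs, max_episode_steps):
        self.num_envs = num_envs
        self.max_episode_steps = max_episode_steps
        self.elapsed = [0] * num_envs
        self.states = [self.reset_one() for _ in range(num_envs)]

    @abstractmethod
    def reset_one(self):
        """Return the initial state of one environment."""

    @abstractmethod
    def step_one(self, state, action):
        """Return (next_state, reward, terminated) for one environment."""

    def step(self, actions):
        rewards, dones = [], []
        for i, action in enumerate(actions):
            self.states[i], reward, terminated = self.step_one(self.states[i], action)
            self.elapsed[i] += 1
            done = terminated or self.elapsed[i] >= self.max_episode_steps
            if done:  # auto-reset: a finished environment restarts immediately
                self.states[i] = self.reset_one()
                self.elapsed[i] = 0
            rewards.append(reward)
            dones.append(done)
        return list(self.states), rewards, dones


class ToyCartTask(BaseEnvSketch):
    """Task-specific part only: a cart on a 1-D track that fails once |x| > 2."""

    def reset_one(self):
        return 0.0

    def step_one(self, state, action):
        x = state + action
        return x, 1.0, abs(x) > 2.0
```

With this layout a new task only implements `reset_one` and `step_one`; the reset and truncation bookkeeping lives in one place.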

Introduce `MuJoCoWarpOffloadEnv` that runs the underlying environment in a subprocess so GPU memory can be freed between rollout phases. Extend the base `MuJoCoWarpEnv` with state serialisation hooks and register the offload path in `get_env_cls()` when `enable_offload` is set in config.
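As a rough illustration of the offload idea, the sketch below keeps a toy environment in a child process, pulls its state back to the parent on offload, and restores it in a fresh process on the next onload. Everything here is assumed for illustration: `CounterEnv`, `OffloadEnvSketch`, `get_state`, `set_state`, `onload`, and `offload` are hypothetical names, not the PR's API, and no GPU memory is involved. The "fork" start method is used only so the sketch runs on POSIX without a `__main__` guard; a CUDA-backed environment would probably need "spawn" instead.

```python
import multiprocessing as mp


class CounterEnv:
    """Toy stand-in for a GPU environment; its whole state is one integer."""

    def __init__(self):
        self.steps = 0

    def step(self):
        self.steps += 1
        return self.steps

    def get_state(self):  # hypothetical serialisation hook
        return {"steps": self.steps}

    def set_state(self, state):  # hypothetical restore hook
        self.steps = state["steps"]


def _worker(conn, state):
    """Child process: owns the environment and serves commands until offloaded."""
    env = CounterEnv()
    if state is not None:
        env.set_state(state)
    while True:
        cmd = conn.recv()
        if cmd == "step":
            conn.send(env.step())
        elif cmd == "offload":
            conn.send(env.get_state())  # hand the state back, then exit
            break
    conn.close()


class OffloadEnvSketch:
    def __init__(self):
        self._ctx = mp.get_context("fork")  # assumption: POSIX-only sketch
        self._state = None
        self._proc = None
        self._conn = None

    def onload(self):
        """Start a child process and restore any previously saved state."""
        parent, child = self._ctx.Pipe()
        self._proc = self._ctx.Process(
            target=_worker, args=(child, self._state), daemon=True
        )
        self._proc.start()
        child.close()
        self._conn = parent

    def step(self):
        self._conn.send("step")
        return self._conn.recv()

    def offload(self):
        """Save state in the parent and let the child exit, releasing its resources."""
        self._conn.send("offload")
        self._state = self._conn.recv()
        self._proc.join()
        self._conn.close()
        self._proc = None
```

A typical cycle under these assumptions is `onload()`, a number of `step()` calls during rollout, then `offload()` before the training phase; the step counter continues from where it left off after the next `onload()`.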

envs: Move the condition check for rendering images into shared `_should_render_obs` / `_maybe_add_render_to_obs` helpers, and add a new `render_for_policy` flag under `video_cfg` so observations can be rendered independently of video saving. Update the cartpole and cubepick tasks to use the helpers. Add a full OpenVLA PPO config for cubepick with `render_for_policy: True` on both train and eval.

Iron-Wph · 2026-05-08T12:13:04Z

@rainstormstudio Hello, thank you for your contribution. Below are some suggestions for improvement:

Your work introduces a new feature. Please add corresponding documentation in both Chinese and English, including environment setup, configuration details, usage instructions, and experimental results.

Please remove the assets from this repository and require users to download them manually instead of including them in the PR. Correspondingly, provide clear instructions in the documentation on how to download these assets.

Please refer to the following workflow to configure end-to-end tests:
https://github.com/RLinf/RLinf/blob/main/.github/workflows/embodied-e2e-tests.yml
Also, add a corresponding test YAML file under tests/e2e_tests/embodied in this repository for minimal validation.

If there are training or evaluation results, please include the corresponding result curves or visualizations in the PR description.

Before submitting new commits, please run pre-commit to perform code checks and ensure code quality.

@rainstormstudio Hello, could you please fix this problem?

feat(env): add mujoco-warp support with cartpole and cubepick

afb2a6c

rainstormstudio requested review from XuS1994, andylin-hao and guozhen1997 as code owners April 24, 2026 21:16

zoeyuchao assigned Iron-Wph Apr 25, 2026

zoeyuchao added Embodied AI Student Review Community PR labels Apr 25, 2026

rainstormstudio added 4 commits April 27, 2026 11:37

Merge branch 'RLinf:main' into mujoco_warp

40279c4

rainstormstudio marked this pull request as draft April 27, 2026 17:33

Merge branch 'main' into mujoco_warp

7bfdf60

rainstormstudio wants to merge 6 commits into RLinf:main from rainstormstudio:mujoco_warp


3 participants
