feat(env): add mujoco-warp support #1096

Draft
rainstormstudio wants to merge 6 commits into RLinf:main from rainstormstudio:mujoco_warp
@rainstormstudio

This PR adds MuJoCo-Warp environments to RLinf, providing GPU-accelerated physics simulation using mujoco-warp.

Description

Two environments are currently included:

  • CartPole (mujoco_warp_cartpole_env.py): classic cart-pole balancing with GPU-accelerated simulation
  • CubePick (mujoco_warp_cubepick_env.py): Franka Emika Panda robot arm picking a cube

Note: This is a work-in-progress PR. The current environments still need refactoring into a unified mujoco_warp_env base, and more tasks need to be added for OpenVLA.

Motivation and Context

Support GPU-accelerated physics simulation within RLinf with mujoco-warp.

How has this been tested?

  • Environment episodes run successfully with random actions and trained policies
  • PPO+MLP training converges on both CartPole and CubePick tasks

Roadmap

  • CartPole PPO MLP successfully trains
  • CubePick PPO MLP successfully trains
  • refactor into a unified mujoco_warp_env base for general use
  • CubePick OpenVLA successfully trains

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Documentation update (Document-only update)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@Iron-Wph

Collaborator

@rainstormstudio
Hello, thank you for your contribution. Below are some suggestions for improvement:

  1. Your work introduces a new feature. Please add corresponding documentation in both Chinese and English, including environment setup, configuration details, usage instructions, and experimental results.

  2. Please remove the assets from this repository and require users to download them manually instead of including them in the PR. Correspondingly, provide clear instructions in the documentation on how to download these assets.

  3. Please refer to the following workflow to configure end-to-end tests:
    https://github.com/RLinf/RLinf/blob/main/.github/workflows/embodied-e2e-tests.yml
    Also, add a corresponding test YAML file under tests/e2e_tests/embodied in this repository for minimal validation.

  4. If there are training or evaluation results, please include the corresponding result curves or visualizations in the PR description.

  5. Before submitting new commits, please run pre-commit to perform code checks and ensure code quality.

Extract shared GPU simulation lifecycle, CUDA graph management, rendering,
metrics, and auto-reset into a new `MuJoCoWarpEnv` base class. Move
CartPole and CubePick task-specific logic into `tasks/cartpole.py` and
`tasks/cubepick.py` respectively. Update `__init__.py` registration and
entry points to reference the renamed classes.
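
The base/task split described in this commit message can be illustrated with a minimal pure-Python sketch. It is not the PR's code: `BatchedEnvBase`, `ToyBalanceTask`, and the hook names (`initial_state`, `advance`, `reward`, `terminated`) are hypothetical, and plain lists stand in for GPU buffers. The point is only that stepping and auto-reset live once in the base, while each task supplies its own dynamics and reward.

```python
from abc import ABC, abstractmethod


class BatchedEnvBase(ABC):
    """Hypothetical stand-in for a shared base: owns stepping and auto-reset."""

    def __init__(self, num_envs: int, max_steps: int):
        self.num_envs = num_envs
        self.max_steps = max_steps
        self.steps = [0] * num_envs
        self.state = [self.initial_state() for _ in range(num_envs)]

    # Task-specific hooks, implemented by each task module (names are assumptions).
    @abstractmethod
    def initial_state(self) -> float: ...

    @abstractmethod
    def advance(self, state: float, action: float) -> float: ...

    @abstractmethod
    def reward(self, state: float) -> float: ...

    @abstractmethod
    def terminated(self, state: float) -> bool: ...

    def step(self, actions):
        rewards, dones = [], []
        for i, action in enumerate(actions):
            self.state[i] = self.advance(self.state[i], action)
            self.steps[i] += 1
            rewards.append(self.reward(self.state[i]))
            done = self.terminated(self.state[i]) or self.steps[i] >= self.max_steps
            dones.append(done)
            if done:
                # Auto-reset is handled here once, not re-implemented per task.
                self.state[i] = self.initial_state()
                self.steps[i] = 0
        return list(self.state), rewards, dones


class ToyBalanceTask(BatchedEnvBase):
    """Toy task: keep a scalar 'angle' within +/-1.0."""

    def initial_state(self) -> float:
        return 0.0

    def advance(self, state: float, action: float) -> float:
        return state + action

    def reward(self, state: float) -> float:
        return 1.0 - abs(state)

    def terminated(self, state: float) -> bool:
        return abs(state) > 1.0


env = ToyBalanceTask(num_envs=2, max_steps=3)
obs, rewards, dones = env.step([0.5, 2.0])
```

In this toy run the second sub-environment leaves the allowed range, so it is flagged as done and reset inside `step`, while the first continues untouched.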

Introduce `MuJoCoWarpOffloadEnv` that runs the underlying environment in
a subprocess so GPU memory can be freed between rollout phases. Extend
the base `MuJoCoWarpEnv` with state serialisation hooks and register the
offload path in `get_env_cls()` when `enable_offload` is set in config.
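
A rough sketch of the save/restore cycle that such state serialisation hooks make possible is given below. It is an assumption-laden illustration, not the PR's implementation: it stays in a single process and simply drops and rebuilds the environment object, whereas the commit describes running the environment in a subprocess. `ToySimEnv`, `OffloadWrapper`, `get_state`, `set_state`, `offload`, and `onload` are hypothetical names.

```python
import pickle


class ToySimEnv:
    """Stand-in for a GPU-backed env; `buffers` plays the role of device memory."""

    def __init__(self, num_envs: int):
        self.num_envs = num_envs
        self.buffers = [0.0] * num_envs
        self.step_count = 0

    def step(self, actions):
        self.buffers = [b + a for b, a in zip(self.buffers, actions)]
        self.step_count += 1
        return list(self.buffers)

    # Hypothetical serialisation hooks.
    def get_state(self) -> bytes:
        return pickle.dumps({"buffers": self.buffers, "step_count": self.step_count})

    def set_state(self, blob: bytes) -> None:
        data = pickle.loads(blob)
        self.buffers = data["buffers"]
        self.step_count = data["step_count"]


class OffloadWrapper:
    """Keeps only a serialised snapshot while the env is offloaded."""

    def __init__(self, num_envs: int):
        self._num_envs = num_envs
        self._env = ToySimEnv(num_envs)
        self._saved = None

    def offload(self) -> None:
        self._saved = self._env.get_state()
        self._env = None  # drop the live env so its memory can be reclaimed

    def onload(self) -> None:
        self._env = ToySimEnv(self._num_envs)
        self._env.set_state(self._saved)
        self._saved = None

    def step(self, actions):
        if self._env is None:
            self.onload()  # lazily rebuild from the snapshot
        return self._env.step(actions)


wrapper = OffloadWrapper(num_envs=2)
wrapper.step([1.0, 2.0])
wrapper.offload()                     # e.g. between rollout phases
resumed = wrapper.step([1.0, 1.0])    # continues from the saved state
```

The simulation resumes from the snapshot rather than from a fresh reset, which is the property the offload path needs regardless of whether the environment lives in the same process or a child process.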

Move the condition-check for rendering images into a shared
`_should_render_obs` / `_maybe_add_render_to_obs` helper, and add a new
`render_for_policy` flag under `video_cfg` so observations can be rendered
independently of video saving. Update the cartpole and cubepick tasks to
use the helper. Add a full OpenVLA PPO config for cubepick with
`render_for_policy: True` on both train and eval.
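
The rendering decision described above might look roughly like the following sketch. Only the `render_for_policy` flag and the two helper names come from the commit message; the `save_video` key, the `images` observation key, the dict-based config, and the function signatures are assumptions made for illustration.

```python
def should_render_obs(video_cfg: dict) -> bool:
    """Render when either video saving or policy rendering is requested.

    `save_video` is an assumed key name; `render_for_policy` is from the PR.
    """
    return bool(
        video_cfg.get("save_video", False)
        or video_cfg.get("render_for_policy", False)
    )


def maybe_add_render_to_obs(obs: dict, video_cfg: dict, render_fn) -> dict:
    """Attach rendered frames to the observation only when needed."""
    if should_render_obs(video_cfg):
        # Copy so the caller's dict is not mutated; "images" is an assumed key.
        obs = {**obs, "images": render_fn()}
    return obs


render_calls = 0


def fake_render():
    """Counts calls so the example shows rendering is skipped when disabled."""
    global render_calls
    render_calls += 1
    return "frame"


cfg_off = {"save_video": False, "render_for_policy": False}
cfg_policy = {"save_video": False, "render_for_policy": True}

obs_plain = maybe_add_render_to_obs({"state": [0.0]}, cfg_off, fake_render)
obs_visual = maybe_add_render_to_obs({"state": [0.0]}, cfg_policy, fake_render)
```

With video saving off, the renderer runs only for the configuration that sets `render_for_policy`, which is the behaviour an image-conditioned policy such as OpenVLA would rely on.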

rainstormstudio marked this pull request as draft April 27, 2026 17:33

@Iron-Wph

Iron-Wph commented May 8, 2026

Collaborator

@rainstormstudio Hello, could you please fix this problem?
