zhw0422/muarm

🦾 muarm

Robot arm kinematics and manipulation learning in the MuJoCo physics simulator, featuring Pinocchio-based FK/IK, Catmull-Rom trajectory planning, and a multi-robot RL/IL training framework that supports PPO, SAC, TD3, DDPG, and Behavior Cloning.

Trajectory Impedance Control

📁 Project Structure

manipulation_mujoco/
├── models/
│   ├── franka_emika_panda/      # Franka Panda MJCF model + meshes
│   └── trs_so_arm100/           # TRS SO-ARM100 MJCF model + meshes
├── kinematic/
│   ├── panda_kinematics.py      # PandaKinematics: FK / IK / Jacobian (Pinocchio)
│   ├── trajectory.py            # TrajectoryGenerator: cubic, Catmull-Rom, Cartesian arc
│   ├── run_fk.py                # Forward kinematics demo
│   ├── run_ik.py                # Inverse kinematics demo
│   └── run_trajectory.py        # Figure-8 Lissajous trajectory demo
├── dynamics/
│   ├── impedance_controller.py  # Task-space impedance control (τ = J^T K e + τ_bias)
│   └── admittance_controller.py # Task-space admittance control (virtual ODE + inner PD)
├── learning/
│   ├── envs/
│   │   ├── base_env.py          # MuJocoRobotEnv: robot-agnostic Gymnasium base class
│   │   ├── registry.py          # make_env("panda"/"so_arm", "reach"/"push"/"pick_place")
│   │   └── tasks/
│   │       ├── base_task.py     # BaseTask interface
│   │       ├── reach.py
│   │       ├── push.py
│   │       └── pick_place.py
│   ├── robots/
│   │   ├── panda.py             # FrankaPandaEnv (4D Cartesian delta + gripper)
│   │   └── so_arm.py            # SoArm100Env (6D joint velocity)
│   ├── algos/
│   │   ├── rl_trainer.py        # train_rl: PPO/SAC/TD3/DDPG + SuccessRateCallback
│   │   └── il/
│   │       └── bc.py            # Behavior Cloning (MLP + CosineAnnealingLR)
│   ├── utils/
│   │   └── visualize.py         # MuJoCo overlay helpers
│   ├── train.py                 # Unified training entry point
│   └── play.py                  # Real-time policy playback
├── requirements.txt
└── pyproject.toml

🔧 Installation

cd manipulation_mujoco
uv venv
source .venv/bin/activate
uv pip install -r requirements.txt

🚀 Usage

▢️ Kinematics

source .venv/bin/activate

# Forward kinematics
python kinematic/run_fk.py

# Inverse kinematics
python kinematic/run_ik.py

# Figure-8 continuous trajectory with live EE trail
python kinematic/run_trajectory.py
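
For readers unfamiliar with the interpolation behind the figure-8 demo, here is a minimal pure-Python sketch of a uniform Catmull-Rom spline through Lissajous waypoints. It is not the repository's TrajectoryGenerator API: the function names, the choice of the x-z plane, and the centre and amplitude values are assumptions made for illustration only.

```python
import math


def catmull_rom_point(p0, p1, p2, p3, t):
    """Uniform Catmull-Rom interpolation between p1 and p2 for t in [0, 1]."""
    t2, t3 = t * t, t * t * t
    return tuple(
        0.5 * (2.0 * b
               + (c - a) * t
               + (2.0 * a - 5.0 * b + 4.0 * c - d) * t2
               + (-a + 3.0 * b - 3.0 * c + d) * t3)
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


def catmull_rom_path(waypoints, samples_per_segment=10):
    """Sample a curve that passes through every waypoint.

    The first and last waypoints are duplicated so the end segments
    have the four control points the formula needs.
    """
    if len(waypoints) < 2:
        raise ValueError("need at least two waypoints")
    pts = [waypoints[0]] + list(waypoints) + [waypoints[-1]]
    path = []
    for i in range(1, len(pts) - 2):
        for k in range(samples_per_segment):
            t = k / samples_per_segment
            path.append(catmull_rom_point(pts[i - 1], pts[i], pts[i + 1], pts[i + 2], t))
    path.append(tuple(float(x) for x in waypoints[-1]))
    return path


def figure8_waypoints(n=16, center=(0.5, 0.0, 0.4), ax=0.10, az=0.05):
    """Lissajous figure-8 (frequency ratio 1:2) in the x-z plane.

    center, ax and az are hypothetical values in metres.
    """
    out = []
    for k in range(n + 1):
        t = 2.0 * math.pi * k / n
        out.append((center[0] + ax * math.sin(t),
                    center[1],
                    center[2] + az * math.sin(2.0 * t)))
    return out


ee_targets = catmull_rom_path(figure8_waypoints(), samples_per_segment=20)
```

Each entry of ee_targets could then be passed to an IK solver as a Cartesian end-effector target, one per control step.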

▢️ Dynamics β€” Impedance & Admittance Control

Both demos use torque-controlled actuators (panda_motor.xml + scene_torque.xml). Interact via Ctrl + drag in the MuJoCo viewer.

source .venv/bin/activate

# Impedance control: drag and release, arm springs back
python dynamics/impedance_controller.py

# Admittance control: arm compliantly follows your applied force
python dynamics/admittance_controller.py

Control law summary:

|          | Impedance | Admittance |
|----------|-----------|------------|
| Input    | Displacement e | External force F_ext |
| Output   | Joint torques directly | Virtual reference → inner PD |
| Feel     | Stiff spring | Soft, mass-damper compliant |
| Equation | τ = J^T(Kp·e − Kd·ẋ) + τ_bias | M_d·ẍ_v = F_ext − D_d·ẋ_v − K_d·(x_v − x_eq) |
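
The impedance law in the table can be written out in a few lines. The sketch below is a dependency-free illustration and not the code in impedance_controller.py. It assumes scalar gains, a displacement defined as e = x_des − x, and a Jacobian passed as a list of rows. In a MuJoCo setup the bias term would typically be read from the simulator (for example data.qfrc_bias), but whether the repository does so is an assumption.

```python
def impedance_torque(J, e, x_dot, kp, kd, tau_bias):
    """Compute tau = J^T (Kp*e - Kd*x_dot) + tau_bias with scalar gains.

    J        : m x n task Jacobian as a list of m rows
    e        : task-space displacement, length m (assumed x_des - x)
    x_dot    : task-space velocity, length m
    tau_bias : joint-space bias torque (gravity, Coriolis), length n
    """
    # Virtual spring-damper force in task space.
    wrench = [kp * ei - kd * vi for ei, vi in zip(e, x_dot)]
    # Map the task-space force to joint torques through J^T, then add the bias.
    return [
        sum(J[r][j] * wrench[r] for r in range(len(wrench))) + tau_bias[j]
        for j in range(len(tau_bias))
    ]


# Toy 2-DoF example with hypothetical numbers.
tau = impedance_torque(
    J=[[1.0, 0.0], [0.0, 2.0]],
    e=[0.5, 0.0],
    x_dot=[0.0, 0.5],
    kp=20.0,
    kd=10.0,
    tau_bias=[1.0, 2.0],
)
```

Because the torque is computed directly from the displacement, releasing the arm after a drag leaves only the spring-damper term acting, which is why it springs back.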

Tune the parameters at the top of each file:

| Parameter | File | Effect |
|-----------|------|--------|
| KP / KD | impedance_controller.py | Spring stiffness / damping |
| M_D | admittance_controller.py | Lower → faster response |
| D_D | admittance_controller.py | Higher → smoother motion |
| K_D | admittance_controller.py | 0 = free drift, >0 = spring to eq |
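
To see what M_D, D_D, and K_D do, the virtual dynamics M_d·ẍ_v = F_ext − D_d·ẋ_v − K_d·(x_v − x_eq) can be integrated on a single axis. This is a minimal sketch under assumed values, using semi-implicit Euler. The function names, time step, and gains are hypothetical, and the real controller may integrate differently.

```python
def admittance_step(x_v, v_v, f_ext, x_eq, m_d, d_d, k_d, dt):
    """One semi-implicit Euler step of M_d*a = F_ext - D_d*v - K_d*(x - x_eq)."""
    a_v = (f_ext - d_d * v_v - k_d * (x_v - x_eq)) / m_d
    v_v = v_v + a_v * dt      # update velocity first
    x_v = x_v + v_v * dt      # then position, using the new velocity
    return x_v, v_v


def simulate(f_ext, m_d, d_d, k_d, steps=5000, dt=0.001):
    """Apply a constant external force and return the final (x_v, v_v)."""
    x, v = 0.0, 0.0
    for _ in range(steps):
        x, v = admittance_step(x, v, f_ext, 0.0, m_d, d_d, k_d, dt)
    return x, v


# K_D > 0: the virtual reference settles at x_eq + F_ext / K_D.
x_spring, v_spring = simulate(f_ext=5.0, m_d=1.0, d_d=20.0, k_d=100.0)

# K_D = 0: the reference drifts at the steady velocity F_ext / D_D.
x_free, v_free = simulate(f_ext=5.0, m_d=1.0, d_d=20.0, k_d=0.0)
```

The resulting x_v is the virtual reference that, per the table above, an inner PD loop then tracks.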

▢️ RL Training

source .venv/bin/activate

# Headless (recommended for speed)
python learning/train.py --robot panda --task reach     --algo sac  --timesteps 200000
python learning/train.py --robot panda --task push      --algo td3  --timesteps 300000
python learning/train.py --robot panda --task pick_place --algo sac --timesteps 500000

# Enable real-time viewer
python learning/train.py --robot panda --task reach --algo sac --timesteps 200000 --render

# Parallel environments (headless only)
python learning/train.py --robot panda --task reach --algo sac --timesteps 200000 --n-envs 4

# TensorBoard logging
python learning/train.py --robot panda --task reach --algo sac --timesteps 200000 --tensorboard
tensorboard --logdir learning/runs/tb

# SO-ARM100
python learning/train.py --robot so_arm --task reach --algo sac --timesteps 100000

▢️ IL β€” Behavior Cloning

source .venv/bin/activate

# Collect heuristic demos + train BC policy
python learning/train.py --robot panda --task reach --algo bc

# Visualize demo collection
python learning/train.py --robot panda --task reach --algo bc --render

▢️ Play β€” Real-time Policy Rollout

source .venv/bin/activate

# RL policy (SAC reach)
python learning/play.py --robot panda --task reach --algo sac \
    --model learning/runs/panda_reach_sac.zip --episodes 5

# BC policy
python learning/play.py --robot panda --task reach --algo bc \
    --model learning/runs/panda_reach_bc.pt --episodes 5

# Push / Pick-Place
python learning/play.py --robot panda --task push       --algo sac \
    --model learning/runs/panda_push_sac.zip
python learning/play.py --robot panda --task pick_place --algo sac \
    --model learning/runs/panda_pick_place_sac.zip

# SO-ARM100
python learning/play.py --robot so_arm --task reach --algo sac \
    --model learning/runs/so_arm_reach_sac.zip

🎛️ CLI Arguments: train.py / play.py

| Argument | Default | Description |
|----------|---------|-------------|
| --robot | panda | panda, so_arm |
| --task | reach | reach, push, pick_place |
| --algo | sac | ppo, sac, td3, ddpg, bc |
| --timesteps | 200000 | Total RL training steps |
| --n-envs | 1 | Number of parallel environments |
| --render | False | Enable MuJoCo real-time viewer |
| --tensorboard | False | Enable TensorBoard logging |
| --save-dir | learning/runs | Model save directory |
| --model | (none) | Path to saved model (play only) |
| --episodes | 10 | Rollout episodes (play only) |
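
The argument table maps naturally onto an argparse parser. The following is a sketch reconstructed from the table alone, not a copy of train.py or play.py. The build_parser name and the help strings are assumptions, and the real scripts may validate or group options differently.

```python
import argparse


def build_parser():
    """Parser mirroring the documented flags and defaults (illustrative only)."""
    p = argparse.ArgumentParser(description="Sketch of the train.py / play.py options")
    p.add_argument("--robot", default="panda", choices=["panda", "so_arm"])
    p.add_argument("--task", default="reach", choices=["reach", "push", "pick_place"])
    p.add_argument("--algo", default="sac", choices=["ppo", "sac", "td3", "ddpg", "bc"])
    p.add_argument("--timesteps", type=int, default=200000, help="total RL training steps")
    p.add_argument("--n-envs", type=int, default=1, help="number of parallel environments")
    p.add_argument("--render", action="store_true", help="enable the MuJoCo viewer")
    p.add_argument("--tensorboard", action="store_true", help="enable TensorBoard logging")
    p.add_argument("--save-dir", default="learning/runs", help="model save directory")
    p.add_argument("--model", default=None, help="path to a saved model (play only)")
    p.add_argument("--episodes", type=int, default=10, help="rollout episodes (play only)")
    return p


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--task", "push", "--algo", "td3", "--n-envs", "4"])
```

Note that argparse converts dashes to underscores, so --n-envs and --save-dir are read back as args.n_envs and args.save_dir.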

🗂️ Model Save Paths

learning/runs/{robot}_{task}_{algo}.zip   ← RL  (Stable-Baselines3)
learning/runs/{robot}_{task}_{algo}.pt    ← IL  (Behavior Cloning)
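
The naming pattern above is easy to reproduce when scripting batches of runs. A small sketch follows. The model_path helper is hypothetical and not part of the repository; it assumes that bc is the only algorithm saved as .pt, which matches the examples in this README.

```python
from pathlib import PurePosixPath


def model_path(robot, task, algo, save_dir="learning/runs"):
    """Build the expected checkpoint path for a robot/task/algo combination."""
    # Behavior Cloning checkpoints use .pt; the RL algorithms use .zip.
    ext = ".pt" if algo == "bc" else ".zip"
    return str(PurePosixPath(save_dir) / f"{robot}_{task}_{algo}{ext}")


rl_path = model_path("panda", "reach", "sac")
bc_path = model_path("panda", "reach", "bc")
```

The returned string can be passed to play.py through --model.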

🔌 Adding a New Robot

  1. Create learning/robots/my_robot.py inheriting MuJocoRobotEnv
  2. Implement _build_spaces, _get_obs, _apply_action, get_ee_pos, get_ee_body
  3. Register it in learning/envs/registry.py under make_env
  4. Use the same train.py / play.py commands with --robot my_robot
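
The steps above can be sketched as a class skeleton. Because the actual signatures in learning/envs/base_env.py are not shown here, the example uses a stand-in abstract base instead of the real MuJocoRobotEnv. Every signature, attribute, return type, and the registry dictionary below is an assumption for illustration, and a real subclass would read state from MuJoCo and build Gymnasium spaces.

```python
from abc import ABC, abstractmethod


class RobotEnvInterface(ABC):
    """Stand-in for MuJocoRobotEnv listing the five hooks named in step 2."""

    @abstractmethod
    def _build_spaces(self): ...

    @abstractmethod
    def _get_obs(self): ...

    @abstractmethod
    def _apply_action(self, action): ...

    @abstractmethod
    def get_ee_pos(self): ...

    @abstractmethod
    def get_ee_body(self): ...


class MyRobotEnv(RobotEnvInterface):
    N_JOINTS = 6  # hypothetical joint count

    def __init__(self):
        self.qpos = [0.0] * self.N_JOINTS
        self._build_spaces()

    def _build_spaces(self):
        # A real env would presumably create Gymnasium Box spaces here.
        self.action_dim = self.N_JOINTS
        self.obs_dim = self.N_JOINTS + 3

    def _get_obs(self):
        return list(self.qpos) + list(self.get_ee_pos())

    def _apply_action(self, action, dt=0.02):
        # Treat the action as joint velocities and integrate one step.
        if len(action) != self.action_dim:
            raise ValueError("action has the wrong dimension")
        self.qpos = [q + a * dt for q, a in zip(self.qpos, action)]

    def get_ee_pos(self):
        return (0.0, 0.0, 0.0)  # placeholder; a real env queries the simulator

    def get_ee_body(self):
        return "ee_link"  # hypothetical MJCF body name


# Hypothetical registry keyed by the value passed to --robot.
ROBOT_REGISTRY = {"my_robot": MyRobotEnv}


def make_robot_sketch(name):
    """Look up and instantiate a registered robot env."""
    return ROBOT_REGISTRY[name]()
```

Keeping the robot-specific logic behind these hooks is what lets the same train.py and play.py commands work for any registered robot.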

📝 TODO
