FluxVLA Engine is a full-stack, end-to-end engineering platform for deploying embodied intelligence applications. Built on the core design principles of unified configuration, standardized interfaces, module decoupling, and deployability, it creates a complete engineering loop from data to real-device deployment. With the goal of providing a standardized industry–academia–research foundation, it significantly lowers the engineering barrier for VLA research and development.
| Codebase | Libero-Spatial | Libero-Object | Libero-Goal | Libero-Long | Libero-Average |
|---|---|---|---|---|---|
| FluxVLA(SmolVLA) | 86.2 | 92.4 | 91.4 | 68.8 | 84.7 |
| FluxVLA(GR00T) | 97.4 | 96.2 | 94.6 | 93.0±1.5 | 95.3 |
| FluxVLA(DreamZero) | 98.2 | 98.8 | 93.2 | 94.8 | 96.25 |
| FluxVLA(Qwen3VL 0.6B+GR00T) | 98.6 | 99.6 | 95.6 | 92.2±1.8 | 96.50 |
| FluxVLA(PI0) | 98.6 | 98.8 | 96.8 | 93.2 | 96.85 |
| FluxVLA(PI0.5) | 98.6 | 99.6 | 98.0 | 95.6±1.0 | 97.95 |
Linked scores point to the corresponding checkpoints.
| Model | Training Data | Cabinet | Drawer | Microwave | Generalization | Average |
|---|---|---|---|---|---|---|
| FluxVLA(GR00T) | 24 tasks, 30 demos | 27.5% | 37.5% | 45.0% | 50.3% | 46.9% |
- Cabinet: PnPBottleToCabinetClose + PnPWineToCabinetClose.
- Drawer: PnPCanToDrawerClose + PnPCupToDrawerClose.
- Microwave: PnPMilkToMicrowaveClose + PnPPotatoToMicrowaveClose.
- Generalization: the remaining 18 post-train novel tasks.
- All rates are micro-averaged over episodes.
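As a rough illustration of what micro-averaging means for the RoboCasa row, the sketch below recombines the four reported group rates into the overall average. It assumes every task is evaluated with the same number of episodes, which the table does not state explicitly; under that assumption, weighting by task count is equivalent to weighting by episode count.

```python
# Success rates (%) and task counts are taken from the RoboCasa table and notes.
# ASSUMPTION: each task contributes the same number of evaluation episodes,
# so weighting a group by its task count equals weighting by episodes.
groups = {
    "Cabinet": (27.5, 2),
    "Drawer": (37.5, 2),
    "Microwave": (45.0, 2),
    "Generalization": (50.3, 18),
}


def micro_average(groups):
    """Weight each group's rate by the number of tasks it covers."""
    total_tasks = sum(count for _, count in groups.values())
    weighted_sum = sum(rate * count for rate, count in groups.values())
    return weighted_sum / total_tasks


def macro_average(groups):
    """Plain mean of the group rates, ignoring group size."""
    return sum(rate for rate, _ in groups.values()) / len(groups)


print(f"micro-average: {micro_average(groups):.1f}%")
print(f"macro-average: {macro_average(groups):.1f}%")
```

Under this assumption the micro-average rounds to 46.9%, matching the Average column. The unweighted mean of the four groups is noticeably lower because the 18 generalization tasks dominate the episode count.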
[2026/06/10] 🔥 RoboCasa GR1 simulation tasks with GR00T are now supported.
[2026/06/04] 🔥 Triton backend for Pi0.5-RTC is now supported, see inference_acceleration.
[2026/05/28] 🔥 FluxDAgger is now released: a model-decoupled DAgger pipeline for dual-arm manipulation, making it easy to integrate different VLAs and reward models.
[2026/05/28] 🔥 The embodied manipulation simulation benchmark FluxBisim is now released.
[2026/05/09] 🔥 SmolVLA is now supported.
[2026/04/24] 🔥 Pi0.5-RTC is now supported.
[2026/04/22] 🔥 ZMQ-based remote inference framework is now supported.
[2026/04/15] 🔥 DreamZero WAM is now supported.
[2026/04/08] 🔥 FluxVLA has been open-sourced.
Note for existing installations
If you already cloned and installed FluxVLA (v0.1.0), you do not need to recreate the conda environment. Pull the latest code and upgrade Transformers:
git pull
python -m pip install --upgrade "transformers==5.3.0"
python -c "import transformers; print(transformers.__version__)"
If you also want to use RoboCasa GR00T configs, install the RoboCasa-specific runtime dependencies in the same environment:
python -m pip install "mujoco==3.2.6" gymnasium lxml
python -m pip install "robosuite @ git+https://github.com/yinchimaoliang/robosuite.git@7264a82"
1. Create a conda environment
conda create -n fluxvla python=3.10 -y
conda activate fluxvla
2. Install PyTorch (CUDA version)
Important: Before running pip install -r requirements.txt, you must install PyTorch from the official CUDA index first. The default PyPI index may not provide the CUDA-enabled build you need.
# CUDA 12.8
pip install torch==2.8.0 torchvision==0.23.0 torchaudio==2.8.0 --index-url https://download.pytorch.org/whl/cu128
For other CUDA versions, replace cu128 with the corresponding value (e.g., cu118, cu121). See: https://pytorch.org/get-started/locally/ and https://pytorch.org/get-started/previous-versions/.
3. Install flash-attention
Method 1: Install directly via pip:
pip install psutil ninja packaging
# MAX_JOBS controls the number of parallel build threads; tune it based on your machine resources
MAX_JOBS=8 pip install flash-attn==2.5.5 --no-build-isolation --find-links https://github.com/Dao-AILab/flash-attention/releases
Method 2: Build from source (recommended if method 1 fails):
git clone https://github.com/Dao-AILab/flash-attention.git
cd flash-attention
git checkout v2.5.5
# MAX_JOBS controls the number of parallel build threads; tune it based on your machine resources
MAX_JOBS=8 python setup.py install
4. Install av
conda install -c conda-forge av=14.4.0
5. Install fluxvla and other dependencies
pip install -r requirements.txt
pip install --no-build-isolation -e .
Note: requirements.txt pins torch==2.6.0 to prevent pip from accidentally replacing the CUDA-enabled PyTorch installed in step 2. If you need to use another torch version, update both the step-2 command and the torch version in requirements.txt.
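Because the note above asks you to keep the step-2 install and the pin in requirements.txt in sync, a small check can catch a mismatch before pip replaces your CUDA build. This is a minimal sketch, not part of FluxVLA; the helper names are hypothetical, and it only understands simple `torch==X.Y.Z` pins.

```python
import re


def pinned_version(requirements_text, package="torch"):
    """Return the version pinned with '==' for `package`, or None if absent."""
    pattern = re.compile(
        rf"^\s*{re.escape(package)}\s*==\s*([^\s;#]+)", re.MULTILINE
    )
    match = pattern.search(requirements_text)
    return match.group(1) if match else None


def base_version(version):
    """Drop a local build suffix such as '+cu128'."""
    return version.split("+", 1)[0]


def pin_matches_installed(requirements_text, installed_version):
    """True when the requirements pin agrees with the installed torch."""
    pin = pinned_version(requirements_text)
    return pin is not None and base_version(pin) == base_version(installed_version)


if __name__ == "__main__":
    from importlib import metadata
    from pathlib import Path

    req_file = Path("requirements.txt")
    if req_file.exists():
        try:
            installed = metadata.version("torch")
        except metadata.PackageNotFoundError:
            print("torch is not installed yet; run step 2 first")
        else:
            ok = pin_matches_installed(req_file.read_text(), installed)
            print("pin matches installed torch" if ok else "pin and installed torch differ")
```

Run it from the repository root after step 2 and before `pip install -r requirements.txt`.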
RoboCasa GR00T support (optional)
Install these extra dependencies only if you want to train or evaluate RoboCasa GR00T configs such as configs/gr00t/gr00t_eagle_3b_robocasa_finetune.py.
First install the RoboCasa runtime dependencies and the patched robosuite build:
pip install "mujoco==3.2.6" gymnasium lxml
pip install "robosuite @ git+https://github.com/yinchimaoliang/robosuite.git@7264a82"
Then install Isaac-GR00T and the RoboCasa GR1 task package from local checkouts:
git clone https://github.com/NVIDIA/Isaac-GR00T.git /path/to/Isaac-GR00T
cd /path/to/Isaac-GR00T
git checkout 4af2b622892f7dcb5aae5a3fb70bcb02dc217b96
pip install --no-deps -e /path/to/Isaac-GR00T
git clone https://github.com/robocasa/robocasa-gr1-tabletop-tasks.git \
/path/to/robocasa-gr1-tabletop-tasks
cd /path/to/robocasa-gr1-tabletop-tasks
git checkout 4840e671596f93ca03651524b9f72ffb1aadfeff
pip install --no-deps -e /path/to/robocasa-gr1-tabletop-tasks
--no-deps is recommended for editable installs so the RoboCasa packages do not replace the pinned FluxVLA model stack dependencies. RoboCasa assets and datasets are covered in Data & Assets Preparation.
Online evaluation environment (LIBERO / EGL)
If you want to evaluate LIBERO on devices that do not support ray tracing (e.g., A100), please refer to EGL Device GPU Rendering Configuration.
Install system dependencies
export MUJOCO_GL=egl
sudo apt install libegl-dev libgl1-mesa-dev libx11-dev libglew-dev libosmesa6-dev
Environment checks
Make sure /proc/1/environ contains the following environment variables:
- NVIDIA_DRIVER_CAPABILITIES=all
- NVARCH=x86_64
- NVIDIA_REQUIRE_CUDA=cuda>=12.4 brand=tesla and driver>=470
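The check above can be scripted. The sketch below parses /proc/1/environ, which stores NUL-separated KEY=VALUE entries, and reports which of the listed variables are missing. The function names are hypothetical, and for NVIDIA_REQUIRE_CUDA only presence is checked because its exact value varies between images.

```python
from pathlib import Path

# Variable names come from the list above. A value of None means
# "only check that the variable is present".
EXPECTED = {
    "NVIDIA_DRIVER_CAPABILITIES": "all",
    "NVARCH": "x86_64",
    "NVIDIA_REQUIRE_CUDA": None,
}


def parse_environ(raw):
    """Parse the NUL-separated KEY=VALUE bytes of /proc/<pid>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            key, _, value = entry.partition(b"=")
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def find_problems(env, expected=EXPECTED):
    """Return human-readable messages for missing or unexpected values."""
    problems = []
    for key, want in expected.items():
        if key not in env:
            problems.append(f"{key} is not set")
        elif want is not None and env[key] != want:
            problems.append(f"{key}={env[key]!r}, expected {want!r}")
    return problems


if __name__ == "__main__":
    try:
        raw = Path("/proc/1/environ").read_bytes()
    except OSError as exc:  # reading PID 1's environment usually needs root
        print(f"cannot read /proc/1/environ: {exc}")
    else:
        for line in find_problems(parse_environ(raw)) or ["all expected variables found"]:
            print(line)
```

Inside a container this is typically run as root; on a bare-metal host, PID 1 is the init system and will not carry these variables.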
Create an EGL configuration file
Create file /usr/share/glvnd/egl_vendor.d/10_nvidia.json with the following content:
{
    "file_format_version": "1.0.0",
    "ICD": {
        "library_path": "libEGL_nvidia.so.0"
    }
}
Configure pre-commit hooks (optional but recommended)
To ensure code quality and consistency (especially for C++/CUDA code), install pre-commit hooks:
pip install pre-commit
pre-commit install
This will automatically check and format code before every commit.
Configure Weights & Biases (wandb)
Weights & Biases is used for experiment tracking and visualization. Configure it as follows:
- Install wandb (included in requirements.txt):
pip install wandb
- Log in to your wandb account:
wandb login
- Set environment variables:
export WANDB_PROJECT=fluxvla # project name (default: fluxvla)
export WANDB_ENTITY=your-team-name # team name or username (default: None)
export WANDB_MODE=online # online, offline, or disabled (default: online)
- If you want to disable wandb logging during training, set:
export WANDB_MODE=disabled
Note: all wandb configuration is read from environment variables; no additional settings are needed in config files.
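To make the documented defaults concrete, the following sketch resolves the three variables the same way the comments above describe them. It is an illustration only, not FluxVLA's actual loading code, and `wandb_settings` is a hypothetical helper name.

```python
import os

VALID_MODES = {"online", "offline", "disabled"}


def wandb_settings(environ=None):
    """Resolve wandb settings from environment variables with the documented defaults."""
    environ = os.environ if environ is None else environ
    mode = environ.get("WANDB_MODE", "online")
    if mode not in VALID_MODES:
        raise ValueError(f"WANDB_MODE must be one of {sorted(VALID_MODES)}, got {mode!r}")
    return {
        "project": environ.get("WANDB_PROJECT", "fluxvla"),  # default: fluxvla
        "entity": environ.get("WANDB_ENTITY"),               # default: None
        "mode": mode,                                        # default: online
    }


if __name__ == "__main__":
    print(wandb_settings())
```

Printing the resolved settings before a long run is a cheap way to confirm that a job will log to the intended project or is actually disabled.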
Configure TensorBoard (optional)
TensorBoard is supported as an optional logging backend for experiment metric visualization. Configure it as follows:
- Add 'tensorboard' to active_trackers in your config file:
metric=dict(
    type='VLAMetric',
    active_trackers=('jsonl', 'wandb', 'tensorboard'),
    ...
)
Alternatively, enable it via command line without modifying the config file:
--cfg-options 'runner.metric.active_trackers=[jsonl,wandb,tensorboard]'
- After training, launch TensorBoard to view metrics:
tensorboard --logdir work_dirs/tensorboard
Note: event files are saved to {work_dir}/tensorboard/{run_id}/ per run, enabling automatic comparison across experiments. If the TENSORBOARD_LOG_PATH environment variable is set, it will be used directly as the log directory.
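The log-directory rule in the note can be written down as a few lines of path logic. This is a sketch of the described behavior under the assumption that the override replaces the whole path (including the run id); `tensorboard_log_dir` is a hypothetical name, not a FluxVLA function.

```python
import os
from pathlib import Path


def tensorboard_log_dir(work_dir, run_id, environ=None):
    """Return the directory where event files for one run would be written."""
    environ = os.environ if environ is None else environ
    override = environ.get("TENSORBOARD_LOG_PATH")
    if override:
        # ASSUMPTION: the override is used as-is, with no run_id appended.
        return Path(override)
    return Path(work_dir) / "tensorboard" / run_id


if __name__ == "__main__":
    print(tensorboard_log_dir("work_dirs", "example_run"))
```

Pointing `tensorboard --logdir` at the parent `tensorboard` directory then shows all runs side by side, which is why the default layout keeps one subdirectory per run.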
Use the datasets we prepared directly
Download the required datasets and place them under ./datasets. Download only the datasets you need according to your configuration.
| Dataset | Download link |
|---|---|
| libero-object | limxdynamics/FluxVLAData/libero_object_no_noops_lerobotv2.1 |
| libero-spatial | limxdynamics/FluxVLAData/libero_spatial_no_noops_lerobotv2.1 |
| libero-10 | limxdynamics/FluxVLAData/libero_10_no_noops_lerobotv2.1 |
| libero-goal | limxdynamics/FluxVLAData/libero_goal_no_noops_lerobotv2.1 |
| modified_libero_rlds | openvla/modified_libero_rlds |
| RoboCasa GR1 (30 demos) | limxdynamics/FluxVLAData/robocasa_gr1_24tasks_first30ep |
| RoboCasa GR1 | limxdynamics/FluxVLAData/robocasa_lerobot_V2.1 |
| RealRobot_AgileX_aloha | limxdynamics/FluxVLAData/RealRobot_AgileX_aloha_lerobot_v2 |
| RealRobot_UR3_Chem | limxdynamics/FluxVLAData/RealRobot_UR3_Chem_lerobot_v2 |
For example, download the libero-10 dataset:
huggingface-cli download limxdynamics/FluxVLAData --repo-type dataset --include "libero_10_no_noops_lerobotv2.1/*" --local-dir ./datasets
Replace libero_10_no_noops_lerobotv2.1 with the corresponding folder name of the dataset you want to download.
For RoboCasa GR00T training with the released 30-demo subset, download the dataset under ./datasets:
huggingface-cli download limxdynamics/FluxVLAData \
--repo-type dataset \
--include "robocasa_gr1_24tasks_first30ep/*" \
--local-dir ./datasets
For full-data RoboCasa GR1 training, replace the include pattern with robocasa_lerobot_V2.1/*.
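If you prefer to script the download, the same selection can be expressed with `huggingface_hub.snapshot_download`, whose `allow_patterns` argument plays the role of `--include`. The sketch below assumes the `huggingface_hub` package is installed; the two helper names are hypothetical.

```python
def download_kwargs(folder, repo_id="limxdynamics/FluxVLAData", local_dir="./datasets"):
    """Build snapshot_download arguments that fetch a single dataset folder."""
    return {
        "repo_id": repo_id,
        "repo_type": "dataset",
        "allow_patterns": [f"{folder}/*"],  # same role as --include in the CLI
        "local_dir": local_dir,
    }


def download_subset(folder, **overrides):
    """Download one folder of the dataset repo and return the local path."""
    # Imported lazily so the argument helper works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(**download_kwargs(folder, **overrides))
```

For example, `download_subset("robocasa_gr1_24tasks_first30ep")` would fetch the 30-demo RoboCasa subset into ./datasets, mirroring the CLI command above.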
Prepare assets
Download the required assets and place them under the local directories expected by your configuration or simulator.
| Asset | Download link | Local directory |
|---|---|---|
| RoboCasa tabletop simulator assets | nvidia/PhysicalAI-DigitalCousin-Assets | /path/to/robocasa-gr1-tabletop-tasks/robocasa/models/assets |
Recommended option: run the upstream asset downloader from the RoboCasa GR1 task checkout:
cd /path/to/robocasa-gr1-tabletop-tasks
python robocasa/scripts/download_tabletop_assets.py -y
Alternative option: download the mirrored assets from Hugging Face and place them directly under /path/to/robocasa-gr1-tabletop-tasks/robocasa/models/assets. Symlinks are not required; they are only a convenience when the assets already live on another local disk or shared storage.
SARM datasets
FluxVLA SARM workflows accept standard LeRobot v2.1 or v3.x datasets. Besides the usual observation / action fields, the dataset must carry SARM subtask annotations in the episodes metadata.
Published SARM example datasets on Hugging Face:
- LeRobot v3.x manual sparse+dense annotations for training / inference: limxdynamics/FluxVLAData/SARM_manual_test_10Episodes_lerobotv3.0
- LeRobot v3.x unlabeled dataset kept for manual or VLM labeling: limxdynamics/FluxVLAData/SARM_vlm_test_10Episodes_lerobotv3.0
- New LeRobot v2.1 manual conversion for training / inference and legacy-tool compatibility: limxdynamics/FluxVLAData/SARM_manual_test_10Episodes_lerobotv2.1
- New LeRobot v2.1 unlabeled conversion for manual or VLM labeling workflows: limxdynamics/FluxVLAData/SARM_vlm_test_10Episodes_lerobotv2.1
Download them under ./datasets with:
huggingface-cli download limxdynamics/FluxVLAData --repo-type dataset --include "SARM_manual_test_10Episodes_lerobotv3.0/*" --local-dir ./datasets
huggingface-cli download limxdynamics/FluxVLAData --repo-type dataset --include "SARM_vlm_test_10Episodes_lerobotv3.0/*" --local-dir ./datasets
huggingface-cli download limxdynamics/FluxVLAData --repo-type dataset --include "SARM_manual_test_10Episodes_lerobotv2.1/*" --local-dir ./datasets
huggingface-cli download limxdynamics/FluxVLAData --repo-type dataset --include "SARM_vlm_test_10Episodes_lerobotv2.1/*" --local-dir ./datasets
Use the manual_* datasets directly for training / inference. Use the vlm_* datasets as clean starting points for manual stage writing or VLM auto-annotation. Prefer the v2.1 pair when another tool expects meta/episodes.jsonl plus per-episode videos; prefer the v3.0 pair when you want to keep the native LeRobot v3.x metadata layout.
Before using a LeRobot v3.x SARM dataset, sanity-check the video metadata:
- LeRobot v3.x allows either many episodes in one MP4 or one MP4 per episode.
- If many episodes share one MP4, each episode that points to that file must use correct from_timestamp/to_timestamp offsets.
- If videos are already split as file-000.mp4, file-001.mp4, ..., each episode should point to its own file_index, and from_timestamp will usually reset to 0.0.
- If the directory contains multiple MP4 files but all episodes still point to file-000.mp4, the dataset metadata is malformed and should be fixed before use.
- For ready-to-use SARM dataset structure, annotation columns, and progress inference usage, see docs/sarm.md.
- For writing manual stages or generating VLM-based annotations, see tools/sarm_annotate/README.md.
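The video-metadata rules above can be turned into a quick consistency check. The sketch below works on simplified per-episode records; the flat keys `episode_index`, `file_index`, `from_timestamp`, and `to_timestamp` are an assumption made for illustration, since real LeRobot v3.x metadata stores these fields per camera and you would first need to extract them.

```python
from collections import defaultdict


def check_video_metadata(episodes, num_mp4_files, tol=1e-6):
    """Return a list of problems found in simplified episode video records."""
    problems = []
    by_file = defaultdict(list)
    for ep in episodes:
        idx = ep["episode_index"]
        if not 0 <= ep["file_index"] < num_mp4_files:
            problems.append(f"episode {idx}: file_index {ep['file_index']} has no matching MP4")
        if ep["to_timestamp"] <= ep["from_timestamp"]:
            problems.append(f"episode {idx}: empty or negative time span")
        by_file[ep["file_index"]].append(ep)

    # Several MP4 files on disk, yet every episode points at the first one.
    if num_mp4_files > 1 and episodes and set(by_file) == {0}:
        problems.append("multiple MP4 files exist but every episode points to file_index 0")

    # Episodes sharing one MP4 must occupy non-overlapping time ranges.
    for file_index, group in by_file.items():
        ordered = sorted(group, key=lambda e: e["from_timestamp"])
        for prev, cur in zip(ordered, ordered[1:]):
            if cur["from_timestamp"] < prev["to_timestamp"] - tol:
                problems.append(
                    f"file {file_index}: episodes {prev['episode_index']} "
                    f"and {cur['episode_index']} overlap"
                )
    return problems
```

An empty result means the records are consistent with either layout: many episodes in one MP4 with increasing offsets, or one MP4 per episode with offsets that restart at 0.0.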
Private dataset directory structure
If you train with fluxvla on private datasets, you need to convert your raw data (e.g., HDF5 files collected by ALOHA robots) into the LeRobot Dataset v2.1 format. For a step-by-step conversion guide, see Data Conversion Guide.
For SARM specifically, FluxVLA supports both LeRobot v2.1 and v3.x datasets as long as the required SARM annotation columns are present. The SARM-specific metadata contract is documented in docs/sarm.md.
The converted dataset should follow this directory structure:
├── data
│ └── chunk000
│ │ └── episode_000000.parquet
│ │ └── episode_000001.parquet
│ │ └── ... (more parquet files)
│ │ └── episode_00000N.parquet
│ └── chunk001
│ └── ... (more chunks)
│ └── chunk00N
├── meta
│ └── episodes.jsonl
│ └── episodes_stats.jsonl
│ └── info.json
│ └── tasks.jsonl
├── videos
│ └── chunk000
│ │ └── camera name 0
│ │ │ └── episode_000000.mp4
│ │ │ └── episode_000001.mp4
│ │ │ └── ...(more mp4 files)
│ │ │ └── episode_00000N.mp4
│ │ └── camera name 1
│ └── chunk001
│ └── ... (more chunks)
│ └── chunk00N
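After converting a private dataset, it can help to verify the layout before starting a training run. The sketch below checks only what the tree above shows: the four meta files, at least one parquet file, and one MP4 per episode for every camera directory. `check_layout` is a hypothetical helper, and the `chunk*` glob is deliberately loose so that it tolerates different chunk directory spellings.

```python
from pathlib import Path

REQUIRED_META = ("episodes.jsonl", "episodes_stats.jsonl", "info.json", "tasks.jsonl")


def check_layout(root):
    """Return a list of problems found in a converted dataset directory."""
    root = Path(root)
    problems = []

    for name in REQUIRED_META:
        if not (root / "meta" / name).is_file():
            problems.append(f"missing meta/{name}")

    if not any(root.glob("data/chunk*/episode_*.parquet")):
        problems.append("no data/chunk*/episode_*.parquet files found")

    # Every camera directory should hold one video per episode of its chunk.
    for camera_dir in sorted(root.glob("videos/chunk*/*")):
        if not camera_dir.is_dir():
            continue
        chunk = camera_dir.parent.name
        expected = {p.stem for p in (root / "data" / chunk).glob("episode_*.parquet")}
        present = {p.stem for p in camera_dir.glob("episode_*.mp4")}
        missing = sorted(expected - present)
        if missing:
            rel = camera_dir.relative_to(root).as_posix()
            problems.append(f"{rel}: missing video for {', '.join(missing)}")
    return problems


if __name__ == "__main__":
    import sys

    for problem in check_layout(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(problem)
```

The check is structural only; it does not open the parquet or video files, so it complements rather than replaces the conversion guide's own validation.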
Download the required pretrained checkpoints and place them under ./checkpoints. Download only the checkpoints you need based on your configuration.
For SARM workflows, you typically need a CLIP checkpoint for training / inference and optionally a Qwen3-VL checkpoint for VLM-based annotation. Detailed usage is documented in docs/sarm.md.
VLA models
| Model | Size | Download link |
|---|---|---|
| GR00T N1.5 | 3B | 🤗 Hugging Face |
| OpenVLA | 7B | 🤗 Hugging Face |
| PI0_base | 3B | 🤗 Hugging Face |
| PI05_base | 3B | 🤗 Hugging Face |
| PI05_libero | 3B | 🤗 Hugging Face |
| SmolVLA | 450M | 🤗 Hugging Face |
Vision-Language Models (VLM)
| Model | Size | Download link |
|---|---|---|
| Qwen2.5-VL | 3B | 🤗 Hugging Face |
| Qwen3-VL | 30B | 🤗 Hugging Face |
| SmolVLM2 | 500M | 🤗 Hugging Face |
Large Language Models (LLM)
| Model | Size | Download link |
|---|---|---|
| Qwen 2.5 | 3B | 🤗 Hugging Face |
| Qwen 2.5 | 7B | 🤗 Hugging Face |
| Llama 2 | 7B | 🤗 Hugging Face |
Vision backbone networks
| Model | Download link |
|---|---|
| CLIP ViT-B/32 | 🤗 Hugging Face |
| ViT-Large (DINOv2) | 🤗 Hugging Face |
| ViT-SO400M (SigLIP) | 🤗 Hugging Face |
| SigLIP2 | 🤗 Hugging Face |
| paligemma | 🤗 Hugging Face |
Tip: You can speed up downloads with huggingface-cli download <model-name> --local-dir ./checkpoints/<model-name>.
For the built-in SARM configs, place the CLIP files under ./checkpoints/clip-vit-base-patch32. If you use VLM-based SARM annotation, place the official SARM VLM under ./checkpoints/Qwen3-VL-30B-A3B-Instruct.
All-in-one: One configuration file manages the full workflow
- Manage key parameters for data, models, training, evaluation, inference, and deployment through a single config file (easier to reproduce and deploy).
Supports different VLA models
- Supports OpenVLA, LlavaVLA, GR00T, Pi0, and Pi0.5.
Supports different modules
- Supports Llama, Gemma, and Qwen-family LLM backbones.
- Supports DINOv2 and SigLIP vision backbones.
- Supports PaliGemma and Qwen-VL VLM backbones.
Supports SARM workflows
- Supports SARM training, annotation, and progress inference on LeRobot v2.1/v3.x datasets. See docs/sarm.md for details.
Supports different training strategies
- Supports FSDP together with DDP, and supports LoRA training mode.
- Supports eval-after-train.
- Supports resuming training from checkpoints.
Data and weight formats
- Supports Parquet datasets and loading LeRobot-format data.
- Supports model weights in safetensors format.
Evaluation and inference capabilities
- Supports multi-GPU LIBERO evaluation on devices without ray tracing.
- Supports remote inference infrastructure with ZMQ-based server/client architecture, enabling GPU-offloaded inference for resource-constrained edge devices. See Remote Inference Serving.
- Supports RTC (Real-Time Chunking) to improve cross-chunk trajectory continuity.
- Supports accelerated inference for GR00T and PI0.5; see Inference Acceleration, including Triton fused kernels, CUDA Graph capture, and CUDA custom operators.
Local debugging
/root/miniconda3/envs/fluxvla/bin/torchrun --standalone --nnodes 1 --nproc-per-node [NUM_GPUS] scripts/train.py --config [CONFIG_PATH] --work-dir [WORK_DIR] --cfg-options train_dataloader.per_device_batch_size=[PER_DEVICE_BATCH_SIZE]
Example:
export WANDB_MODE=disabled
/root/miniconda3/envs/fluxvla/bin/torchrun --standalone --nnodes 1 --nproc-per-node 2 scripts/train.py --config configs/pi05/pi05_paligemma_libero_10_full_finetune.py --work-dir ./checkpoints/pi05_paligemma_libero_10_full_finetune --cfg-options train_dataloader.per_device_batch_size=2
RoboCasa GR00T smoke training example:
WANDB_MODE=disabled TOKENIZERS_PARALLELISM=false \
torchrun --standalone --nnodes 1 --nproc-per-node 1 scripts/train.py \
--config configs/gr00t/gr00t_eagle_3b_robocasa_finetune.py \
--work-dir work_dirs/smoke_groot_robocasa_train \
--cfg-options \
runner.type=FSDPTrainRunner \
runner.sharding_strategy=no-shard \
train_dataloader.per_device_batch_size=1 \
runner.enable_gradient_checkpointing=False \
runner.max_steps=2 \
runner.save_iter_interval=1 \
runner.max_keep_ckpts=2 \
"runner.metric.active_trackers=('jsonl',)"
Local evaluation
/root/miniconda3/envs/fluxvla/bin/torchrun --standalone --nnodes 1 --nproc-per-node [NUM_GPUS] scripts/eval.py --config [CONFIG_PATH] --ckpt-path [CKPT_PATH] --cfg-options [CFG_OPTIONS]
Example:
export WANDB_MODE=disabled
/root/miniconda3/envs/fluxvla/bin/torchrun --standalone --nnodes 1 --nproc-per-node 2 scripts/eval.py --config configs/pi05/pi05_paligemma_libero_10_full_finetune.py --ckpt-path checkpoints/pi05_paligemma_libero_10_full_finetune_bs64/checkpoints/step-028548-epoch-18-loss=0.0111.safetensors
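Checkpoint file names in these examples embed the training step (for instance step-028548-epoch-18-loss=0.0111.safetensors or step-010000.safetensors). As a convenience, the sketch below picks the file with the highest step from a list of names; it relies only on the `step-<digits>` prefix seen in the examples and is not a FluxVLA utility.

```python
import re
from pathlib import Path

STEP_PATTERN = re.compile(r"step-(\d+)")


def checkpoint_step(path):
    """Extract the integer training step from a checkpoint file name, or None."""
    match = STEP_PATTERN.search(Path(path).name)
    return int(match.group(1)) if match else None


def latest_checkpoint(paths):
    """Return the path (as a string) with the highest step, or None if none match."""
    stepped = [(checkpoint_step(p), str(p)) for p in paths]
    stepped = [(step, path) for step, path in stepped if step is not None]
    return max(stepped)[1] if stepped else None
```

A typical call would be `latest_checkpoint(Path(work_dir, "checkpoints").glob("*.safetensors"))`, with the result passed to `--ckpt-path`. Note that the latest step is not necessarily the best checkpoint by evaluation score.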
RoboCasa GR00T evaluation example:
MUJOCO_GL=egl WANDB_MODE=disabled TOKENIZERS_PARALLELISM=false \
torchrun --standalone --nnodes 1 --nproc-per-node 1 scripts/eval.py \
--config configs/gr00t/gr00t_eagle_3b_robocasa_finetune.py \
--ckpt-path work_dirs/gr00t_eagle_3b_robocasa_gr1_24x30_finetune_bs64/checkpoints/step-010000.safetensors \
--cfg-options \
eval.norm_stats_path=work_dirs/official_groot_gr1_dataset_statistics.json \
eval.output_dir=work_dirs/gr00t_eagle_3b_robocasa_eval \
eval.num_trials_per_task=20
Cluster training
export WANDB_MODE=disabled
bash scripts/train.sh [CONFIG] [WORK_DIR] --cfg-options train_dataloader.per_device_batch_size=[PER_DEVICE_BATCH_SIZE] train_dataloader.batch_size=[GLOBAL_BATCH_SIZE] runner.max_steps=[MAX_STEPS] runner.save_interval=[SAVE_INTERVAL] runner.max_keep_ckpts=[MAX_KEEP_CKPTS] --eval-after-train
Resume training from a checkpoint
To resume training from a checkpoint, use the --resume-from argument to specify the checkpoint file path. Training will continue from the saved global step, epoch, model state, and optimizer state.
Local training example:
export WANDB_MODE=disabled
/root/miniconda3/envs/fluxvla/bin/torchrun --standalone --nnodes 1 --nproc-per-node 2 scripts/train.py \
--config configs/pi05/pi05_paligemma_libero_10_full_finetune.py \
--work-dir ./work_dirs/pi05_paligemma_libero_10_full_finetune \
--resume-from ./work_dirs/pi05_paligemma_libero_10_full_finetune/checkpoints/checkpoint_epoch_5.pt \
--cfg-options train_dataloader.per_device_batch_size=2
Cluster training example:
export WANDB_MODE=disabled
bash scripts/train.sh [CONFIG] [WORK_DIR] \
--resume-from [CHECKPOINT_PATH] \
--cfg-options train_dataloader.per_device_batch_size=[PER_DEVICE_BATCH_SIZE] runner.max_steps=[MAX_STEPS]
Cluster evaluation
export WANDB_MODE=disabled
bash scripts/eval.sh [CONFIG] [CKPT_PATH] --cfg-options [CFG_OPTIONS]
Real-robot inference
When running inference on a real robot, first install the environment on the robot side, and then run:
python scripts/inference_real_robot.py --config [CONFIG] --ckpt-path [CKPT_PATH]
Q: Problems connecting to Hugging Face when downloading models or datasets.
A: If you encounter Hugging Face connectivity issues (e.g., slow downloads, timeouts, or connection refused), set the following environment variable before running the command and use hf-mirror:
export HF_ENDPOINT="https://hf-mirror.com"
Q: conda install av is very slow at resolving the environment.
A: You can use the libmamba solver to speed up dependency resolution:
conda install -c conda-forge av=14.4.0 --solver=libmamba
Q: GR00T evaluation on LIBERO is unstable.
A: This is expected. GR00T's performance on LIBERO is sensitive to random seeds, the hardware environment, and the number of training epochs. Small changes in these factors may cause noticeable fluctuations in evaluation results. It is recommended to run experiments with multiple random seeds and select the best checkpoint based on evaluation performance.
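One way to follow this recommendation is to record the success rate of each checkpoint under several seeds and compare the mean and spread rather than a single run. The sketch below shows the bookkeeping; the checkpoint names and numbers are placeholders for illustration only and are not measured results.

```python
from statistics import mean, stdev


def summarize(scores):
    """Return (mean, sample standard deviation) of per-seed success rates."""
    return mean(scores), (stdev(scores) if len(scores) > 1 else 0.0)


def best_checkpoint(results):
    """Pick the checkpoint with the highest mean success rate across seeds."""
    return max(results, key=lambda name: mean(results[name]))


# PLACEHOLDER values: replace with your own per-seed evaluation results.
results = {
    "ckpt_a": [90.0, 94.0, 92.0],
    "ckpt_b": [95.0, 93.0, 94.0],
}

for name, scores in results.items():
    avg, spread = summarize(scores)
    print(f"{name}: {avg:.1f} ± {spread:.1f} over {len(scores)} seeds")
print("best by mean:", best_checkpoint(results))
```

Reporting the spread alongside the mean also makes it easier to tell a real improvement from seed noise.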
Q: When running pip install -r requirements.txt, building egl_probe fails with RuntimeError: CMake must be installed.
A: egl_probe needs CMake to build. Install it via conda (recommended) or apt:
conda install -c conda-forge cmake
# or
sudo apt install cmake
Note: Do not use pip install cmake. The pip package is a Python wrapper and may fail because pip isolates the build environment.
Q: egl_probe build fails and reports Compatibility with CMake < 3.5 has been removed from CMake.
A: This is usually because your CMake version is too new for the egl_probe CMakeLists.txt. Set the following environment variable before installing:
CMAKE_POLICY_VERSION_MINIMUM=3.5 pip install -r requirements.txt
Q: After installation, I get NumPy version errors (e.g., RuntimeError: Numpy is not available or version incompatibility warnings).
A: During installation, some dependencies may overwrite the pinned NumPy version. Reinstall the correct version directly:
pip install numpy==1.26.4
Please see the contribution workflow and guidelines in docs/CONTRIBUTING.md.
Quick conventions:
- Discuss first: for new features/models or other large changes, please open a GitHub Issue to align on scope and design.
- Branch from upstream: create your branch from upstream/main and use prefixes like feat/, fix/, docs/, etc. (details in the contributing guide).
- Run checks before PR: make sure local pre-commit passes and CI is green.
- Commit messages: we recommend Conventional Commits (examples in the contributing guide).
If you encounter any issues while using this repository, feel free to contact us. You can reach us directly at mason@limxdynamics.com and wayne@limxdynamics.com, or open a GitHub issue for help.
If you use FluxVLA in your research or projects, please cite it as:
@software{FluxVLA2026,
author = {Li, Yinhao and Mao, Weixin and Lan, Zihan and Rong, Jikun and Zhu, Minzhao and Mao, Yiming and Shen, Bowen and Huang, Xu},
title = {{FluxVLA Engine: A One-Stop VLA Engineering Platform for Embodied Intelligence}},
year = {2026},
month = apr,
version = {1.0.0},
doi = {10.5281/zenodo.20049506},
url = {https://github.com/FluxVLA/FluxVLA},
license = {Apache-2.0},
}
Acknowledgements: This project benefits from the following open-source projects and community efforts. Thanks to: LeRobot, NVIDIA Isaac GR00T, DreamZero (code), OpenVLA, OpenPI (pi0), LLaVA, DeepSpeed, Qwen, Triton, RTC, Training RTC, and Realtime-VLA. If we missed your project or contribution, please open an issue or pull request so we can properly acknowledge it.
- Support more vision backbone networks.
- Support more VLM backbones.
- Support more VLA methods.
- Support training with VLM data or reasoning-chain-of-thought (CoT) data.
- Full implementation of the logger feature.
- Support Isaac Sim.