Iterative Compositional Data Generation for Robot Control

This repository contains the official implementation of the Iterative Compositional Data Generation (ICDG) pipeline introduced in "Iterative Compositional Data Generation for Robot Control" (Pham et al.). ICDG is a self-improving generative pipeline for robotic manipulation that uses a semantic compositional diffusion transformer to synthesize high-quality expert data for unseen tasks.

Robotic manipulation domains often contain a combinatorial number of possible tasks, arising from combinations of different components, such as robots, objects, obstacles, and objectives. Collecting real demonstrations for all combinations is prohibitively expensive. ICDG leverages the underlying compositional structure of these domains to generalize far beyond the tasks it has been trained on, enabling large-scale capability growth from limited real data.

Key Contributions

Semantic Compositional Diffusion Transformer:
Factorizes each transition into specific components and learns their interactions through attention, enabling strong compositional generalization.
Zero-Shot Generation:
Generates full state–action–next-state transitions for new task combinations that were never observed in real data.
Iterative Self-Improvement:
Synthetic data is evaluated using offline RL; only high-quality, policy-validated transitions are added back into the training pool, allowing the model to continuously refine itself without additional real data collection.
Data Efficiency and Generalization:
Trained on real data from approximately 20 percent of possible task combinations, ICDG generates useful data for the remaining tasks and ultimately solves nearly all held-out tasks.
Emergent Compositional Structure:
Attention patterns and intervention tests reveal that the model recovers meaningful task-factor dependencies, despite no hand-crafted structure being imposed.
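
To make the "approximately 20 percent" figure concrete, here is a minimal sketch of enumerating a compositional task space and holding out most of it. The axis values are placeholders and the four-options-per-axis layout is an assumption made for illustration; the real component names come from CompoSuite, and the repository's own split logic may differ.

```python
import itertools
import random

# Placeholder axis values (assumption): only the robot name "IIWA" and the
# four axis names appear in this README; everything else is illustrative.
AXES = {
    "robot": ["IIWA"],
    "object": ["obj_0", "obj_1", "obj_2", "obj_3"],
    "obstacle": ["obs_0", "obs_1", "obs_2", "obs_3"],
    "objective": ["goal_0", "goal_1", "goal_2", "goal_3"],
}


def enumerate_tasks(axes):
    """Return every combination of one value per axis as a tuple."""
    return list(itertools.product(*axes.values()))


def split_tasks(tasks, num_train, seed=0):
    """Shuffle deterministically, then split into (train, held_out)."""
    rng = random.Random(seed)
    shuffled = list(tasks)
    rng.shuffle(shuffled)
    return shuffled[:num_train], shuffled[num_train:]


all_tasks = enumerate_tasks(AXES)
train_tasks, held_out_tasks = split_tasks(all_tasks, num_train=14)
print(len(all_tasks), len(train_tasks), len(held_out_tasks))
print(f"train fraction: {len(train_tasks) / len(all_tasks):.1%}")
```

Under these assumed axis sizes, 14 training tasks cover roughly a fifth of the combinations, and the rest are the zero-shot targets.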

Setup

Prerequisites

Python 3.9.6
CUDA-capable GPU (for training diffusion models and policies)
SLURM cluster access (for running experiments)

Installation

Create a Python 3.9.6 virtual environment:

python3.9 -m venv first_3.9.6
source first_3.9.6/bin/activate  # On Linux/Mac

Install dependencies from requirements.txt:

pip install --upgrade pip
pip install -r requirements.txt

Note: The requirements.txt includes an editable install of CompoSuite from a specific git commit for reproducibility:

-e git+https://github.com/Lifelong-ML/CompoSuite.git@1fa36f67f31aeccc9ef75748bfc797960e044a86#egg=composuite

Set up the data directory:
- Download expert datasets from Dryad
- Organize the data according to the structure described in data/README.md
- Only expert datasets are needed for this project

Usage

Automated Iterative Compositional Data Generation

The main pipeline implements the iterative self-improvement procedure from the paper (see Figure 1). The process consists of:

1. Compositional Diffusion Training: Train the semantic compositional diffusion transformer on N expert datasets + M high-quality synthetic datasets from previous iterations
2. Zero-shot Data Generation: Generate synthetic transitions for all remaining task combinations (All combinations - N - M)
3. Offline RL Validation: Train policies on synthetic data and evaluate performance via offline RL
4. Quality-based Filtering:
   - Good datasets: Added to training set for next iteration (M synthetic datasets)
   - Bad datasets: Removed from future generation cycles
5. Iteration: Repeat until convergence or max iterations reached
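
The control flow of the steps above can be sketched as follows. This is not the code in scripts/automated_iterative_diffusion_dits_iiwa.py: the callables `train_and_generate` and `success_rate` are hypothetical stand-ins for the diffusion and offline RL stages, and the sketch assumes that a rejected dataset is discarded while its task stays eligible for regeneration (one reading of "All combinations - N - M").

```python
def run_iterations(all_tasks, expert_tasks, train_and_generate, success_rate,
                   threshold=0.8, max_iterations=5):
    """Grow the pool of accepted synthetic datasets over several iterations.

    train_and_generate(train_tasks, targets) -> {task: synthetic_dataset}
    success_rate(task, dataset) -> float in [0, 1]
    Both callables are hypothetical placeholders for the real pipeline stages.
    """
    accepted = {}  # task -> synthetic dataset that passed validation (the M set)
    for _ in range(max_iterations):
        remaining = set(all_tasks) - set(expert_tasks) - set(accepted)
        if not remaining:
            break  # nothing left to generate
        train_tasks = set(expert_tasks) | set(accepted)
        synthetic = train_and_generate(train_tasks, remaining)
        for task in remaining:
            if success_rate(task, synthetic[task]) >= threshold:
                accepted[task] = synthetic[task]
        # Datasets below the threshold are dropped and not reused for training.
    return accepted


# Toy stand-ins: tasks are integers, and generated data is "good" only for
# tasks within distance 2 of a task already in the training set.
def toy_generate(train_tasks, targets):
    return {t: min(abs(t - s) for s in train_tasks) for t in targets}


def toy_success_rate(task, dataset):
    return 1.0 if dataset <= 2 else 0.0


accepted = run_iterations(range(10), {0, 1}, toy_generate, toy_success_rate)
print(sorted(accepted))
```

In the toy run, each iteration unlocks the tasks adjacent to those already accepted, which mirrors how validated synthetic data is meant to extend the model's reach.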

Run the pipeline:

python3 -u -m scripts.automated_iterative_diffusion_dits_iiwa \
    --max_iterations 5 \
    --num_train 14 \
    --diffusion_seed 0 \
    --curriculum_seed 0 \
    2>&1 | tee iterative_diffusion_0_dits_iiwa.out

Key arguments:

--max_iterations: Maximum number of iterations to run
--num_train: Number of training tasks (14 for IIWA subset)
--diffusion_seed: Random seed for diffusion model training
--curriculum_seed: Random seed for curriculum schedule generation
--success_threshold: Success rate threshold for good tasks (default: 0.8)
--threshold_reduction_amount: Amount to reduce threshold by when no good tasks found (default: 0.1)
--threshold_reduction_cycle: Number of consecutive iterations with no good tasks before reducing threshold (default: 1)
--min_threshold: Minimum threshold value (default: 0.5)
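
The four threshold arguments interact roughly as in the sketch below. The class name `AdaptiveThreshold` is hypothetical, and resetting the miss counter after each reduction is an assumption; check the pipeline script for the exact behavior.

```python
from dataclasses import dataclass, field


@dataclass
class AdaptiveThreshold:
    """Hypothetical helper mirroring the documented defaults."""
    value: float = 0.8             # --success_threshold
    reduction_amount: float = 0.1  # --threshold_reduction_amount
    reduction_cycle: int = 1       # --threshold_reduction_cycle
    min_value: float = 0.5         # --min_threshold
    misses: int = field(default=0, repr=False)

    def update(self, num_good_tasks):
        """Record one iteration's outcome and return the threshold to use next."""
        if num_good_tasks > 0:
            self.misses = 0
            return self.value
        self.misses += 1
        if self.misses >= self.reduction_cycle:
            # Round to avoid float drift such as 0.7000000000000001.
            lowered = round(self.value - self.reduction_amount, 10)
            self.value = max(self.min_value, lowered)
            self.misses = 0  # assumption: the counter restarts after a reduction
        return self.value


threshold = AdaptiveThreshold()
history = [threshold.update(n) for n in [0, 3, 0, 0, 0]]
print(history)
```

With the defaults, every iteration that yields no good tasks lowers the threshold by 0.1 until it reaches the 0.5 floor.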

Output:

Diffusion models: results/augmented_{iteration}/diffusion/
Synthetic data: results/augmented_{iteration}/diffusion/{model_name}/{task}/samples_0.npz
Policy checkpoints: results/augmented_{iteration}/policies/
Analysis logs: scripts/policies_slurm_logs/
Best test task dataset: results/best_testtask_dataset/
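
To locate and inspect a generated dataset, something like the sketch below may help. The helper names are hypothetical, and the array names stored inside samples_0.npz are not documented here, so the sketch only lists them. It relies on the fact that an .npz file is a zip archive of .npy members.

```python
import zipfile
from pathlib import Path


def samples_path(results_root, iteration, model_name, task):
    """Build the documented path to a task's synthetic samples file."""
    return (Path(results_root) / f"augmented_{iteration}" / "diffusion"
            / model_name / task / "samples_0.npz")


def list_arrays(npz_path):
    """Return the names of the arrays stored in an .npz archive."""
    with zipfile.ZipFile(npz_path) as archive:
        return sorted(Path(name).stem for name in archive.namelist())


# "my_model" and "some_task" are placeholders, not names from this repository.
path = samples_path("results", 1, "my_model", "some_task")
print(path.as_posix())
if path.exists():
    print(list_arrays(path))
```

Once the array names are known, `numpy.load(path)` can be used to read the arrays themselves.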

Semantic + Compositional RL Baseline with Transformer Policy

Run the transformer TD3+BC multitask baseline for comparison:

python3 -u -m scripts.run_transformer_baseline_pipeline \
    --num_train 14 \
    --seeds 10 11 12 13 14 \
    --memory 50 \
    --time 24 \
    2>&1 | tee multitask_Trans_OfflineRL_iwa_seed2.out

Key arguments:

--num_train: Number of training tasks (14 for IIWA subset)
--seeds: List of random seeds to run (e.g., 10 11 12 13 14)
--memory: Memory per job in GB (default: 50)
--time: Time limit per job in hours (default: 24)
--max_timesteps: Maximum training timesteps (default: 50000)
--batch_size: Batch size (default: 1792)

Output:

Model checkpoints: results/transformer_baseline/seed_{seed}/
Results CSV: results/transformer_baseline/transformer_baseline_results.csv
Training logs: scripts/transformer_baseline_logs/

Project Structure

.
├── data/                          # Dataset directory (see data/README.md)
├── results/                       # Experiment results
│   ├── augmented_{iteration}/     # Iterative diffusion results
│   └── transformer_baseline/      # Transformer baseline results
├── scripts/                       # Main scripts (including both baseline scripts
│                                  #   and large-scale scripts, which mirror the
│                                  #   representative structure shown below)
│   ├── automated_iterative_diffusion_dits_iiwa.py  # Main pipeline
│   ├── run_transformer_baseline_pipeline.py        # Transformer baseline
│   ├── train_augmented_diffusion.py               # Diffusion training
│   ├── train_augmented_policy.py                  # Policy training
│   └── generate_augmented_data_dits.py            # Data generation
├── diffusion/                     # Diffusion model code
├── corl/                          # Offline RL algorithms (TD3+BC, IQL)
├── config/                        # Configuration files
└── requirements.txt               # Python dependencies

Key Features

Semantic Compositional Architecture: Diffusion transformer with factorized components (robot, object, obstacle, objective)
Iterative Self-Improvement: Each iteration uses validated high-quality synthetic tasks to improve the diffusion model
Zero-shot Generation: Generates data for unseen task combinations without additional training
Automatic Retry: Failed jobs are automatically retried with increased resources
Curriculum Filtering: Component-specific curriculum filtering for iterations 5+ (optional)
Adaptive Threshold: Success threshold automatically reduces if no good tasks are found
Comprehensive Logging: Detailed logs and CSV analysis files for each iteration
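
The automatic retry feature can be pictured with the sketch below. It is an illustration only: the `submit` callable, the doubling factor, and the attempt limit are assumptions, not values taken from the pipeline. The starting memory and time match the documented `--memory` and `--time` defaults.

```python
def retry_with_more_resources(submit, memory_gb=50, time_hours=24,
                              max_attempts=3, growth=2):
    """Call submit(memory_gb, time_hours) until it succeeds.

    submit is a hypothetical callable that returns True on success.
    After each failure, both resource requests are multiplied by growth.
    """
    for attempt in range(1, max_attempts + 1):
        if submit(memory_gb, time_hours):
            return {"attempt": attempt, "memory_gb": memory_gb,
                    "time_hours": time_hours}
        memory_gb *= growth
        time_hours *= growth
    raise RuntimeError(f"job failed after {max_attempts} attempts")


# Toy job that only succeeds once it is given at least 100 GB of memory.
def toy_submit(memory_gb, time_hours):
    return memory_gb >= 100


result = retry_with_more_resources(toy_submit)
print(result)
```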

Configuration

Default paths are set in the script configuration classes. Modify these in the scripts if needed:

base_path: Project root directory
data_path: Path to expert datasets
results_path: Path to save results
tasks_path: Path to task list JSON files
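
As a rough picture of what such a configuration class might look like, here is a sketch using the four path names above. The class name `PipelineConfig` and the default sub-directories are assumptions (in particular the `tasks` folder is a placeholder); the actual classes in the scripts may be organized differently.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class PipelineConfig:
    """Hypothetical container for the paths listed above."""
    base_path: Path
    data_path: Optional[Path] = None
    results_path: Optional[Path] = None
    tasks_path: Optional[Path] = None

    def __post_init__(self):
        self.base_path = Path(self.base_path)
        # Unset paths default to sub-directories of the project root (assumed layout).
        self.data_path = Path(self.data_path or self.base_path / "data")
        self.results_path = Path(self.results_path or self.base_path / "results")
        self.tasks_path = Path(self.tasks_path or self.base_path / "tasks")


config = PipelineConfig(base_path="/path/to/project", results_path="/scratch/results")
print(config.data_path.as_posix())
print(config.results_path.as_posix())
```

Keeping every path derived from `base_path` unless overridden means a single edit is enough when moving the project between machines.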

Citation

If you use this code, please cite:

@article{pham2025iterative,
  title={Iterative Compositional Data Generation for Robot Control},
  author={Pham, Anh-Quan and Hussing, Marcel and Patankar, Shubhankar P. and Bassett, Dani S. and Mendez-Mendez, Jorge and Eaton, Eric},
  journal={arXiv preprint arXiv:2512.10891},
  year={2025},
}

Related Resources:

CompoSuite Benchmark: https://github.com/Lifelong-ML/CompoSuite
Datasets: Dryad

Contact

For inquiries, please contact Anh-Quan Pham.
