Insert Anything (AAAI'26 Oral)

Wensong Song · Hong Jiang · Zongxing Yang · Ruijie Quan · Yi Yang

Zhejiang University | Harvard University | Nanyang Technological University

🔥 News

[2025.7.4] Released a separate test dataset on Google Drive.
[2025.6.3] Separated the ComfyUI code into a new repository.
[2025.6.1] Released a new ComfyUI workflow! No need to download the full model folder!
[2025.5.23] Released the training code so users can reproduce results and adapt the pipeline to new tasks!
[2025.5.13] Released the AnyInsertion text-prompt dataset on HuggingFace.
[2025.5.9] Released a demo video of the Hugging Face Space, now available on YouTube and Bilibili.
[2025.5.7] Released a nunchaku inference demo that runs in 10GB of VRAM.
[2025.5.6] Added ComfyUI integration for easier workflow management.
[2025.5.6] Updated the inference demo to support 26GB of VRAM, at the cost of increased inference time.
[2025.4.26] Launched the online demo on HuggingFace.
[2025.4.25] Released the AnyInsertion mask-prompt dataset on HuggingFace.
[2025.4.22] Released the inference demo and pretrained checkpoint.

💡 Demo

For more demos and detailed examples, see our project page.

🛠️ Installation

Begin by cloning the repository:

git clone https://github.com/song-wensong/insert-anything
cd insert-anything

Installation Guide for Linux

Conda's installation instructions are available here.

conda create -n insertanything python==3.10

conda activate insertanything

pip install -r requirements.txt

⏬ Download Checkpoints

10GB VRAM:

Insert Anything Model: Download the main checkpoint from HuggingFace and replace /path/to/lora-for-nunchaku in inference_for_nunchaku.py with its local path.
FLUX.1-Fill-dev Model: This project relies on FLUX.1-Fill-dev and FLUX.1-Redux-dev as components. Download both checkpoints as well and replace /path/to/black-forest-labs-FLUX.1-Fill-dev and /path/to/black-forest-labs-FLUX.1-Redux-dev with their local paths.
Nunchaku-FLUX.1-Fill-dev Model: Download the main checkpoint from HuggingFace and replace /path/to/svdq-int4-flux.1-fill-dev with its local path.

26GB or 40GB VRAM:

Insert Anything Model: Download the main checkpoint from HuggingFace and replace /path/to/lora in inference.py and app.py with its local path.
FLUX.1-Fill-dev Model: This project relies on FLUX.1-Fill-dev and FLUX.1-Redux-dev as components. Download both checkpoints as well and replace /path/to/black-forest-labs-FLUX.1-Fill-dev and /path/to/black-forest-labs-FLUX.1-Redux-dev with their local paths.
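
As a rough illustration of the download step, the two FLUX components could be fetched with the huggingface_hub package. This is a sketch under assumptions, not the project's own tooling: it assumes huggingface_hub is installed, that the components live under the repo IDs black-forest-labs/FLUX.1-Fill-dev and black-forest-labs/FLUX.1-Redux-dev, and that a local folder layout of your choosing is acceptable. The helper names plan_downloads and download_all are hypothetical. The Insert Anything and Nunchaku checkpoints are left out because their repo IDs are not stated here.

```python
from pathlib import Path

# Placeholder path from this README -> Hugging Face repo ID (assumed, verify before use).
FLUX_REPOS = {
    "/path/to/black-forest-labs-FLUX.1-Fill-dev": "black-forest-labs/FLUX.1-Fill-dev",
    "/path/to/black-forest-labs-FLUX.1-Redux-dev": "black-forest-labs/FLUX.1-Redux-dev",
}


def plan_downloads(root):
    """Map each placeholder to (repo_id, local directory under root)."""
    root = Path(root)
    return {
        placeholder: (repo_id, root / repo_id.split("/")[-1])
        for placeholder, repo_id in FLUX_REPOS.items()
    }


def download_all(root):
    """Download every planned repo and print which placeholder it replaces."""
    # Third-party dependency, imported lazily so the planning step works without it.
    from huggingface_hub import snapshot_download

    for placeholder, (repo_id, local_dir) in plan_downloads(root).items():
        snapshot_download(repo_id=repo_id, local_dir=str(local_dir))
        print(f"replace {placeholder} with {local_dir}")
```

Calling download_all("checkpoints") would place each model in its own subfolder and print the path to paste over the matching placeholder. These repositories may require accepting the model license and logging in to Hugging Face first.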

🎥 Inference

10GB VRAM

We are very grateful to @judian17 for providing the nunchaku version of the LoRA. After downloading the required weights, install the appropriate version of nunchaku by following the official nunchaku repository.

python inference_for_nunchaku.py

26GB or 40GB VRAM

python inference.py
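
The choice between the two entry points can be summarized in a small helper. This is only a sketch of the thresholds stated in this README (10GB for the nunchaku script, 26GB and up for the standard script); the function name pick_inference_script is hypothetical and is not part of the repository.

```python
def pick_inference_script(vram_gb: float) -> str:
    """Return the inference script suggested by this README for a given VRAM size."""
    if vram_gb >= 26:
        # 26GB and 40GB setups both use the standard script.
        return "inference.py"
    if vram_gb >= 10:
        # Lower-memory setups use the nunchaku (quantized) variant.
        return "inference_for_nunchaku.py"
    raise ValueError(f"{vram_gb}GB is below the 10GB minimum listed in this README")


if __name__ == "__main__":
    for size in (12, 26, 40):
        print(size, "->", pick_inference_script(size))
```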

🖥️ Gradio

Using Command Line

python app.py

🧩 ComfyUI

We have created a dedicated repository for the ComfyUI workflow. Check it out and give it a try!

🧩 ComfyUI in community

We deeply appreciate the community of developers who have created innovative applications based on the Insert Anything model. Throughout this development process, we have received invaluable feedback. As we continue to enhance our models, we will carefully consider these insights to further optimize our models and provide users with a better experience.

Below is a selection of community‑created workflows along with their corresponding tutorials:

Workflow	Author	Tutorial
Insert Anything极速万物迁移图像编辑优化自动版 (Insert Anything: ultra-fast anything-transfer image editing, optimized automatic version)	T8star-Aix	YouTube | Bilibili

💡 Tips

🔷 To run the mask-prompt examples, you may need to obtain the corresponding masks first. You can use Grounded SAM or the draw_mask.py script provided in this repository:

python draw_mask.py

🔷 The mask must fully cover the area to be edited.
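
One way to sanity-check the coverage rule before running inference is sketched below. It assumes the mask is a 2-D grid in which nonzero pixels mark the region to edit and that the target area is given as a bounding box; both conventions, and the name mask_covers_box, are illustrative assumptions rather than the repository's API.

```python
def mask_covers_box(mask, box):
    """Return True if every pixel inside box is nonzero in mask.

    mask: 2-D sequence of rows (assumed: nonzero = area to edit).
    box:  (left, top, right, bottom) in pixels, right/bottom exclusive.
    """
    left, top, right, bottom = box
    if right <= left or bottom <= top:
        raise ValueError("box must have positive width and height")
    return all(
        mask[y][x]
        for y in range(top, bottom)
        for x in range(left, right)
    )


if __name__ == "__main__":
    mask = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
    ]
    print(mask_covers_box(mask, (1, 1, 3, 3)))  # box lies inside the masked area
    print(mask_covers_box(mask, (0, 1, 3, 3)))  # box extends past the masked area
```

With a NumPy array the same check can be written as mask[top:bottom, left:right].all().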

⏬ Download Dataset

AnyInsertion dataset: Download the AnyInsertion dataset from HuggingFace.

🚀 Training

🔷 Mask-prompt Training

Replace FLUX model paths: Replace /path/to/black-forest-labs-FLUX.1-Fill-dev and /path/to/black-forest-labs-FLUX.1-Redux-dev in experiments/config/insertanything.yaml with the local checkpoint paths.
Download mask-prompt dataset: Download the AnyInsertion mask-prompt dataset from HuggingFace.
Convert Parquet to images: Use the script parquet_to_image.py to convert the Parquet files to images.
Test (optional): To run testing during training, modify the test path in src/train/callbacks.py (line 350). By default, no testing is performed.
Run the training code: Run the following command:
```
bash scripts/train.sh
```
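
Before launching training, it can help to confirm that no /path/to/... placeholder is left in experiments/config/insertanything.yaml. The check below is a minimal sketch using only the standard library; the function name find_placeholders and the YAML keys in the sample are hypothetical, and the pattern simply looks for strings starting with /path/to/.

```python
import re
from pathlib import Path

# Matches the placeholder style used in this README, e.g. /path/to/lora
PLACEHOLDER = re.compile(r"/path/to/[\w.\-]+")


def find_placeholders(text):
    """Return the distinct /path/to/... placeholders still present, in order."""
    return list(dict.fromkeys(PLACEHOLDER.findall(text)))


if __name__ == "__main__":
    config = Path("experiments/config/insertanything.yaml")
    if config.exists():
        leftovers = find_placeholders(config.read_text(encoding="utf-8"))
        if leftovers:
            print("Still unreplaced:", ", ".join(leftovers))
        else:
            print("No placeholders left in", config)
```

The same function can be pointed at inference.py, app.py, or inference_for_nunchaku.py to catch paths that were missed there.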

🤝 Acknowledgement

We are grateful to the open-source projects that this work builds on.

Citation

@article{song2025insert,
  title={Insert Anything: Image Insertion via In-Context Editing in DiT},
  author={Song, Wensong and Jiang, Hong and Yang, Zongxing and Quan, Ruijie and Yang, Yi},
  journal={arXiv preprint arXiv:2504.15009},
  year={2025}
}
