song-wensong/insert-anything

Insert Anything (AAAI'26 Oral)

Wensong Song · Hong Jiang · Zongxing Yang · Ruijie Quan · Yi Yang

Paper PDF Project Page
Zhejiang University   |   Harvard University   |   Nanyang Technological University

🔥 News

  • [2025.7.4] Release a separate test dataset on Google Drive.
  • [2025.6.3] Move the ComfyUI code into a separate repository.
  • [2025.6.1] Release a new ComfyUI workflow! No need to download the full model folder!
  • [2025.5.23] Release the training code for users to reproduce results and adapt the pipeline to new tasks!
  • [2025.5.13] Release AnyInsertion text-prompt dataset on HuggingFace.
  • [2025.5.9] Release demo video of the Hugging Face Space, now available on YouTube and Bilibili.
  • [2025.5.7] Release inference for nunchaku demo to support 10GB VRAM.
  • [2025.5.6] Support ComfyUI integration for easier workflow management.
  • [2025.5.6] Update inference demo to support 26GB VRAM, at the cost of longer inference time.
  • [2025.4.26] Support online demo on HuggingFace.
  • [2025.4.25] Release AnyInsertion mask-prompt dataset on HuggingFace.
  • [2025.4.22] Release inference demo and pretrained checkpoint.

💡 Demo

Insert Anything Teaser

For more demos and detailed examples, check out our project page: Project Page

🛠️ Installation

Begin by cloning the repository:

git clone https://github.com/song-wensong/insert-anything
cd insert-anything

Installation Guide for Linux

Conda's installation instructions are available here.

conda create -n insertanything python==3.10

conda activate insertanything

pip install -r requirements.txt

⏬ Download Checkpoints

10GB VRAM:

  • Insert Anything Model: Download the main checkpoint from HuggingFace and replace /path/to/lora-for-nunchaku in inference_for_nunchaku.py.

  • FLUX.1-Fill-dev and FLUX.1-Redux-dev Models: This project relies on FLUX.1-Fill-dev and FLUX.1-Redux-dev as components. Download both checkpoints as well and replace /path/to/black-forest-labs-FLUX.1-Fill-dev and /path/to/black-forest-labs-FLUX.1-Redux-dev with their local paths.

  • Nunchaku-FLUX.1-Fill-dev Model: Download the main checkpoint from HuggingFace and replace /path/to/svdq-int4-flux.1-fill-dev.

26GB or 40GB VRAM:

  • Insert Anything Model: Download the main checkpoint from HuggingFace and replace /path/to/lora in inference.py and app.py.

  • FLUX.1-Fill-dev and FLUX.1-Redux-dev Models: This project relies on FLUX.1-Fill-dev and FLUX.1-Redux-dev as components. Download both checkpoints as well and replace /path/to/black-forest-labs-FLUX.1-Fill-dev and /path/to/black-forest-labs-FLUX.1-Redux-dev with their local paths.
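Before running any of the scripts, it can help to confirm that no /path/to/... placeholder was left unreplaced. The following is a minimal sketch, not part of the repository: the helper names find_placeholders and check_files are hypothetical, and the regular expression assumes the placeholders follow the /path/to/<name> style shown above.

```python
import re
from pathlib import Path

# Assumption: placeholders look like /path/to/<name>, as in this README.
PLACEHOLDER = re.compile(r"/path/to/[\w.\-]+")


def find_placeholders(text):
    """Return the distinct placeholders in text, in order of first appearance."""
    seen = []
    for match in PLACEHOLDER.findall(text):
        if match not in seen:
            seen.append(match)
    return seen


def check_files(paths):
    """Map each existing file to the placeholders it still contains."""
    report = {}
    for path in map(Path, paths):
        if path.is_file():
            found = find_placeholders(path.read_text(encoding="utf-8"))
            if found:
                report[str(path)] = found
    return report


if __name__ == "__main__":
    # Script names taken from this README; missing files are skipped.
    scripts = ["inference.py", "app.py", "inference_for_nunchaku.py"]
    for name, found in check_files(scripts).items():
        print(f"{name}: still contains {', '.join(found)}")
```

Run it from the repository root; it prints nothing once every placeholder has been replaced.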

🎥 Inference

10GB VRAM

We are very grateful to @judian17 for providing the nunchaku version of the LoRA. After downloading the required weights, install the appropriate version of nunchaku by following the official nunchaku repository.

python inference_for_nunchaku.py

26GB or 40GB VRAM

python inference.py
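The choice between the two inference scripts can be sketched as a small helper. This is only an illustration: the function name pick_inference_script is hypothetical, and treating 10 and 26 as minimum thresholds (rather than exact tiers) is an assumption based on the VRAM figures listed above.

```python
def pick_inference_script(vram_gb):
    """Suggest which inference script to run for a given amount of GPU memory.

    Assumption: the VRAM tiers in this README are minimum requirements.
    """
    if vram_gb >= 26:
        # Standard pipeline (the 26GB and 40GB tiers).
        return "inference.py"
    if vram_gb >= 10:
        # Quantized nunchaku pipeline (the 10GB tier).
        return "inference_for_nunchaku.py"
    raise ValueError(f"{vram_gb} GB is below the 10 GB minimum listed in this README")


if __name__ == "__main__":
    for gb in (12, 24, 40):
        print(gb, "GB ->", pick_inference_script(gb))
```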

🖥️ Gradio

Using Command Line

python app.py

🧩 ComfyUI

We have created a dedicated repository for the ComfyUI workflow. Check it out and give it a try!

🧩 ComfyUI in the Community

We deeply appreciate the community of developers who have created innovative applications based on the Insert Anything model. Throughout this development process, we have received invaluable feedback. As we continue to enhance our models, we will carefully consider these insights to further optimize our models and provide users with a better experience.

Below is a selection of community‑created workflows along with their corresponding tutorials:

  • Workflow: Insert Anything极速万物迁移图像编辑优化自动版 (Insert Anything ultra-fast object transfer and image editing, optimized automatic version)
    Author: T8star-Aix
    Tutorial: YouTube | Bilibili

💡 Tips

🔷 To run the mask-prompt examples, you need a mask for each image. You can generate one with Grounded SAM or with the provided draw_mask script:

python draw_mask.py 

🔷 The mask must fully cover the area to be edited.
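The coverage requirement can be checked mechanically. The sketch below is dependency-free and not part of the repository: the function name uncovered_pixels is hypothetical, and it assumes both the mask and the intended edit region are available as equally sized 2D grids of 0/1 values (the actual output format of draw_mask.py is not specified here).

```python
def uncovered_pixels(mask, region):
    """Return (row, col) positions that lie in region but are not covered by mask.

    Both arguments are 2D lists of 0/1 values with identical shape.
    An empty result means the mask fully covers the region.
    """
    same_shape = len(mask) == len(region) and all(
        len(m_row) == len(r_row) for m_row, r_row in zip(mask, region)
    )
    if not same_shape:
        raise ValueError("mask and region must have the same shape")
    return [
        (y, x)
        for y, row in enumerate(region)
        for x, value in enumerate(row)
        if value and not mask[y][x]
    ]


if __name__ == "__main__":
    region = [[0, 1, 1],
              [0, 1, 1]]
    mask = [[0, 1, 1],
            [0, 1, 0]]  # bottom-right pixel of the region is left unmasked
    print(uncovered_pixels(mask, region))
```

If the returned list is non-empty, enlarge the mask until it is empty.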

⏬ Download Dataset

  • AnyInsertion dataset: Download the AnyInsertion dataset from HuggingFace.

🚀 Training

🔷 Mask-prompt Training

  • Replace FLUX model paths: Replace /path/to/black-forest-labs-FLUX.1-Fill-dev and /path/to/black-forest-labs-FLUX.1-Redux-dev in experiments/config/insertanything.yaml with your local checkpoint paths.

  • Download mask-prompt dataset: Download the AnyInsertion mask-prompt dataset from HuggingFace.

  • Convert parquet to image: Use the script parquet_to_image.py to convert Parquet files to images.

  • Test (optional): To run tests during training, set the test path in src/train/callbacks.py (line 350). By default, no testing is performed.

  • Run the training code:

    bash scripts/train.sh

🤝 Acknowledgement

We thank the authors of the following open-source projects:

Citation

@article{song2025insert,
  title={Insert Anything: Image Insertion via In-Context Editing in DiT},
  author={Song, Wensong and Jiang, Hong and Yang, Zongxing and Quan, Ruijie and Yang, Yi},
  journal={arXiv preprint arXiv:2504.15009},
  year={2025}
}
