- 2025-03-17: Our paper DreamRenderer is now available on arXiv, and the Supplementary Material is released.
- 2025-03-20: We have released the code!
- 2025-05-20: We have released the code for integrating DreamRenderer with SD3.
DreamRenderer is a training-free method built upon the FLUX model that enables users to precisely control the content of each instance through bounding boxes or masks while ensuring overall visual harmony.
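To make the input concrete: a layout is a global prompt plus, for each instance, a description and a region. The snippet below is only an illustration of that idea, assuming hypothetical field names and normalized [x0, y0, x1, y1] boxes; the actual interface is defined by the demo scripts under scripts/.
# Illustration only: hypothetical field names and normalized [x0, y0, x1, y1] boxes.
# The released demo scripts (scripts/inference_demo*.py) define the real interface.
global_prompt = "a cozy living room, soft afternoon light"
instances = [
    {"caption": "a red leather sofa",       "bbox": [0.05, 0.55, 0.60, 0.95]},
    {"caption": "a golden retriever puppy", "bbox": [0.62, 0.60, 0.92, 0.95]},
]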
- arXiv Paper & Supplementary Material
- Inference Code
- More demos. Coming soon, stay tuned!
- ComfyUI support
- Huggingface Space support
Download the checkpoint for SAM2, sam2_hiera_large.pt, and place it in the pretrained_weights directory as shown below:
├── pretrained_weights
│   ├── sam2_hiera_large.pt
├── DreamRenderer
│   ├── ...
├── scripts
│   ├── ...
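If you would rather fetch the checkpoint from a script, here is a minimal Python sketch, assuming the download URL published with the official SAM2 release (please verify it against the segment-anything-2 repository before relying on it):
# Minimal download sketch; the URL is assumed to be the official SAM2 release URL.
import os
import urllib.request

URL = "https://dl.fbaipublicfiles.com/segment_anything_2/072824/sam2_hiera_large.pt"
os.makedirs("pretrained_weights", exist_ok=True)
urllib.request.urlretrieve(URL, "pretrained_weights/sam2_hiera_large.pt")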
# Create and activate conda environment
conda create -n dreamrenderer python=3.10 -y
conda activate dreamrenderer
# Install dependencies
pip install -r requirements.txt
pip install -e .
# Install segment-anything-2
cd segment-anything-2
pip install -e . --no-deps
cd ..
You can quickly use DreamRenderer for precise rendering with the following commands:
python scripts/inference_demo0.py --use_sam_enhance
python scripts/inference_demo1.py --use_sam_enhance
python scripts/inference_demo2.py --num_hard_control_steps=15
In the original paper, we used FLUX-depth and FLUX-canny for image-conditioned generation. Now, we also provide a script that supports image-conditioned generation via ControlNet:
python scripts/inferenceCN_demo0.py --res=768
To further demonstrate the generalizability of our method, we integrated DreamRenderer with another DiT-based architecture, SD3. We use ControlNet to guide generation based on depth:
python scripts/inference_demo5.py --use_sam_enhance
DreamRenderer supports re-rendering outputs from state-of-the-art Layout-to-Image models, enhancing image quality and allowing for fine-grained control over each instance in the layout.
Here's how it works (a short conceptual sketch follows the list):
- A Layout-to-Image method first generates a coarse image based on the input layout.
- We extract a depth map from this image.
- DreamRenderer then re-renders the scene, guided by the original layout, to produce a higher-quality and more faithful result.
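The three helpers in the sketch below are hypothetical placeholders for what scripts/inference_demo3.py and scripts/inference_demo4.py do internally; you do not need to write this yourself.
# Conceptual sketch only -- run_layout_to_image, extract_depth, and dreamrenderer_render
# are hypothetical placeholders; the real wiring lives in scripts/inference_demo3.py / _demo4.py.
def rerender_layout(layout, global_prompt):
    # 1. A Layout-to-Image model (e.g. MIGC or InstanceDiffusion) draws a coarse image.
    coarse_image = run_layout_to_image(layout, global_prompt)
    # 2. A depth map is extracted from the coarse image (we use Depth-Anything v2).
    depth_map = extract_depth(coarse_image)
    # 3. DreamRenderer re-renders the scene conditioned on the depth map, with the
    #    original layout controlling the content of each instance.
    return dreamrenderer_render(depth_map, layout, global_prompt)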
We use Depth-Anything v2 for extracting depth maps. To enable this feature, follow these steps:
cd Depth-Anything-V2
pip install -e .
cd ..
Download the Depth-Anything v2 model (depth_anything_v2_vitl.pth) and place it in the pretrained_weights directory:
├── pretrained_weights
│   ├── depth_anything_v2_vitl.pth
├── DreamRenderer
│   ├── ...
├── scripts
│   ├── ...
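For reference, the following sketch shows how a depth map can be extracted once the checkpoint is in place. It follows the usage documented in the Depth-Anything-V2 repository; the ViT-L constructor arguments are taken from that README and are an assumption here, so double-check them upstream rather than treating this as the exact call our scripts make.
# Depth extraction sketch following the Depth-Anything-V2 README; the ViT-L constructor
# arguments are assumptions -- verify them against the upstream repository.
import cv2
import torch
from depth_anything_v2.dpt import DepthAnythingV2

model = DepthAnythingV2(encoder="vitl", features=256, out_channels=[256, 512, 1024, 1024])
model.load_state_dict(torch.load("pretrained_weights/depth_anything_v2_vitl.pth", map_location="cpu"))
model = model.to("cuda" if torch.cuda.is_available() else "cpu").eval()

raw_image = cv2.imread("coarse_layout_image.png")  # BGR image, e.g. the coarse layout-to-image output
depth = model.infer_image(raw_image)               # HxW numpy array of relative depth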
Once everything is set up, you can run the following commands to achieve end-to-end layout-to-image generation.
End-to-end layout-to-image generation with MIGC (download MIGC_SD14.ckpt and put it in pretrained_weights):
python scripts/inference_demo3.py --res=768 --use_sam_enhance --num_hard_control_steps=15
End-to-end layout-to-image generation with InstanceDiffusion (download instancediffusion_sd15.pth and put it in pretrained_weights):
python scripts/inference_demo4.py --use_sam_enhance --num_hard_control_steps=10 --res=768
We will soon integrate with more SOTA layout-to-image methods. Stay tuned!
We would like to thank the developers of FLUX, Segment Anything Model, Depth-Anything, diffusers, CLIP, and other open-source projects that made this work possible. We appreciate their outstanding contributions.
If you find this repository useful, please cite using the following BibTeX entry:
@misc{zhou2025dreamrenderer,
title={DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models},
author={Dewei Zhou and Mingwei Li and Zongxin Yang and Yi Yang},
year={2025},
eprint={2503.12885},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2503.12885},
}
If you have any questions or suggestions, please feel free to contact us!