FlowAlign: Trajectory-Regularized, Inversion-Free Flow-based Image Editing

This repository is the official implementation of FlowAlign, an inversion-free and training-free image editing algorithm.

concept

Abstract

💡 Recent inversion-free, flow-based editors leverage models like Stable Diffusion 3 to enable text-driven image editing via ODE integration.

🤔 However, skipping latent inversion often leads to unstable trajectories and poor source consistency.

🚀 FlowAlign addresses this by introducing a flow-matching loss—a simple yet effective regularizer that ensures smooth, semantically aligned, and structurally consistent edits.

🌟 Thanks to its ODE-based formulation, FlowAlign naturally supports reverse editing, highlighting its reversible and robust transformation capability.

Requirements

Clone this repo:

git clone https://github.com/FlowAlign/FlowAlign.git
cd FlowAlign

To install requirements:

conda create -n flowalign python==3.11
conda activate flowalign
pip install torch==2.1.2+cu118 torchvision==0.16.2+cu118 --extra-index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
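
After installation, a quick check along the following lines can confirm that the pinned packages were picked up. This is a minimal sketch, not part of the repository: the `check_environment` helper is a hypothetical name, and the expected versions are simply copied from the pip command above.

```python
import sys
from importlib import metadata

# Versions copied from the pip command in this README.
EXPECTED = {"torch": "2.1.2+cu118", "torchvision": "0.16.2+cu118"}


def check_environment(expected=EXPECTED):
    """Report whether each expected package is installed at the pinned version."""
    report = {"python": f"{sys.version_info.major}.{sys.version_info.minor}"}
    for name, wanted in expected.items():
        try:
            # Reads the installed distribution metadata without importing the package.
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if found == wanted else f"found {found}, expected {wanted}"
    return report


if __name__ == "__main__":
    for key, status in check_environment().items():
        print(f"{key}: {status}")
```

Run it inside the activated `flowalign` environment; any line that does not read `ok` points at a package to reinstall.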

Quick Start

For text-based image editing, run:

Example 1

python run_edit.py \
  --img_path "samples/bicycle.jpg" \
  --src_prompt "a slanted mountain bicycle on the road in front of a building" \
  --tgt_prompt "a slanted rusty mountain bicycle on the road in front of a building"

The expected result:

result

Example 2

python run_edit.py \
  --img_path "samples/cat.jpg" \
  --src_prompt "a opened eyes cat sitting on wooden floor" \
  --tgt_prompt "a closed eyes cat sitting on wooden floor"

The expected result:

result2

How to choose editing methods

You can switch the editing method with the following argument:

  • method : dual / sdedit / flowedit / flowalign
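
To compare the four methods on the same image, the commands can be generated in a loop. The sketch below only builds and prints the command lines. It assumes the option is spelled `--method`, following the style of the other flags; check `run_edit.py` for the exact name. `build_edit_command` is a hypothetical helper, not something the repository provides.

```python
import shlex

# Method names as listed in this README.
METHODS = ("dual", "sdedit", "flowedit", "flowalign")


def build_edit_command(img_path, src_prompt, tgt_prompt, method):
    """Build the argument list for one run_edit.py call.

    Assumption: the method is passed as `--method`.
    """
    if method not in METHODS:
        raise ValueError(f"unknown method {method!r}; expected one of {METHODS}")
    return [
        "python", "run_edit.py",
        "--img_path", img_path,
        "--src_prompt", src_prompt,
        "--tgt_prompt", tgt_prompt,
        "--method", method,
    ]


if __name__ == "__main__":
    for m in METHODS:
        cmd = build_edit_command(
            "samples/cat.jpg",
            "a opened eyes cat sitting on wooden floor",
            "a closed eyes cat sitting on wooden floor",
            m,
        )
        # shlex.join quotes the prompts so each line can be pasted into a shell.
        print(shlex.join(cmd))
```

Passing each list to `subprocess.run(cmd, check=True)` instead of printing it would execute the edits one after another.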

Efficient inference

With --efficient_memory, the text encoder pre-computes the text embeddings and is then removed from the GPU.

This makes it possible to run image editing on a single GPU with 24 GB of VRAM.
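
The idea behind this option can be illustrated with a small sketch: encode every prompt once, keep only the embeddings, and move the encoder off the GPU before the editing loop starts. This is not the repository's code. `precompute_text_embeddings` and `ToyEncoder` are hypothetical names, and the toy encoder only mimics the call and `.to()` interface of a real PyTorch module.

```python
def precompute_text_embeddings(text_encoder, prompts, offload_device="cpu"):
    """Encode each prompt once, then move the encoder off the accelerator."""
    embeddings = {prompt: text_encoder(prompt) for prompt in prompts}
    # The encoder is no longer needed once the embeddings exist.
    text_encoder.to(offload_device)
    return embeddings


class ToyEncoder:
    """Stand-in for a real text encoder (illustration only)."""

    def __init__(self):
        self.device = "cuda"

    def to(self, device):
        self.device = device
        return self

    def __call__(self, prompt):
        # Fake "embedding": one number per word.
        return [float(len(word)) for word in prompt.split()]


if __name__ == "__main__":
    encoder = ToyEncoder()
    cache = precompute_text_embeddings(encoder, ["a cat", "a closed eyes cat"])
    print(encoder.device, list(cache))
```

With a real PyTorch encoder, calling `torch.cuda.empty_cache()` after the offload returns the freed cached memory to the device, leaving that room for the image model.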

Reproducibility

All edited images were generated on a single NVIDIA RTX 3090 GPU, using a fixed random seed of 123 and a Classifier-Free Guidance (CFG) scale of 13.5.
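
For scripts built around this repository, the seed can be fixed in one place. The helper below is a minimal sketch with a hypothetical name (`seed_everything`); how `run_edit.py` itself applies the seed is not shown in this README. NumPy and PyTorch are seeded only if they are installed.

```python
import random

SEED = 123        # seed stated in this README
CFG_SCALE = 13.5  # CFG scale stated in this README


def seed_everything(seed=SEED):
    """Seed Python's RNG and, when installed, NumPy and PyTorch."""
    random.seed(seed)
    seeded = ["random"]
    try:
        import numpy as np
        np.random.seed(seed)
        seeded.append("numpy")
    except ImportError:
        pass
    try:
        import torch
        # Seeds the CPU generator and the CUDA generators.
        torch.manual_seed(seed)
        seeded.append("torch")
    except ImportError:
        pass
    return seeded


if __name__ == "__main__":
    print("seeded:", seed_everything())
```

Results can still differ slightly across GPU models and library versions even with a fixed seed, so an exact match is most likely on the hardware listed above.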
