QuanjianSong/UniVST

UniVST: A Unified Framework for Training-free Localized Video Style Transfer [Official PyTorch Code]

1 Key Laboratory of Multimedia Trusted Perception and Efficient Computing,
Ministry of Education of China, Xiamen University, China.
2 Kunlun Skywork AI.   † Corresponding Author.

Paper PDF     Project Page     Hugging Face    

🎉 News

• 2025.10: 🔥 UniVST now supports five backbones, including advanced rectified-flow models.
• 2025.09: 🔥 The code has been reorganized and several bugs have been fixed.
• 2025.05: 🔥 The project page of UniVST is now available.
• 2025.01: 🔥 The official code of UniVST has been released.
• 2024.10: 🔥 The paper of UniVST has been submitted to arXiv.

🎬 Overview

We propose UniVST, a unified framework for training-free localized video style transfer based on diffusion models. UniVST first applies DDIM inversion to the original video and the style image to obtain their initial noise, and integrates Point-Matching Mask Propagation to generate masks for the object regions. It then performs AdaIN-Guided Localized Video Stylization with a three-branch architecture for information interaction. In addition, Sliding-Window Consistent Smoothing is incorporated into the denoising process to enhance temporal consistency in the latent space. [Figure: Overall Framework]
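As a rough illustration of the AdaIN operation that the stylization stage is named after, the sketch below aligns the per-channel mean and standard deviation of a content feature with those of a style feature. It uses plain Python lists in place of latent tensors and is not the repository's implementation; the names `channel_stats` and `adain` and the `eps` value are assumptions made for this example.

```python
import math

def channel_stats(values, eps=1e-5):
    """Return (mean, std) of one feature channel given as a flat list."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var + eps)  # eps keeps the std away from zero

def adain(content, style, eps=1e-5):
    """Adaptive Instance Normalization, channel by channel.

    content, style: lists of channels, each channel a flat list of floats.
    Each content channel is normalized with its own statistics and then
    rescaled and shifted to the statistics of the matching style channel.
    """
    out = []
    for c_chan, s_chan in zip(content, style):
        c_mean, c_std = channel_stats(c_chan, eps)
        s_mean, s_std = channel_stats(s_chan, eps)
        out.append([(v - c_mean) / c_std * s_std + s_mean for v in c_chan])
    return out

# Toy example: one channel with four values (hypothetical data).
content = [[0.0, 1.0, 2.0, 3.0]]
style = [[10.0, 20.0, 30.0, 40.0]]
stylized = adain(content, style)
```

After the call, the stylized channel keeps the ordering of the content values but carries the mean and spread of the style channel.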

🔧 Environment

git clone https://github.com/QuanjianSong/UniVST.git
cd UniVST
# Option 1: install with requirements.txt
conda create -n UniVST python=3.10
conda activate UniVST
pip install -r requirements.txt
# Option 2: install with environment.yaml
conda env create -f environment.yaml
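Once the environment is created, a quick check along the following lines can confirm that the interpreter and PyTorch are visible before launching a long run. This is only a sketch: the function name `check_environment` is hypothetical, and the list of packages to look for is an assumption that you may want to extend from requirements.txt.

```python
import importlib.util
import sys

def check_environment(required=("torch",), min_python=(3, 10)):
    """Report whether the interpreter and the required packages look usable.

    Returns a dict mapping each check name to True/False. Packages are only
    located with find_spec, they are not imported.
    """
    report = {"python": sys.version_info[:2] >= min_python}
    for name in required:
        report[name] = importlib.util.find_spec(name) is not None
    return report

if __name__ == "__main__":
    for check, ok in check_environment().items():
        print(f"{check}: {'ok' if ok else 'MISSING'}")
```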

🚀 Start

We provide five different backbone options: SD-v1.5, SD-v2.1, Animatediff-v2, SD-v3.0, and SD-v3.5. You can freely choose the backbone for your video stylization tasks.
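The five backbones fall into three families, each with its own one-click script (named in the sections that follow) and its own value for the --backbone flag. The mapping can be sketched as below; the helper name `launch_command` is hypothetical, and how a specific version inside a family is selected is not covered by this sketch.

```python
# Keys are the --backbone values used in the commands of this README.
LAUNCH_SCRIPTS = {
    "sd": "scripts/start_sd.sh",                    # SD-v1.5 / SD-v2.1
    "animatediff": "scripts/start_animatediff.sh",  # Animatediff-v2
    "sd3": "scripts/start_sd3.sh",                  # SD-v3.0 / SD-v3.5
}

def launch_command(backbone):
    """Return the argv list that starts the one-click script of a backbone."""
    if backbone not in LAUNCH_SCRIPTS:
        raise ValueError(
            f"unknown backbone {backbone!r}; expected one of {sorted(LAUNCH_SCRIPTS)}"
        )
    return ["sh", LAUNCH_SCRIPTS[backbone]]
```

The returned list can be passed to `subprocess.run(..., check=True)` from the repository root.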

SD-v1.5/SD-v2.1

To get stylized results in one step, run sh scripts/start_sd.sh. Alternatively, follow the steps below to customize each stage.

• 1. Perform inversion for the original video.

CUDA_VISIBLE_DEVICES=1 python src/sd/run_content_inversion_sd.py \
                        --content_path examples/contents/mallard-fly \
                        --output_path results/contents-inv \
                        --is_opt

The content inversion result is saved to results/contents-inv/sd/mallard-fly.

• 2. Perform inversion for the style image.

CUDA_VISIBLE_DEVICES=1 python src/sd/run_style_inversion_sd.py \
                        --style_path examples/styles/00033.png \
                        --output_path results/styles-inv

The style inversion result is saved to results/styles-inv/sd/00033.
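Both inversion steps above write to a directory of the form <output_path>/<backbone>/<input name>. The helper below reproduces that pattern so later steps can locate the results; it is a sketch inferred from the example paths in this README, and the function name `inversion_dir` is hypothetical.

```python
from pathlib import PurePosixPath

def inversion_dir(output_path, backbone, input_path):
    """Directory where an inversion run is expected to store its results.

    The input name is the last path component without a file extension, so a
    frame folder keeps its name and a style image loses its ".png".
    Note: a folder name containing a dot would be truncated at the last dot.
    """
    name = PurePosixPath(input_path).stem
    return str(PurePosixPath(output_path) / backbone / name)

content_dir = inversion_dir("results/contents-inv", "sd", "examples/contents/mallard-fly")
style_dir = inversion_dir("results/styles-inv", "sd", "examples/styles/00033.png")
```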

• 3. Perform mask propagation. [Optional: you can instead supply your own masks and skip this step.]

CUDA_VISIBLE_DEVICES=1 python src/mask_propagation.py \
                        --feature_path results/contents-inv/sd/mallard-fly/features/inversion_feature_map_2_block_301_step.pt \
                        --backbone 'sd' \
                        --mask_path 'examples/masks/mallard-fly.png' \
                        --output_path 'results/masks'

The mask propagation result is saved to results/masks/sd/mallard-fly.
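The --feature_path argument has to point into the content inversion output of the same backbone that is passed to --backbone. The sketch below splits such a path into its parts so the two can be compared. The layout <output>/<backbone>/<content>/features/<file> and the reading of the two numbers in the file name as a block index and a step index are assumptions based on the example path above; `describe_feature_path` is a hypothetical helper.

```python
import re
from pathlib import PurePosixPath

FEATURE_RE = re.compile(r"inversion_feature_map_(\d+)_block_(\d+)_step\.pt$")

def describe_feature_path(feature_path):
    """Split a feature path of the form used above into its components."""
    path = PurePosixPath(feature_path)
    match = FEATURE_RE.search(path.name)
    if match is None or path.parent.name != "features" or len(path.parts) < 4:
        raise ValueError(f"unexpected feature path: {feature_path}")
    return {
        "backbone": path.parts[-4],    # directory written by the inversion step
        "content": path.parts[-3],     # name of the content video
        "block": int(match.group(1)),  # assumed: block index
        "step": int(match.group(2)),   # assumed: step index
    }

info = describe_feature_path(
    "results/contents-inv/sd/mallard-fly/features/inversion_feature_map_2_block_301_step.pt"
)
```

Comparing `info["backbone"]` with the value given to --backbone catches a mismatched path before the script runs.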

• 4. Perform localized video style transfer. [Optionally, omit --mask_path to apply the style to the entire video.]

CUDA_VISIBLE_DEVICES=1 python src/sd/run_video_style_transfer_sd.py \
                        --content_inv_path results/contents-inv/sd/mallard-fly/inversion \
                        --style_inv_path results/styles-inv/sd/00033/inversion \
                        --mask_path results/masks/sd/mallard-fly \
                        --output_path results/stylizations

The stylization result is saved to results/stylizations/sd/mallard-fly_00033.
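The output folder name combines the content and style names as <content>_<style>. A small helper, sketched here from the example paths alone (the name `stylization_dir` is hypothetical), shows how the final location follows from the arguments of step 4:

```python
from pathlib import PurePosixPath

def stylization_dir(output_path, backbone, content_inv_path, style_inv_path):
    """Expected result folder: <output_path>/<backbone>/<content>_<style>.

    Both inversion paths end in ".../<name>/inversion", so the name is the
    parent directory of the final component.
    """
    content = PurePosixPath(content_inv_path).parent.name
    style = PurePosixPath(style_inv_path).parent.name
    return str(PurePosixPath(output_path) / backbone / f"{content}_{style}")

result_dir = stylization_dir(
    "results/stylizations",
    "sd",
    "results/contents-inv/sd/mallard-fly/inversion",
    "results/styles-inv/sd/00033/inversion",
)
```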

Animatediff-v2

First, download the motion module into the ckpts directory.
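Before starting the Animatediff scripts, it can help to confirm that the download actually landed in ckpts. The sketch below only lists whatever files are present; it does not know the expected file name of the motion module, and `find_checkpoints` is a hypothetical helper.

```python
from pathlib import Path

def find_checkpoints(ckpt_dir="ckpts"):
    """Return the files found under ckpt_dir, sorted by path (empty if missing)."""
    root = Path(ckpt_dir)
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob("*") if p.is_file())

if __name__ == "__main__":
    files = find_checkpoints()
    if not files:
        print("ckpts is empty or missing: download the motion module first")
    for f in files:
        print(f)
```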

Then, to get stylized results in one step, run sh scripts/start_animatediff.sh. Alternatively, follow the steps below to customize each stage.

• 1. Perform inversion for the original video.

CUDA_VISIBLE_DEVICES=1 python src/animatediff/run_content_inversion_animatediff.py \
                        --content_path examples/contents/mallard-fly \
                        --output_path results/contents-inv \
                        --is_opt

The content inversion result is saved to results/contents-inv/animatediff/mallard-fly.

• 2. Perform inversion for the style image.

CUDA_VISIBLE_DEVICES=1 python src/animatediff/run_style_inversion_animatediff.py \
                        --style_path examples/styles/00033.png \
                        --output_path results/styles-inv

The style inversion result is saved to results/styles-inv/animatediff/00033.

• 3. Perform mask propagation. [Optional: you can instead supply your own masks and skip this step.]

CUDA_VISIBLE_DEVICES=1 python src/mask_propagation.py \
                        --feature_path results/contents-inv/animatediff/mallard-fly/features/inversion_feature_map_2_block_301_step.pt \
                        --backbone 'animatediff' \
                        --mask_path 'examples/masks/mallard-fly.png' \
                        --output_path 'results/masks'

The mask propagation result is saved to results/masks/animatediff/mallard-fly.

• 4. Perform localized video style transfer. [Optionally, omit --mask_path to apply the style to the entire video.]

CUDA_VISIBLE_DEVICES=1 python src/animatediff/run_video_style_transfer_animatediff.py \
                        --content_inv_path results/contents-inv/animatediff/mallard-fly/inversion \
                        --style_inv_path results/styles-inv/animatediff/00033/inversion \
                        --mask_path results/masks/animatediff/mallard-fly \
                        --output_path results/stylizations

The stylization result is saved to results/stylizations/animatediff/mallard-fly_00033.
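The four steps above share one path convention, so they can be assembled programmatically. The following sketch builds the commands as argument lists without running them, for the "sd" and "animatediff" backbones whose script names are shown above. The function name `build_pipeline` and its parameters are hypothetical, and passing mask=None is assumed to correspond to the optional "omit --mask_path" variant.

```python
import shlex

def build_pipeline(backbone, content, style_image, mask=None, results="results"):
    """Return the walkthrough commands as argv lists (nothing is executed)."""
    content_name = content.rstrip("/").split("/")[-1]
    style_name = style_image.split("/")[-1].rsplit(".", 1)[0]
    src = f"src/{backbone}"
    content_inv = f"{results}/contents-inv/{backbone}/{content_name}"
    style_inv = f"{results}/styles-inv/{backbone}/{style_name}"

    steps = [
        ["python", f"{src}/run_content_inversion_{backbone}.py",
         "--content_path", content,
         "--output_path", f"{results}/contents-inv", "--is_opt"],
        ["python", f"{src}/run_style_inversion_{backbone}.py",
         "--style_path", style_image,
         "--output_path", f"{results}/styles-inv"],
    ]
    transfer = ["python", f"{src}/run_video_style_transfer_{backbone}.py",
                "--content_inv_path", f"{content_inv}/inversion",
                "--style_inv_path", f"{style_inv}/inversion"]
    if mask is not None:
        # Step 3 is only needed when a localized (masked) transfer is wanted.
        steps.append(
            ["python", "src/mask_propagation.py",
             "--feature_path",
             f"{content_inv}/features/inversion_feature_map_2_block_301_step.pt",
             "--backbone", backbone, "--mask_path", mask,
             "--output_path", f"{results}/masks"])
        transfer += ["--mask_path", f"{results}/masks/{backbone}/{content_name}"]
    transfer += ["--output_path", f"{results}/stylizations"]
    steps.append(transfer)
    return steps

commands = build_pipeline("animatediff", "examples/contents/mallard-fly",
                          "examples/styles/00033.png", "examples/masks/mallard-fly.png")
for argv in commands:
    print(shlex.join(argv))
```

To execute the steps, each list can be handed to `subprocess.run(argv, check=True)` with CUDA_VISIBLE_DEVICES set in the environment, as in the shell commands above.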

SD-v3.0/SD-v3.5

To get stylized results in one step, run sh scripts/start_sd3.sh. Alternatively, follow the steps below to customize each stage.

• 1. Perform inversion for the original video.

CUDA_VISIBLE_DEVICES=1 python src/sd3/run_content_inversion_sd3.py \
                        --content_path examples/contents/mallard-fly \
                        --output_path results/content-inv \
                        --is_rf_solver

The content inversion result is saved to results/content-inv/sd3/mallard-fly.

• 2. Perform inversion for the style image.

CUDA_VISIBLE_DEVICES=1 python src/sd3/run_style_inversion_sd3.py \
                        --style_path examples/styles/00033.png \
                        --output_path results/style-inv \
                        --is_rf_solver # use rf_solver

The style inversion result is saved to results/style-inv/sd3/00033.

• 3. Perform mask propagation. [Optional: you can instead supply your own masks and skip this step.]

CUDA_VISIBLE_DEVICES=1 python src/mask_propagation.py \
                        --feature_path results/content-inv/sd3/mallard-fly/features/inversion_feature_map_2_block_301_step.pt \
                        --backbone 'sd3' \
                        --mask_path 'examples/masks/mallard-fly.png' \
                        --output_path 'results/masks'

The mask propagation result is saved to results/masks/sd3/mallard-fly.

• 4. Perform localized video style transfer. [Optionally, omit --mask_path to apply the style to the entire video.]

CUDA_VISIBLE_DEVICES=1 python src/sd3/run_video_style_transfer_sd3.py \
                        --content_inv_path results/content-inv/sd3/mallard-fly/inversion \
                        --style_inv_path results/style-inv/sd3/00033/inversion \
                        --mask_path results/masks/sd3/mallard-fly \
                        --output_path results/stylization

The stylization result is saved to results/stylization/sd3/mallard-fly_00033.
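Because each step reads what the previous step wrote, every path in a command should refer to the same backbone directory. A simple check along these lines can catch a mixed-up path before a long run; the helper names are hypothetical, and the check only looks for the three backbone directory names used in this README.

```python
KNOWN_BACKBONES = ("sd", "animatediff", "sd3")

def backbones_in(argv):
    """Return the set of backbone names that appear as path components in argv."""
    found = set()
    for arg in argv:
        for part in arg.split("/"):
            if part in KNOWN_BACKBONES:
                found.add(part)
    return found

def is_consistent(argv):
    """True when at most one backbone is referenced across all arguments."""
    return len(backbones_in(argv)) <= 1

ok = is_consistent([
    "python", "src/sd3/run_content_inversion_sd3.py",
    "--content_path", "examples/contents/mallard-fly",
    "--output_path", "results/content-inv", "--is_rf_solver",
])
mixed = is_consistent([
    "--content_inv_path", "results/content-inv/animatediff/mallard-fly/inversion",
    "--mask_path", "results/masks/sd3/mallard-fly",
])
```

Here `ok` is True because only sd3 appears, while `mixed` is False because the arguments reference both animatediff and sd3.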

🎓 BibTeX

🤗 If you find this code helpful for your research, please cite:

@article{song2024univst,
  title={UniVST: A Unified Framework for Training-free Localized Video Style Transfer},
  author={Song, Quanjian and Lin, Mingbao and Zhan, Wengyi and Yan, Shuicheng and Cao, Liujuan and Ji, Rongrong},
  journal={arXiv preprint arXiv:2410.20084},
  year={2024}
}
