Paper: HawkI: Homography & Mutual Information Guidance for 3D-free Single Image to Aerial View (arXiv March 2024)
Please cite our paper if you find it useful.
@article{kothandaraman2023aerialbooth,
title={AerialBooth: Mutual Information Guidance for Text Controlled Aerial View Synthesis from a Single Image},
author={Kothandaraman, Divya and Zhou, Tianyi and Lin, Ming and Manocha, Dinesh},
journal={arXiv preprint arXiv:2311.15478},
year={2023}
}
Datasets: The AerialBooth-Real and AerialBooth-Syn datasets can be found in the ./dataset/ folder.
Models: The PyTorch code for the models is available in the ./models/ folder.
models/aerialbooth - Model definition for AerialBooth
models/aerialbooth_viewarg - Support for generating arbitrary text-controlled views
models/mutual_information - Functions for computing mutual information and earth mover's distance
models/aerialdiffusion_lora - Model definition for Aerial Diffusion LoRA
models/dreambooth_lora - Model definition for DreamBooth LoRA
models/imagic - Model definition for Imagic LoRA
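As a rough illustration of the two quantities named for models/mutual_information, the sketch below estimates the mutual information between two images from a joint intensity histogram and computes the earth mover's distance between their intensity distributions with scipy. This is not the repository's implementation: the function names, the bin count, and the histogram-based estimator are assumptions made for illustration only.

```python
import numpy as np
from scipy.stats import wasserstein_distance


def mutual_information(img_a, img_b, bins=32):
    """Histogram-based mutual information estimate (in nats).

    Hypothetical helper: both inputs must contain the same number of pixels.
    """
    a = np.asarray(img_a, dtype=np.float64).ravel()
    b = np.asarray(img_b, dtype=np.float64).ravel()
    # Joint histogram of paired pixel intensities, normalised to a distribution.
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    pxy = joint / joint.sum()
    # Marginals, kept 2-D so their product has the same shape as pxy.
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nonzero = pxy > 0  # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nonzero] * np.log(pxy[nonzero] / (px @ py)[nonzero])))


def earth_movers_distance(img_a, img_b):
    """1-D earth mover's (Wasserstein-1) distance between pixel intensities."""
    a = np.asarray(img_a, dtype=np.float64).ravel()
    b = np.asarray(img_b, dtype=np.float64).ravel()
    return float(wasserstein_distance(a, b))
```

An image shares more information with itself than with unrelated noise, and its earth mover's distance to itself is zero, which gives a quick sanity check for either function.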
Training scripts:
Use train_aerialbooth_batch.py to perform optimization and generate the aerial-view image of a given input image.
Use train_aerialbooth_view.py to perform optimization and generate arbitrary text-controlled views of a given input image.
Computing the quantitative metrics:
Use eval_metrics_best_batch to compute the CLIP, SSCD and DINO scores of the generated images.
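Scores of this kind are commonly computed as the cosine similarity between feature embeddings (image and text embeddings for CLIP, pairs of image embeddings for SSCD and DINO). The sketch below shows only that final similarity step on precomputed vectors. Whether the evaluation script computes its scores this way is an assumption, and cosine_score is a hypothetical helper, not a function from this codebase.

```python
import numpy as np


def cosine_score(emb_a, emb_b, eps=1e-12):
    """Cosine similarity between two 1-D embedding vectors (hypothetical helper)."""
    a = np.asarray(emb_a, dtype=np.float64).ravel()
    b = np.asarray(emb_b, dtype=np.float64).ravel()
    # eps guards against division by zero for all-zero embeddings.
    denom = max(np.linalg.norm(a) * np.linalg.norm(b), eps)
    return float(np.dot(a, b) / denom)
```

In practice the two vectors would come from the respective feature extractor applied to the generated image and to its reference (the input image or the text prompt).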
Dependencies:
torch
cv2 (installed via the opencv-python package)
diffusers
numpy
scipy
accelerate
packaging
transformers
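A quick way to confirm that the modules listed above are importable before running the scripts is sketched below. It uses only the Python standard library. The helper name missing_modules is made up for this example, and the check does not verify package versions, since the list above does not pin any.

```python
from importlib.util import find_spec

# Import names taken from the list above.
REQUIRED = [
    "torch", "cv2", "diffusers", "numpy",
    "scipy", "accelerate", "packaging", "transformers",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All listed modules are importable.")
```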
Acknowledgement: This codebase borrows heavily from https://github.com/huggingface/diffusers/blob/main/examples/community/imagic_stable_diffusion.py.