
director-diffusion

Try the Director-Diffusion Gradio App →

Director-Diffusion is an open-source package for training Low-Rank Adaptations (LoRAs) of the FLUX.1-Krea-dev model to match the visual styles of famous directors. The directors covered here are Christopher Nolan, Martin Scorsese, Wes Anderson, Denis Villeneuve, and David Fincher, chosen for their distinctive styles and my personal affinity for their work, but the code is broadly applicable.

Each LoRA took ~11 H200-hours to train, and the training pipeline is optimized in a number of ways, including VAE caching, image interpolation, memory-efficient attention via xformers, torch.compile(), and cosine LR annealing (a minimal training-setup sketch follows the model list below). Captioning took around 45 H200-minutes (run in parallel). Models and model cards are linked below:

  • Nolan
  • Villeneuve
  • Anderson
  • Fincher
  • Scorsese
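
As a rough illustration of those training optimizations, here is a minimal, hypothetical setup sketch. It is not the repo's actual src.train; the LoRA injection, hyperparameters, and helper names are assumptions, and only the base model id is taken from this README.

```python
# Hypothetical training-setup sketch; the repo's actual src.train may differ.
# Model id is real; LoRA setup, hyperparameters, and helpers are assumptions.
import torch
from diffusers import FluxPipeline
from torch.optim.lr_scheduler import CosineAnnealingLR

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Krea-dev", torch_dtype=torch.bfloat16
).to("cuda")

# Memory-efficient attention via xformers and a compiled transformer.
pipe.enable_xformers_memory_efficient_attention()
pipe.transformer = torch.compile(pipe.transformer)

# Assume LoRA adapters have already been injected (e.g. via peft) and the
# base weights are frozen, so only adapter parameters require grad.
trainable = [p for p in pipe.transformer.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-4)

# Cosine LR annealing over the whole run (step count is an assumption).
scheduler = CosineAnnealingLR(optimizer, T_max=2000)

# VAE caching: encode each preprocessed image tensor (range [-1, 1]) to
# latents once up front, so the VAE is not re-run every epoch.
@torch.no_grad()
def cache_latents(image_tensors):
    return [
        pipe.vae.encode(x.to(pipe.device, pipe.dtype)).latent_dist.sample()
        for x in image_tensors
    ]
```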

Results

The following comparisons showcase the dramatic difference between the base Flux-Krea model and each director-specific LoRA. Each pair uses the same prompt to highlight the unique stylistic transformations.

Christopher Nolan

Prompt: "Two spy agents running up a stairwell, tense"

[Side-by-side images: base model output vs. Nolan LoRA output]

Denis Villeneuve

Prompt: "A vast desert landscape with mysterious structures"

[Side-by-side images: base model output vs. Villeneuve LoRA output]

Wes Anderson

Prompt: "A group of foxes and badgers in an underground tunnel looking down at a patient, in Wes Anderson's style"

[Side-by-side images: base model output vs. Anderson LoRA output]

David Fincher

Prompt: "Urban decay around detectives walking"

[Side-by-side images: base model output vs. Fincher LoRA output]

Martin Scorsese

Prompt: "A gritty street scene with dynamic camera angles"

[Side-by-side images: base model output vs. Scorsese LoRA output]

Roadmap

  • Image collection and data verification
  • Caption data and verify captioning script performance
  • Train single-director LoRAs
  • Train a multi-director LoRA
  • Serve the model via a Gradio app
  • Complete evaluation using metrics
  • Surface metrics from blind voting
  • Create an LCM distillation of each LoRA

Getting Started

To run the code, sync the uv environment with uv sync, then use the following commands (a local-inference sketch follows this list):

  • captioning: uv run modal run -m src.caption. You can then upload the images to remote Modal storage using modal volume.
  • train: uv run modal run -m src.train. You will likely want to add the --detach flag so the training run is not cut short by the local session length.
  • serve: uv run modal serve -m src.serve (for local dev).
  • deploy: uv run modal deploy -m src.serve.
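
For a quick local check of a trained LoRA outside Modal, a minimal diffusers sketch looks like the following. This is hypothetical: the LoRA repo id is a placeholder, and the served app in src.serve may work differently.

```python
# Hypothetical local-inference sketch; src.serve may differ.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Krea-dev", torch_dtype=torch.bfloat16
).to("cuda")

# Placeholder LoRA id: substitute the director LoRA you want to try.
pipe.load_lora_weights("your-username/director-diffusion-nolan")

image = pipe(
    "Two spy agents running up a stairwell, tense",
    num_inference_steps=28,
    guidance_scale=4.5,
).images[0]
image.save("nolan_stairwell.png")
```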

Uses

  • FLUX.1-Krea-dev (black-forest-labs/FLUX.1-Krea-dev) as the base model for training (chosen over FLUX.1-dev for its realism; I like this article for further reading)
  • uv for package management
  • ruff for code quality
  • ty for type checking
  • modal for infrastructure
  • shotdeck (https://shotdeck.com/welcome/home) for training stills and data (chosen over extracting frames with ffmpeg due to higher fidelity and a wider variety of shots)
  • Qwen2.5-VL-3B for image captioning (chosen over BLIP-2 and others for prompt faithfulness and its ability to perceive style; see the captioning sketch after this list)
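
As an illustration of the captioning step, here is a minimal, hypothetical Qwen2.5-VL sketch. The repo's src.caption and its Modal wiring may differ; the captioning prompt text is an assumption, and a recent transformers release with Qwen2.5-VL support is assumed.

```python
# Hypothetical captioning sketch; the repo's src.caption may differ.
import torch
from PIL import Image
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

MODEL_ID = "Qwen/Qwen2.5-VL-3B-Instruct"
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16
).to("cuda")
processor = AutoProcessor.from_pretrained(MODEL_ID)

def caption(image_path: str) -> str:
    image = Image.open(image_path)
    messages = [{
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this film still: lighting, "
                                     "color palette, composition, and mood."},
        ],
    }]
    prompt = processor.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = processor(
        text=[prompt], images=[image], return_tensors="pt"
    ).to(model.device)
    out = model.generate(**inputs, max_new_tokens=128)
    # Drop the prompt tokens, keep only the newly generated caption.
    new_tokens = out[:, inputs["input_ids"].shape[1]:]
    return processor.batch_decode(new_tokens, skip_special_tokens=True)[0]
```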

Dataset

Images were collected from shotdeck.com. Image counts for each director are listed below:

  • Anderson: 201 images
  • Fincher: 214 images
  • Nolan: 232 images
  • Scorsese: 215 images
  • Villeneuve: 197 images

Evaluation Results

  • CLIP Similarity: −0.3% average change; top performer: Villeneuve (a measurement sketch follows this list)
  • Aesthetic Score: +4.2% average improvement; top performer: Fincher
  • Style Score: +0.7% average improvement; top performer: Villeneuve
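
For context on the CLIP similarity metric, here is a minimal sketch of how a prompt-image CLIP score can be computed. The repo's evaluation code may use a different checkpoint, prompt set, or score normalization; the checkpoint below is an assumption.

```python
# Hypothetical CLIP prompt-image similarity sketch; the repo's evaluation
# may use a different checkpoint or normalization.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def clip_similarity(prompt: str, image_path: str) -> float:
    image = Image.open(image_path)
    inputs = processor(
        text=[prompt], images=[image], return_tensors="pt", padding=True
    )
    with torch.no_grad():
        outputs = model(**inputs)
    # Cosine similarity between the normalized text and image embeddings.
    text_emb = outputs.text_embeds / outputs.text_embeds.norm(dim=-1, keepdim=True)
    image_emb = outputs.image_embeds / outputs.image_embeds.norm(dim=-1, keepdim=True)
    return (text_emb @ image_emb.T).item()
```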

Questions?

If you see an error or bug, please open an issue on GitHub, or better yet, open a pull request! I welcome contributions. For feedback / ideas please email me at me@karansampath.com.

Acknowledgments

I'd like to thank Alec Powell and the team at Modal for their support of this work through GPU credits. Also, thank you to the team at Astral for their great open-source work!

Further Reading
