This repository contains the code for the papers "AI Killed the Video Star" and "Dimitra: Audio-driven Diffusion model for Expressive Talking Head Generation".
With this code you can animate a face image based on an audio sequence to generate a talking head video.
Follow these instructions to install the code. It requires an NVIDIA GPU with more than 8 GB of memory.
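To check how much memory your GPU has, you can for example run:
nvidia-smi --query-gpu=memory.total --format=csv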
git clone https://github.com/TashvikDhamija/dimitra.git
cd dimitra
pip install -r requirements.txt
cd Deep3DFaceRecon_pytorch
git clone -b 0.3.0 https://github.com/NVlabs/nvdiffrast
cd nvdiffrast
pip install .
cd ../../
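To check that nvdiffrast installed correctly, you can try importing its PyTorch bindings (this only verifies the import; the CUDA plugins are compiled lazily on first use):
python -c "import nvdiffrast.torch"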
mv utils.py YOURVENV/lib/python3.12/site-packages/realesrgan/
mv degradations.py YOURVENV/lib/python3.12/site-packages/basicsr/data/
Then download the weights from the link and copy the content into the directory.
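If you are unsure where your environment keeps these packages (the python3.12 component of the paths above depends on your Python version), the Location field reported by pip points at the right site-packages directory:
pip show realesrgan
pip show basicsr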
If there are issues with package versions, try:
pip install -r requirements_noversions.txt
To run a single time at 512×512 resolution, use:
python scripts/run_single.py --input_dir INPUTDIR --output_dir OUTPUTDIR
To run a single time at 256×256 resolution, use:
python scripts/run_single.py --input_dir INPUTDIR --output_dir OUTPUTDIR --res 256
To run a single time with VoxCeleb-style cropping (i.e. with the top of the head missing), run:
python scripts/run_single.py --input_dir INPUTDIR --output_dir OUTPUTDIR --res 256 --vox
To run a single time and clean the output video of artifacts (slower than normal generation), use:
python scripts/run_single.py --input_dir INPUTDIR --output_dir OUTPUTDIR --remove_artifacts
The results will be saved in the chosen output directory as Dimitra_output.mp4 (and Dimitra_output_cleaned.mp4 when removing artifacts).
In the input directory, the following configurations are valid (an example layout is shown after this list):
- 1 .mp4 file (video reconstruction from the audio)
- 2 .mp4 files (the first alphabetically will be used for identity and the second for audio)
- 1 .png file and 1 .wav file
- 1 .mp4 file and 1 .wav file
- 1 .png file and 1 .mp4 file
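For example, a hypothetical input directory for the image-plus-audio configuration (file names are illustrative):
INPUTDIR/
    identity.png   <- face image used as identity
    speech.wav     <- audio driving the generation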
To run inference on multiple videos, use (options are the same as above):
python scripts/run_multi.py --input_dir INPUTDIR --output_dir OUTPUTDIR
The output directory will have the same structure as the input directory. This supports the same configurations as above (in several subdirectories) in addition to the following ones (an example layout is shown after this list):
- more than 2 .mp4 files (reconstruction for an entire dataset)
- more than 1 .mp4 and 1 .png file (several audios, same identity)
- more than 1 .wav and 1 .png file (several audios, same identity)
- more than 1 .wav and 1 .mp4 file (several audios, same identity)
- more than 1 .mp4 and 1 .wav file (several identities, same audio)
- more than 1 .png and 1 .wav file (several identities, same audio)
- more than 1 .png and 1 .mp4 file (several identities, same audio)
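For example, a hypothetical input directory that animates one identity image with several audio files (names are illustrative):
INPUTDIR/
    identity.png
    interview.wav
    lecture.wav
    podcast.wav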
Training code coming soon
This code reuses code or parts of code provided by:
3DMM extraction: https://github.com/sicxu/Deep3DFaceRecon_pytorch
Video renderer: https://github.com/RenYurui/PIRender https://github.com/FuxiVirtualHuman/styletalk
Artifact removal: https://github.com/wzhouxiff/RestoreFormerPlusPlus
If you use our code, please cite:
@article{chopin2025dimitra,
  title={Dimitra: Audio-driven Diffusion model for Expressive Talking Head Generation},
  author={Chopin, Baptiste and Dhamija, Tashvik and Balaji, Pranav and Wang, Yaohui and Dantcheva, Antitza},
  journal={arXiv preprint arXiv:2502.17198},
  year={2025}
}