BrightGu/RLVC

RLVC

In this study, we depart from the reliance on extensive pre-trained models for feature representation or mutual information minimization for diverse feature decoupling. Instead, we revisit decoupling methods based on instance normalization. To achieve this, we introduce a novel feature coupling module named cross-adaptive instance normalization (CAIN), which extends the concept of adaptive instance normalization (AdaIN). Beyond offering style injection capabilities similar to AdaIN, CAIN is explicitly designed to maintain content consistency by reconstructing frame-level statistics in mel-spectrograms. The results indicate that CAIN, serving as a lightweight plugin, significantly improves conventional instance normalization-driven approaches. Building upon this, we introduce RLVC, which achieves robust performance with a mere 5.29M parameters. For the audio samples, please refer to our demo page.
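For readers unfamiliar with the baseline that CAIN extends, the following is a minimal sketch of standard adaptive instance normalization (AdaIN) on features laid out as [channels][frames]. It is not the CAIN module from this repository: the function names, the `eps` value, and the use of plain Python lists are assumptions made for illustration only.

```python
import math


def channel_stats(row, eps=1e-5):
    """Return (mean, std) of one channel over time; eps keeps std positive."""
    mu = sum(row) / len(row)
    var = sum((x - mu) ** 2 for x in row) / len(row)
    return mu, math.sqrt(var + eps)


def adain(content, style, eps=1e-5):
    """Standard AdaIN on features laid out as [channels][frames].

    Each content channel is normalized with its own mean and std, then
    rescaled and shifted with the statistics of the matching style channel.
    Content and style may have different numbers of frames.
    """
    if len(content) != len(style):
        raise ValueError("content and style need the same number of channels")
    out = []
    for c_row, s_row in zip(content, style):
        c_mu, c_sigma = channel_stats(c_row, eps)
        s_mu, s_sigma = channel_stats(s_row, eps)
        out.append([s_sigma * (x - c_mu) / c_sigma + s_mu for x in c_row])
    return out
```

After the call, each output channel keeps the temporal shape of the content channel but carries the mean and (approximately) the standard deviation of the style channel. The frame-level statistic reconstruction that CAIN adds is not shown here.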

Envs

Python 3.7 or later is required.

Install the dependencies with:

pip install -r requirements.txt

Vocoder

The HiFi-GAN vocoder converts log mel-spectrograms to waveforms. The vocoder model has 13.93M parameters and is trained on universal datasets. Set the path to the HiFi-GAN model in "./hifivoice/inference_e2e.py".

Infer

Download the pretrained model, then edit "./Modu/infer/infer_config.yaml". Organize the test samples as "wav22050/*.wav", then run:

python ./Modu/infer/infer_base_batch.py

Alternatively, see "./Modu/infer_samples.py" to run inference on source and target speech that you specify yourself.
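Before running inference, it may help to check that the test samples match the "wav22050/*.wav" layout. The sketch below assumes that the "22050" in the directory name refers to a 22050 Hz sampling rate, which this README does not state explicitly. The helper name `find_rate_mismatches` is hypothetical and not part of the repository, and the standard-library `wave` module only reads uncompressed PCM WAV files.

```python
import wave
from pathlib import Path

# Assumption: inferred from the "wav22050" directory name, not stated in the README.
EXPECTED_RATE = 22050


def find_rate_mismatches(wav_dir, expected_rate=EXPECTED_RATE):
    """Return {file name: sample rate} for WAV files not at expected_rate."""
    mismatches = {}
    for path in sorted(Path(wav_dir).glob("*.wav")):
        with wave.open(str(path), "rb") as wav_file:
            rate = wav_file.getframerate()
        if rate != expected_rate:
            mismatches[path.name] = rate
    return mismatches
```

An empty dictionary from `find_rate_mismatches("wav22050")` means every readable file is at the expected rate; any listed file would need resampling first.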

Train from scratch

Preprocessing

Organize the corpus as "VCTK22050/$figure$/*.wav", then edit "train_wav_dir" and "out_dir" in "./Modu/predata/robust_mels.py". The output file "figure_label_mel_map.pkl" is used for training.

python Modu/predata/robust_mels.py
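The corpus layout described above can be checked before preprocessing. This is a minimal sketch, assuming each "$figure$" placeholder corresponds to one subdirectory (presumably one speaker) holding that subdirectory's ".wav" files. The helper names `summarize_corpus` and `empty_subdirs` are hypothetical and not part of the repository.

```python
from pathlib import Path


def summarize_corpus(corpus_dir):
    """Map each subdirectory name to the number of .wav files it contains."""
    root = Path(corpus_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"corpus directory not found: {root}")
    counts = {}
    for sub in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[sub.name] = len(list(sub.glob("*.wav")))
    return counts


def empty_subdirs(counts):
    """Return the subdirectory names that contain no .wav files."""
    return [name for name, n in counts.items() if n == 0]
```

Calling `summarize_corpus("VCTK22050")` and passing the result to `empty_subdirs` gives a quick list of subdirectories that would contribute nothing to the preprocessing step.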

Training

In the config file "./Modu/config.yaml", set "test_wav_dir" to the evaluation corpus and "label_clip_mel_pkl" to the training corpus.

python Modu/solver.py
