RLVC

In this study, we move away from relying on large pre-trained models for feature representation or on mutual information minimization for feature decoupling. Instead, we revisit decoupling methods based on instance normalization. To this end, we introduce a novel feature coupling module, cross-adaptive instance normalization (CAIN), which extends adaptive instance normalization (AdaIN). Beyond offering the same style injection capability as AdaIN, CAIN is explicitly designed to maintain content consistency by reconstructing frame-level statistics in mel-spectrograms. Our results indicate that CAIN, as a lightweight plugin, significantly improves conventional approaches driven by instance normalization. Building on this, we introduce RLVC, which achieves robust performance with only 5.29M parameters. For audio samples, please refer to our demo page.
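For readers unfamiliar with AdaIN, the module that CAIN extends, the following is a minimal sketch of plain AdaIN on mel-spectrogram-like data. It is not the CAIN module and is not taken from this repository. The function name `adain`, the list-of-channels layout, and the `eps` value are assumptions made for illustration only.

```python
from statistics import mean, pstdev


def adain(content, style, eps=1e-5):
    """Plain AdaIN sketch (not CAIN, not this repository's code).

    content, style: list of channels (e.g. mel bins), each a list of
    per-frame values. Each content channel is normalized over time and
    then rescaled to the mean and standard deviation of the matching
    style channel.
    """
    out = []
    for c_row, s_row in zip(content, style):
        c_mu, c_sigma = mean(c_row), pstdev(c_row)
        s_mu, s_sigma = mean(s_row), pstdev(s_row)
        # eps (assumed value) guards against division by zero
        out.append([s_sigma * (v - c_mu) / (c_sigma + eps) + s_mu for v in c_row])
    return out


# Toy example: one channel, four content frames, three style frames.
content = [[1.0, 2.0, 3.0, 4.0]]
style = [[10.0, 20.0, 30.0]]
stylized = adain(content, style)
```

After the call, each output channel carries the style channel's mean and (up to `eps`) its standard deviation, while the relative shape of the content over time is preserved.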

Envs

Python 3.7+

You can install the dependencies with

pip install -r requirements.txt

Vocoder

The HiFi-GAN vocoder converts log mel-spectrograms to waveforms. The model has 13.93M parameters and is trained on universal datasets. Please set the path to the HiFi-GAN model in "./hifivoice/inference_e2e.py".

Infer

Download the pretrained model, then edit "./Modu/infer/infer_config.yaml". Test samples should be organized as "wav22050/*.wav".

python ./Modu/infer/infer_base_batch.py

Alternatively, use "./Modu/infer_samples.py" to run inference on source and target speech that you specify yourself.
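Before running inference, it can help to confirm that the test samples match the expected layout. The sketch below is not part of this repository. It assumes that the directory name "wav22050" implies a 22,050 Hz sample rate and that the files are PCM WAV, which is what the standard-library `wave` module can read. The function name `find_mismatched_wavs` is hypothetical.

```python
import wave
from pathlib import Path


def find_mismatched_wavs(wav_dir, expected_sr=22050):
    """Return (file name, sample rate) for each *.wav whose rate differs.

    expected_sr=22050 is an assumption inferred from the directory name
    "wav22050"; it is not stated explicitly in this README.
    """
    mismatched = []
    for path in sorted(Path(wav_dir).glob("*.wav")):
        with wave.open(str(path), "rb") as wf:
            sample_rate = wf.getframerate()
        if sample_rate != expected_sr:
            mismatched.append((path.name, sample_rate))
    return mismatched
```

Calling `find_mismatched_wavs("wav22050")` returns an empty list when every file already has the expected sample rate.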

Train from scratch

Preprocessing

Organize the corpus as "VCTK22050/$figure$/*.wav", then edit "train_wav_dir" and "out_dir" in "./Modu/predata/robust_mels.py". The output file "figure_label_mel_map.pkl" is used for training.

python Modu/predata/robust_mels.py
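To sanity-check the preprocessing output before training, one option is to load the pickle and look at its top-level type and size. This is a generic sketch: the internal structure of "figure_label_mel_map.pkl" is not documented here, so the code makes no assumption about it, and the helper name `describe_pickle` is hypothetical. Only unpickle files you generated yourself, since loading an untrusted pickle can execute arbitrary code.

```python
import pickle


def describe_pickle(path):
    """Load a pickle file and summarize its top-level object."""
    with open(path, "rb") as f:
        obj = pickle.load(f)
    summary = {"type": type(obj).__name__}
    # Containers such as dict or list also report their number of entries.
    if hasattr(obj, "__len__"):
        summary["len"] = len(obj)
    return summary
```

For example, `describe_pickle("figure_label_mel_map.pkl")`, with the path adjusted to your "out_dir", reports the container type and the number of entries it holds.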

Training

In the config file "./Modu/config.yaml", set "test_wav_dir" to the evaluation corpus and "label_clip_mel_pkl" to the training corpus.

python Modu/solver.py
