l_recon and l_recon0 never converge #101

@gkv856

Hi there,

First of all, thank you very much for publishing your work. I used it as the basis for my own use case; however, the loss stops decreasing once it reaches about 0.02. Could you please help?

I have been working on an AutoVC project, and I am stuck because my AutoVC model stops learning beyond a certain point. No matter what I try, I cannot get the loss below 0.02. See the training log below.

Started Batched Training...
Started Batched Training...
Epoch:[50/1000] .......... l_recon: 0.0666, l_recon0: 0.0427, l_content: 0.0000, 00:01:12
Epoch:[100/1000] .......... l_recon: 0.0671, l_recon0: 0.0438, l_content: 0.0000, 00:01:7
Reducing learning rate from '0.001' to '0.0005'
Epoch:[150/1000] .......... l_recon: 0.0502, l_recon0: 0.0279, l_content: 0.0000, 00:01:7
Epoch:[200/1000] .......... l_recon: 0.0500, l_recon0: 0.0274, l_content: 0.0000, 00:01:8
Reducing learning rate from '0.0005' to '0.00025'
Epoch:[250/1000] .......... l_recon: 0.0462, l_recon0: 0.0239, l_content: 0.0000, 00:01:7
Epoch:[300/1000] .......... l_recon: 0.0457, l_recon0: 0.0237, l_content: 0.0000, 00:01:7
Reducing learning rate from '0.00025' to '0.000125'
Model saved as 'ckpt_epoch_300.pth'
Epoch:[350/1000] .......... l_recon: 0.0448, l_recon0: 0.0228, l_content: 0.0000, 00:01:8
Epoch:[400/1000] .......... l_recon: 0.0447, l_recon0: 0.0228, l_content: 0.0000, 00:01:7
Fixing learning rate to '0.0001'
Epoch:[450/1000] .......... l_recon: 0.0449, l_recon0: 0.0229, l_content: 0.0000, 00:01:7
Epoch:[500/1000] .......... l_recon: 0.0448, l_recon0: 0.0228, l_content: 0.0000, 00:01:7
Epoch:[550/1000] .......... l_recon: 0.0455, l_recon0: 0.0234, l_content: 0.0000, 00:01:7
Epoch:[600/1000] .......... l_recon: 0.0450, l_recon0: 0.0230, l_content: 0.0000, 00:01:7
Model saved as 'ckpt_epoch_600.pth'
Epoch:[650/1000] .......... l_recon: 0.0447, l_recon0: 0.0228, l_content: 0.0000, 00:01:8
Epoch:[700/1000] .......... l_recon: 0.0452, l_recon0: 0.0231, l_content: 0.0000, 00:01:7
Epoch:[750/1000] .......... l_recon: 0.0446, l_recon0: 0.0227, l_content: 0.0000, 00:01:7
Epoch:[800/1000] .......... l_recon: 0.0453, l_recon0: 0.0233, l_content: 0.0000, 00:01:7
Epoch:[850/1000] .......... l_recon: 0.0451, l_recon0: 0.0230, l_content: 0.0000, 00:01:7
Epoch:[900/1000] .......... l_recon: 0.0457, l_recon0: 0.0236, l_content: 0.0000, 00:01:8
Model saved as 'ckpt_epoch_900.pth'
Epoch:[950/1000] .......... l_recon: 0.0455, l_recon0: 0.0233, l_content: 0.0000, 00:01:8
Epoch:[1000/1000] .......... l_recon: 0.0444, l_recon0: 0.0226, l_content: 0.0000, 00:01:7
Model saved as 'final_1000.pth'
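
For reference, the learning-rate schedule that the log messages above imply can be sketched as below. This is a reconstruction from the printed messages only (halve every 100 epochs, then hold at 0.0001), not the code from the repository; the function name `lr_after_epoch` and its parameters are hypothetical.

```python
def lr_after_epoch(epoch, base_lr=1e-3, step=100, floor=1e-4):
    """Learning rate in effect after `epoch` completed epochs.

    Assumption (inferred from the log, not taken from the repo):
    the rate is halved every `step` epochs and never goes below `floor`.
    """
    halvings = epoch // step
    return max(base_lr * 0.5 ** halvings, floor)


# Print the rate at the epochs where the log reports a change.
for epoch in (0, 100, 200, 300, 400, 1000):
    print(epoch, lr_after_epoch(epoch))
```

Under this reading, the rate stays at 0.0001 from epoch 400 onwards, which is also where l_recon and l_recon0 stop moving in the log.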

Because of this, the final mel-spectrogram is not as expected, and so the audio converted from that mel-spectrogram is not as expected either. Could you please help me?

Github - https://github.com/gkv856/end2end_auto_voice_conversion
Colab - https://github.com/gkv856/KaggleData/blob/main/AutoVC_training_.ipynb

I have set up the Colab notebook so that it needs almost no editing. After the AutoVC training, simply update the following line with the correct model path: 'hp.m_avc.gen.best_model_path = "/content/AVC/static/model_chk_pts/autovc/final_1000.pth"'

Here are the original and predicted mel-spectrograms. The predicted one is clearly not what it should be. The left one is from the original audio, and the right one is the postnet's prediction.

Capture
