Hi There,
First of all, thank you very much for putting out your work. I used it to derive my own use-case; however, the loss never goes down once it reaches about 0.02, and I was hoping you could help.
I have been working on an AutoVC project and am stuck because my AutoVC model stops learning beyond a point. No matter how hard I try, I cannot get the loss below 0.02. See the training log below.
Started Batched Training...
Epoch:[50/1000] .......... l_recon: 0.0666, l_recon0: 0.0427, l_content: 0.0000, 00:01:12
Epoch:[100/1000] .......... l_recon: 0.0671, l_recon0: 0.0438, l_content: 0.0000, 00:01:7
Reducing learning rate from '0.001' to '0.0005'
Epoch:[150/1000] .......... l_recon: 0.0502, l_recon0: 0.0279, l_content: 0.0000, 00:01:7
Epoch:[200/1000] .......... l_recon: 0.0500, l_recon0: 0.0274, l_content: 0.0000, 00:01:8
Reducing learning rate from '0.0005' to '0.00025'
Epoch:[250/1000] .......... l_recon: 0.0462, l_recon0: 0.0239, l_content: 0.0000, 00:01:7
Epoch:[300/1000] .......... l_recon: 0.0457, l_recon0: 0.0237, l_content: 0.0000, 00:01:7
Reducing learning rate from '0.00025' to '0.000125'
Model saved as 'ckpt_epoch_300.pth'
Epoch:[350/1000] .......... l_recon: 0.0448, l_recon0: 0.0228, l_content: 0.0000, 00:01:8
Epoch:[400/1000] .......... l_recon: 0.0447, l_recon0: 0.0228, l_content: 0.0000, 00:01:7
Fixing learning rate to '0.0001'
Epoch:[450/1000] .......... l_recon: 0.0449, l_recon0: 0.0229, l_content: 0.0000, 00:01:7
Epoch:[500/1000] .......... l_recon: 0.0448, l_recon0: 0.0228, l_content: 0.0000, 00:01:7
Epoch:[550/1000] .......... l_recon: 0.0455, l_recon0: 0.0234, l_content: 0.0000, 00:01:7
Epoch:[600/1000] .......... l_recon: 0.0450, l_recon0: 0.0230, l_content: 0.0000, 00:01:7
Model saved as 'ckpt_epoch_600.pth'
Epoch:[650/1000] .......... l_recon: 0.0447, l_recon0: 0.0228, l_content: 0.0000, 00:01:8
Epoch:[700/1000] .......... l_recon: 0.0452, l_recon0: 0.0231, l_content: 0.0000, 00:01:7
Epoch:[750/1000] .......... l_recon: 0.0446, l_recon0: 0.0227, l_content: 0.0000, 00:01:7
Epoch:[800/1000] .......... l_recon: 0.0453, l_recon0: 0.0233, l_content: 0.0000, 00:01:7
Epoch:[850/1000] .......... l_recon: 0.0451, l_recon0: 0.0230, l_content: 0.0000, 00:01:7
Epoch:[900/1000] .......... l_recon: 0.0457, l_recon0: 0.0236, l_content: 0.0000, 00:01:8
Model saved as 'ckpt_epoch_900.pth'
Epoch:[950/1000] .......... l_recon: 0.0455, l_recon0: 0.0233, l_content: 0.0000, 00:01:8
Epoch:[1000/1000] .......... l_recon: 0.0444, l_recon0: 0.0226, l_content: 0.0000, 00:01:7
Model saved as 'final_1000.pth'
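For context, this is roughly how the three loss terms above are computed in my training step (a minimal sketch following the standard AutoVC formulation; the exact mapping of l_recon/l_recon0 to the two decoder outputs, and the names model, x_real, emb_org, lambda_content, optimizer, are illustrative rather than my exact code):

```python
import torch.nn.functional as F

# One generator step (sketch). x_real: input mel batch, emb_org: speaker embedding batch.
x_identic, x_identic_psnt, code_real = model(x_real, emb_org, emb_org)

l_recon0 = F.mse_loss(x_real, x_identic)      # decoder output, before the postnet
l_recon = F.mse_loss(x_real, x_identic_psnt)  # final output, after the postnet

# Content consistency: re-encode the reconstruction and compare content codes
code_reconst = model(x_identic_psnt, emb_org, None)
l_content = F.l1_loss(code_real, code_reconst)

loss = l_recon + l_recon0 + lambda_content * l_content
optimizer.zero_grad()
loss.backward()
optimizer.step()
```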
Because the loss plateaus, the final mel-spectrogram is not as expected, and consequently the audio converted from that mel is not as expected either. Could you please help me?
Github - https://github.com/gkv856/end2end_auto_voice_conversion
Colab - https://github.com/gkv856/KaggleData/blob/main/AutoVC_training_.ipynb
I have set up the Colab notebook so that you don't have to edit it. Simply edit the following line with the correct model path after the AutoVC training: 'hp.m_avc.gen.best_model_path = "/content/AVC/static/model_chk_pts/autovc/final_1000.pth"'
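For example, something like this (a sketch; the checkpoint layout and the generator variable are assumptions, and the notebook's actual loading code may differ):

```python
import torch

# Point the inference step at the trained checkpoint from the run above
hp.m_avc.gen.best_model_path = "/content/AVC/static/model_chk_pts/autovc/final_1000.pth"

# Illustrative load, assuming the checkpoint holds the generator's state_dict
checkpoint = torch.load(hp.m_avc.gen.best_model_path, map_location="cpu")
state_dict = checkpoint.get("model_state_dict", checkpoint) if isinstance(checkpoint, dict) else checkpoint
generator.load_state_dict(state_dict)
generator.eval()
```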
Here are the original and predicted mel-spectrograms. The predicted one is definitely not the way it should be. The left one is from the original audio used, and the right one is the postnet's prediction.
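The side-by-side plots were generated with something along these lines (a sketch; mel_original and mel_postnet are illustrative names for the two arrays, shape (n_mels, frames)):

```python
import matplotlib.pyplot as plt

# mel_original: mel-spectrogram of the input audio
# mel_postnet:  mel-spectrogram predicted by the postnet
fig, axes = plt.subplots(1, 2, figsize=(12, 4))
axes[0].imshow(mel_original, aspect="auto", origin="lower")
axes[0].set_title("Original mel-spectrogram")
axes[1].imshow(mel_postnet, aspect="auto", origin="lower")
axes[1].set_title("Postnet prediction")
for ax in axes:
    ax.set_xlabel("Frames")
    ax.set_ylabel("Mel bins")
plt.tight_layout()
plt.show()
```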