Description
I followed the tutorial and ran the inference process using "pretrained_models/gopt_librispeech/best_audio_model.pth". Then I obtained u1, u2, u3, u4, u5, p, w1, w2, w3.
I noticed that among the 50 values in the w1 tensor, those corresponding to actual phonemes are very small (approximately between -0. and 0.3).
Why is this? It looks abnormal compared to u1, u2, u3, u4, u5, p, w2, and w3. (I saw a similar pattern for w1 in other issues where people posted their inference results.)
My test input is 000010035.WAV from speechocean762. The text is "zero three five one", which has 13 phonemes.
u1, u2, u3, u4, u5: tensor([[1.7833]]) tensor([[1.5154]]) tensor([[1.7655]]) tensor([[1.7357]]) tensor([[1.7685]])
p: tensor([[[1.1436],
[1.2331],
[1.1196],
[1.2038],
[1.2537],
[1.2149],
[1.2063],
[1.1754],
[1.1583],
[1.1923],
[1.2894],
[1.2024],
[1.1669],
[0.9675],
[1.0724],
[0.9618],
[0.9086],
[0.9618],
[0.8737],
[1.0742],
[0.9577],
[0.8407],
[0.9326],
[0.9799],
[0.9922],
[0.9353],
[0.9699],
[0.9905],
[0.9040],
[1.0046],
[0.7688],
[0.9281],
[0.8746],
[0.7709],
[0.9233],
[0.9442],
[0.8750],
[0.8773],
[1.0039],
[0.9788],
[0.9340],
[0.9868],
[0.9692],
[0.9557],
[0.9683],
[0.9823],
[0.9354],
[0.8778],
[1.0031],
[0.9373]]])
w1, w2, w3: tensor([[[0.1691],
[0.2397],
[0.1007],
[0.1991],
[0.3733],
[0.1940],
[0.2208],
[0.0934],
[0.0976],
[0.1821],
[0.2391],
[0.1987],
[0.1470],
[1.1735],
[1.2703],
[1.1686],
[1.1330],
[1.1637],
[1.0912],
[1.2776],
[1.1699],
[1.0748],
[1.1293],
[1.1984],
[1.1910],
[1.1172],
[1.1902],
[1.2180],
[1.1366],
[1.1940],
[1.0083],
[1.1308],
[1.0939],
[1.0065],
[1.1273],
[1.1631],
[1.1033],
[1.1066],
[1.2187],
[1.1881],
[1.1629],
[1.1910],
[1.1909],
[1.1718],
[1.1821],
[1.1856],
[1.1549],
[1.0991],
[1.1983],
[1.1454]]])
tensor([[[0.7173],
[0.7828],
[0.6683],
[0.7069],
[0.9740],
[0.7527],
[0.8086],
[0.6274],
[0.6307],
[0.7377],
[0.7798],
[0.7304],
[0.7170],
[1.1944],
[1.2968],
[1.2026],
[1.1574],
[1.2027],
[1.1562],
[1.3399],
[1.2438],
[1.0881],
[1.1578],
[1.2435],
[1.2504],
[1.2036],
[1.2393],
[1.2477],
[1.1619],
[1.2599],
[1.0630],
[1.2488],
[1.1355],
[1.0787],
[1.1603],
[1.1924],
[1.1407],
[1.1515],
[1.2306],
[1.1980],
[1.1690],
[1.1929],
[1.1946],
[1.1743],
[1.2053],
[1.2255],
[1.1787],
[1.1014],
[1.2296],
[1.1742]]])
tensor([[[1.0399],
[1.0698],
[0.9834],
[1.0336],
[1.1513],
[1.0586],
[1.0634],
[0.9907],
[0.9838],
[1.0626],
[1.1008],
[1.0480],
[1.0154],
[1.2130],
[1.3040],
[1.1792],
[1.1489],
[1.2104],
[1.0891],
[1.3223],
[1.2185],
[1.0932],
[1.1865],
[1.2409],
[1.2331],
[1.2000],
[1.2145],
[1.2550],
[1.1831],
[1.2687],
[1.0803],
[1.2122],
[1.1373],
[1.0713],
[1.1762],
[1.1943],
[1.1675],
[1.1700],
[1.2843],
[1.2375],
[1.1839],
[1.2346],
[1.2291],
[1.2178],
[1.2108],
[1.2471],
[1.1855],
[1.1430],
[1.2604],
[1.1935]]])
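To make the comparison concrete, here is a minimal sketch of how the w1 values can be split into the positions for actual phonemes and the remaining positions. It assumes the first 13 of the 50 positions correspond to the 13 phonemes and that the rest are padding; that layout is my assumption, not something confirmed by the repo. The helper name `split_by_phoneme_count` is hypothetical, and the list is only a hand-copied excerpt of the first 16 w1 values above.

```python
from statistics import mean


def split_by_phoneme_count(scores, n_phonemes):
    """Split a flat list of per-position scores into (real, padded) parts.

    Assumes the first n_phonemes entries belong to actual phonemes and
    everything after that is padding (an assumption, see the lead-in).
    """
    return scores[:n_phonemes], scores[n_phonemes:]


# Excerpt: the first 16 of the 50 w1 values printed above, flattened by hand.
w1_excerpt = [
    0.1691, 0.2397, 0.1007, 0.1991, 0.3733, 0.1940, 0.2208,
    0.0934, 0.0976, 0.1821, 0.2391, 0.1987, 0.1470,  # 13 phoneme positions
    1.1735, 1.2703, 1.1686,                          # assumed padding
]

real, padded = split_by_phoneme_count(w1_excerpt, n_phonemes=13)

print(f"real:   min={min(real):.4f} max={max(real):.4f} mean={mean(real):.4f}")
print(f"padded: min={min(padded):.4f} max={max(padded):.4f}")
```

With the actual output tensor, something like `w1.flatten().tolist()` should give the flat list to pass in. In this excerpt every value in the first 13 positions is lower than every value after them, which is the gap I am asking about.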
Other issues that show similar w1 results:
https://github.com/YuanGongND/gopt/issues/31
https://github.com/YuanGongND/gopt/issues/11