Question about params in MTW configuration #25

Open
DatGuy1 opened this issue Nov 22, 2020 · 2 comments

Comments

DatGuy1 (Contributor) commented Nov 22, 2020

I've read the paper and tried to understand the source code of the original repo, but I can't for the life of me figure out what to set these lines to. I trained a multispeaker model at 22050 Hz with the rest of the hparams you'd expect for that sampling rate, but I set num_mels to 160, so I can't use the pretrained HiFi-GAN models.

Creating and training a new HiFi-GAN model works well enough: in the TensorBoard logs the validation spectrograms look good and the audio is fairly close to ground truth. But whenever I try to infer speech, the result is indecipherable audio. What should I set these parameters to in order to fix this?
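
(Aside, as a hedged sketch rather than anything from this thread: one common cause of garbled vocoder output despite good-looking validation results is a mismatch between the mel-extraction settings the acoustic model was trained with and the ones in the vocoder's training config. The snippet below assumes the upstream HiFi-GAN `config.json` field names and the NVIDIA Tacotron2 `hparams.py` names; the exact names in this fork may differ.)

```python
# Hypothetical sanity check: compare the HiFi-GAN mel settings against the
# values used by the acoustic model. Field names are assumptions based on the
# upstream HiFi-GAN config.json and NVIDIA Tacotron2 hparams.
import json

# (hifigan_key, tacotron_key) pairs that must agree for inference to work.
PAIRS = [
    ("num_mels",      "n_mel_channels"),   # e.g. 160 here instead of the pretrained 80
    ("sampling_rate", "sampling_rate"),    # e.g. 22050 Hz
    ("n_fft",         "filter_length"),
    ("hop_size",      "hop_length"),
    ("win_size",      "win_length"),
    ("fmin",          "mel_fmin"),
    ("fmax",          "mel_fmax"),
]

with open("config.json") as f:             # HiFi-GAN training config
    hifigan = json.load(f)

tacotron = {                               # values copied from tacotron2 hparams.py
    "n_mel_channels": 160,
    "sampling_rate": 22050,
    "filter_length": 1024,
    "hop_length": 256,
    "win_length": 1024,
    "mel_fmin": 0.0,
    "mel_fmax": 8000.0,
}

for hg_key, tt_key in PAIRS:
    if hifigan.get(hg_key) != tacotron[tt_key]:
        print(f"Mismatch: {hg_key}={hifigan.get(hg_key)} vs {tt_key}={tacotron[tt_key]}")
```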

CookiePPP (Owner) commented Nov 22, 2020

The defaults should be good. Can you give me an end-to-end example of how you try to infer speech (e.g. audio files showing how it failed)?
If you're using inference.py, give me an example file.
If you're using Tacotron2, give me your Tacotron2 hparams.py.

DatGuy1 (Contributor, Author) commented Nov 23, 2020
