
Is there any experiment on a Chinese dataset? #91

Open
zhaojingxin123 opened this issue Aug 8, 2024 · 6 comments

Comments
@zhaojingxin123

May I ask whether there has been any experiment on a Chinese dataset? When I use pinyin as the phonemes to train on a Mandarin Chinese dataset, everything I synthesize is noise. Why might that be?

@zhaojingxin123
Author

Does anyone know why that is? Or is there a Chinese dataset on which training has succeeded? What methods do you use to phonemize Chinese text?

@shivammehta25
Owner

I am sorry, I haven't trained on a Chinese dataset, but I can assure you that the model training is language independent. There are forks in Kyrgyz (https://github.com/UlutSoftLLC/MamtilTTS) and Catalan (https://huggingface.co/projecte-aina/matxa-tts-cat-multiaccent). So perhaps someone who has trained on a Chinese dataset can chip in to the conversation.

Just to confirm, did you see this page? https://github.com/shivammehta25/Matcha-TTS/wiki/Training-%F0%9F%8D%B5-Matcha%E2%80%90TTS-with-different-dataset-&-languages

@zhaojingxin123
Author

Hello author, thank you for your answer!
I am sorry for the late reply; I have been ill recently and had not seen your message.
There should be no big problems with your code and model. It was because I used a wrong way of encoding Chinese into phonemes. Your project can indeed be applied to Chinese, but the wavs generated by the model I trained were very noisy.

I trained the model on the Chinese dataset AISHELL3 for 119 epochs, with poor results.

My config is:
[two screenshots of the config were attached]

What do you think is the reason?
1. The number of epochs trained is not enough?
2. The number of speakers (174) is too large?
3. Each speaker's data is not enough?
4. The n_vocab of 50 symbols — does it have any influence?

How can I improve the synthesis?
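One quick sanity check related to point 4 (a minimal sketch, not code from this thread; the symbol list and `n_vocab` value here are hypothetical placeholders) is to confirm that the configured `n_vocab` covers every symbol the text cleaner can emit. If phoneme IDs fall outside the embedding table, or unknown symbols are silently mapped, training can converge to noise.

```python
# Hypothetical sketch: verify n_vocab against the cleaner's symbol table.
symbols = ["_", " "] + [f"p{i}" for i in range(48)]  # 50 placeholder symbols
symbol_to_id = {s: i for i, s in enumerate(symbols)}

n_vocab = 50  # value from the (hypothetical) training config


def text_to_ids(phonemes):
    """Map a phoneme sequence to IDs, failing loudly on unknown symbols."""
    ids = []
    for p in phonemes:
        if p not in symbol_to_id:
            raise ValueError(f"symbol {p!r} not in symbol table")
        ids.append(symbol_to_id[p])
    return ids


assert len(symbols) <= n_vocab, "n_vocab is smaller than the symbol set"
ids = text_to_ids(["_", "p0", "p47"])
```

Running a check like this over the whole training transcript before training would catch a mismatch between the phonemizer's output and the symbol table early.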

@shivammehta25
Owner

I think the dataset size and amount of training should be enough.

> 4. The n_vocab of 50 symbols — does it have any influence?

Do you really have only 50 symbols? I feel something might be wrong here. What phonemizer are you using?

@zhaojingxin123
Author

Thank you for your answer, shivammehta25.
It's not the International Phonetic Alphabet (IPA), but rather Taiwanese Pinyin, a Chinese phoneme scheme with 50 symbols. The model I trained with AISHELL3 produces a bit of human voice but also a lot of noise. Previously, I used the Mainland Chinese version of Pinyin, another form of Chinese phonetic notation with over 200 symbols. I suspect the issue might be due to wrong processing, specifically of the Mainland Chinese version of Pinyin.
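For context on how a ~50-symbol inventory can arise: a toned pinyin syllable can be decomposed into initial + final + tone, giving a much smaller symbol set than treating every whole syllable as one token. Below is a minimal illustrative sketch (not the poster's actual code) using a tiny hypothetical initial table, not a full Mandarin inventory.

```python
# Illustrative sketch: split numeric-tone pinyin syllables into
# (initial, final, tone). The initial table is a small hypothetical
# excerpt, ordered longest-match first so "zh" wins over "z"-like prefixes.
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "n", "h"]


def split_syllable(syl):
    """'ni3' -> ('n', 'i', '3'); syllables with no initial keep '' there."""
    tone = syl[-1] if syl[-1].isdigit() else ""
    body = syl[:-1] if tone else syl
    for ini in INITIALS:
        if body.startswith(ini):
            return ini, body[len(ini):], tone
    return "", body, tone


print(split_syllable("ni3"))     # ('n', 'i', '3')
print(split_syllable("hao3"))    # ('h', 'ao', '3')
print(split_syllable("zhong1"))  # ('zh', 'ong', '1')
```

With a decomposition like this, the symbol set is the union of initials, finals, and tone marks, which for Mandarin lands in the tens of symbols rather than hundreds of whole syllables.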

@TheWayLost

> It's not the International Phonetic Alphabet (IPA), but rather Taiwanese Pinyin, a Chinese phoneme scheme with 50 symbols. [...] I suspect the issue might be due to wrong processing, specifically of the Mainland Chinese version of Pinyin.

So do you mean that using the Mainland Chinese version of Pinyin instead would avoid this problem? (I am also trying to use the method on a Chinese dataset; I think this method is truly interesting.)
