Bug Report

Description

When setting text_frontend=True (or leaving it as the default), the < and > tags are removed from the text during inference.
For example:
INFO synthesis text 这也strong太strong离谱了吧!
To Reproduce

from cosyvoice.cli.cosyvoice import CosyVoice, CosyVoice2
from cosyvoice.utils.file_utils import load_wav
import torchaudio

# Initialize the CosyVoice2 model
cosyvoice = CosyVoice2(
    'pretrained_models/CosyVoice2-0.5B',
    load_jit=True,
    load_onnx=False,
    load_trt=False
)

audio_file_path = 'audio/48k.wav'
prompt_speech_16k = load_wav(audio_file_path, 16000)

for i, j in enumerate(cosyvoice.inference_cross_lingual(
    '这也<strong>太</strong>离谱了吧!',
    prompt_speech_16k,
    stream=False
)):
    torchaudio.save(
        'fine_grained_control_{}.wav'.format(i),
        j['tts_speech'],
        cosyvoice.sample_rate
    )
Expected Behavior

The inference should preserve the < and > tags as part of the input text:

这也<strong>太</strong>离谱了吧!

Actual Behavior

The tags < and > are removed, resulting in:

这也strong太strong离谱了吧
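For illustration only, a hypothetical Python snippet (not the repository's actual normalization code) showing how a punctuation-stripping normalization pass would produce exactly the string seen in the log, including the loss of the '/' and '!' characters:

import re

raw = '这也<strong>太</strong>离谱了吧!'

# Hypothetical illustration: if normalization drops ASCII punctuation such as
# '<', '>', '/' and '!', the fine-grained control markup collapses into plain text.
stripped = re.sub(r'[<>/!]', '', raw)
print(stripped)  # 这也strong太strong离谱了吧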
Environment

Please provide details about your environment:
conda
Interesting .. maybe it's just a display thing, as it does work - on cross-lingual, that is.
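One way to check whether only the log is affected is to inspect what the text front end actually returns before synthesis. A hedged sketch, assuming the frontend object exposes a text_normalize(text, split=True) method as in cosyvoice/cli/frontend.py, and reusing the cosyvoice instance from the reproduction script above:

# Print the segments the front end would hand to the model.
# If '<' and '>' are already missing here, the tags are stripped before
# inference, not merely in the log message.
for segment in cosyvoice.frontend.text_normalize('这也<strong>太</strong>离谱了吧!', split=True):
    print(repr(segment))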
@darkacorn I reinstalled CosyVoice2 step by step on a new Ubuntu-based machine, and I got the log below.
2024-12-20 15:17:49,499 INFO synthesis text 这也<strong>太</strong>。
For the actual inference, the model uses the post-processed string, i.e. the incorrect one, and therefore outputs the wrong audio.
The reason for reinstalling was to check whether WeTextProcessing works properly, and it handles the sentence above fine.
Anyway, I will set text_frontend to False and leave this issue open for a while to see if anyone else runs into the same problem.
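For reference, a minimal sketch of that workaround, assuming the inference methods accept a text_frontend keyword as the comments above suggest, and reusing the cosyvoice object and prompt_speech_16k from the reproduction script:

# Skip the text front end so the '<' and '>' control tags reach the model unchanged.
for i, j in enumerate(cosyvoice.inference_cross_lingual(
    '这也<strong>太</strong>离谱了吧!',
    prompt_speech_16k,
    stream=False,
    text_frontend=False  # assumed keyword; disables WeTextProcessing normalization
)):
    torchaudio.save(
        'fine_grained_control_{}.wav'.format(i),
        j['tts_speech'],
        cosyvoice.sample_rate
    )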