bug with emoji as bookmark, Javascript SDK, eastAsia #326
Comments
Hi @branaway, you can try using speechConfig.SetProperty(PropertyId.SpeechServiceResponse_RequestPunctuationBoundary, "false"); to control the output of punctuation marks. When it is set to false, punctuation marks are no longer output. It is currently enabled by default, which is why an extra punctuation mark appears here.
Well, the '>' is part of the tag, not a punctuation mark.
The GitHub comment system removed the '<' bookmark '>' tag.
I was using a bookmark tag in my test, and the bookmark name is an emoji 😁, which resulted in the closing right angle bracket being treated as part of the audible text.
Is your input for a real product? Why do you use bookmark and word boundary events together? Can you use just one of them? The word boundary event is the better choice if both work for you. A bookmark may change the readout if you put it in an improper place (for example, in the middle of a word).
In fact, the emojis are generated by an AI chatbot. I transform the embedded emojis into bookmarks (with the emojis as the 'mark' attributes) and pick them up in the time series of words while playing back the audio stream. I tend to think the server did not process the emojis well, since they may take extra bytes of space and cause misaligned word boundaries.
BTW, the server would "read" the meaning of each emoji if they were embedded in the text stream verbatim, which is something I'd rather process myself.
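The emoji-to-bookmark transformation described above could look roughly like the following. This is a minimal sketch, not the poster's actual code: the helper names `escapeXml` and `emojisToBookmarks` are hypothetical, and it assumes that matching single `Extended_Pictographic` code points is good enough (multi-code-point emoji sequences such as ZWJ families or skin-tone variants are not handled).

```javascript
// Hypothetical helper: escape characters that are special in XML text.
function escapeXml(s) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&apos;" };
  return s.replace(/[&<>"']/g, (c) => map[c]);
}

// Hypothetical helper: replace each emoji with an SSML bookmark whose
// mark attribute carries the emoji, so the emoji itself is not read aloud.
function emojisToBookmarks(text) {
  const emoji = /\p{Extended_Pictographic}/gu; // assumption: single code points only
  let out = "";
  let last = 0;
  for (const m of text.matchAll(emoji)) {
    out += escapeXml(text.slice(last, m.index)); // plain text before the emoji
    out += `<bookmark mark="${m[0]}"/>`;
    last = m.index + m[0].length; // length is in UTF-16 code units
  }
  return out + escapeXml(text.slice(last));
}

console.log(emojisToBookmarks("Hi 😁 there"));
// prints: Hi <bookmark mark="😁"/> there
```

The result is only the inner text; it would still need to be wrapped in a speak element before being sent to the service.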
So, you would like to get a bookmark event for each emoji. I guess you do some post-processing, like replacing them with sound effects, right?
In my case, I keep track of the bookmarks and the word stream and display the proper visual effects at the right time. I was suggesting that the developers review the code that handles the bookmark tag, particularly when it contains emojis.
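Keeping bookmarks and words on one timeline, as described above, might be sketched like this. It is an illustration under assumptions rather than the poster's implementation: `createTimeline` is a hypothetical helper, and it assumes each event carries an audio offset in 100-nanosecond ticks, which is how the Speech SDK's bookmark and word boundary events are understood to report `audioOffset`.

```javascript
const TICKS_PER_MS = 10000; // assumption: 1 tick = 100 ns

// Hypothetical helper: collects bookmark and word events and hands them
// back once the audio playback position has reached them.
function createTimeline() {
  const events = [];
  return {
    // kind is "word" or "bookmark"; text is the word or the mark value.
    add(kind, text, audioOffsetTicks) {
      events.push({ kind, text, timeMs: audioOffsetTicks / TICKS_PER_MS });
    },
    // Returns (and removes) every event due at or before playbackMs.
    due(playbackMs) {
      events.sort((a, b) => a.timeMs - b.timeMs);
      const ready = [];
      while (events.length > 0 && events[0].timeMs <= playbackMs) {
        ready.push(events.shift());
      }
      return ready;
    },
  };
}

const timeline = createTimeline();
timeline.add("word", "hello", 500000);    // 50 ms
timeline.add("bookmark", "😁", 1000000);  // 100 ms
timeline.add("word", "world", 2000000);   // 200 ms
```

In a real client, `add` would be called from the synthesizer's bookmark and word boundary callbacks, and `due` from the audio player's progress callback to trigger the visual effects.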
I can reproduce the issue you reported with the SSML below. We will investigate on the service side.
For Unicode characters at U+10000 and above, the word boundary offset returned by the TTS service is wrong. This is not limited to bookmarks; it affects all SSML.
<speak version='1.0' xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang='en-US'>
物理学 <bookmark mark="😀" /> <break time="1750ms" />已经从任何事物都是“如露亦如电,应作如是观”这个方向往佛学的境界上又靠近一步了。世界上可能存在着类似灵魂的东西,它在人生结束之后不死,只是回到宇宙中的某个地方去了。这种观念跟唯识的根本-阿赖耶识学说是相一致的。
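The thread does not say how the service computes the wrong offsets. One common way such a mismatch can arise, offered here only as an assumption, is that one side counts Unicode code points while the other counts UTF-16 code units: JavaScript strings are indexed in UTF-16 code units, and any character at U+10000 or above occupies two of them. The hypothetical helper `codePointToUtf16Offset` below shows how far the two counts drift apart after a single emoji.

```javascript
// Hypothetical helper: convert an offset counted in Unicode code points
// into the equivalent offset in UTF-16 code units (JavaScript string index).
function codePointToUtf16Offset(text, codePointOffset) {
  let units = 0;
  let seen = 0;
  for (const ch of text) {      // string iteration goes code point by code point
    if (seen === codePointOffset) break;
    units += ch.length;         // 1 for BMP characters, 2 for U+10000 and above
    seen += 1;
  }
  return units;
}

const sample = "a😀b";
console.log(sample.length);                     // prints: 4 (UTF-16 code units)
console.log([...sample].length);                // prints: 3 (code points)
console.log(codePointToUtf16Offset(sample, 2)); // prints: 3 (index of "b")
```

If the offsets really are off by one unit per supplementary character, a client-side correction along these lines could serve as a workaround until the service is fixed, but that would need to be verified against actual word boundary events.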