Word cloud for different languages #2194

Open
MVsevolod opened this issue Jul 3, 2020 · 2 comments
Comments

@MVsevolod

It would be a nice feature to parse not only English but also other languages for the word cloud statistics.

builder-247 transferred this issue from odota/parser Jul 3, 2020
@howardchung
Member

I think we restricted it because if we add, e.g., Chinese characters, the number of words goes way up and the word cloud loses a lot of meaning, so we currently filter to only words made of [a-z] letters.

Maybe we could include Cyrillic letters to allow languages like Russian, though.
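
To make the idea concrete, here is a minimal sketch of a word filter that accepts both [a-z] and Cyrillic letters. It is not the actual code in utility.js; the function names `tokenizeChatMessage` and `countWords` are hypothetical, and the character class (Latin a-z plus Russian а-я and ё) is an assumption about which letters would be allowed.

```javascript
// Hypothetical sketch, not the real odota/core implementation.
// Matches runs of Latin a-z or Russian Cyrillic letters (а-я plus ё).
const WORD_PATTERN = /[a-zа-яё]+/g;

// Convert one chat message line into lowercase words,
// dropping digits, punctuation, and any other script (e.g. Chinese).
function tokenizeChatMessage(message) {
  return message.toLowerCase().match(WORD_PATTERN) || [];
}

// Count how often each word occurs across a list of chat messages.
// A Map is used so that words like "constructor" cannot collide
// with properties inherited by plain objects.
function countWords(messages) {
  const counts = new Map();
  for (const message of messages) {
    for (const word of tokenizeChatMessage(message)) {
      counts.set(word, (counts.get(word) || 0) + 1);
    }
  }
  return counts;
}

console.log(tokenizeChatMessage("GG wp, Привет 你好 123"));
// → [ 'gg', 'wp', 'привет' ]
```

Note that this pattern splits on apostrophes, so a word like "don't" would be counted as "don" and "t"; whether that matters depends on how the real filter treats punctuation.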

@howardchung
Member

For reference, this is the function that converts a chat message line into words:
https://github.com/odota/core/blob/master/util/utility.js#L23

howardchung pushed a commit that referenced this issue Oct 2, 2020
* Add wordcloud support for cyrillic letters (#2194)

* Update utility.js