Description
When processing classical Chinese texts with the IK analyzer, I found that it silently filters out rare characters from the Unicode CJK extension blocks. How can this be fixed?
Steps to reproduce
Take the string “习𮊸𨻸𰄊𰶃” as an example (the last four characters are supplementary-plane characters from the CJK extension blocks), as shown below:
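The original report pointed to a screenshot that is not reproduced here; the following is a minimal reproduction sketch against the _analyze API, assuming a local cluster without security enabled, the IK plugin installed, and the ik_max_word analyzer (ik_smart can be substituted):

```
curl -s -H 'Content-Type: application/json' \
  -X GET 'http://localhost:9200/_analyze' \
  -d '{
    "analyzer": "ik_max_word",
    "text": "习𮊸𨻸𰄊𰶃"
  }'
```

Per the report, on the affected versions the response only contains a token for 习; the extension-block characters are filtered out.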
Expected behavior
All of these characters should be tokenized correctly; none of them should be dropped by the analyzer.
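For reference, a sketch of the expected outcome of the request above: since none of these rare characters are dictionary words, each should come back as its own single-character token (token values only; offsets, types, and positions omitted):

```
["习", "𮊸", "𨻸", "𰄊", "𰶃"]
```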
Environment
Versions: Elasticsearch 7.17.9 (Docker)
Comments
A new PR, #1071, already fixes this problem; please update to it and, after verifying, close this issue.