How do I use this code in Chinese #1

Open
paizhongxing opened this issue Aug 15, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

@paizhongxing

Hello, is there any plan to support Chinese in this code in the future? If I want to make this code work with Chinese text, how should I change it? Thank you for your answer.

NC0DER added the enhancement label Aug 16, 2023
@NC0DER
Owner

NC0DER commented Aug 16, 2023

Greetings,

LMRank relies on a technique called syntactic dependency parsing to extract better candidate keyphrases. This technique uses the noun_chunks syntax iterator of the small spaCy model for each supported language. As of spaCy version 3.6, when I tried to integrate the small Chinese spaCy model into LMRank, the following exception was raised:

  • [E894] The 'noun_chunks' syntax iterator is not implemented for language 'zh'.

The best way to add support for the Chinese language in LMRank would be to open an issue in the official spaCy repository requesting the implementation of this feature. Once it is added in a future version, you could message me here so that I can test it and add support for the Chinese language.

A minimal reproducible example that raises the exception is:

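import spacy

# Load the small Chinese pipeline (it must be downloaded beforehand).
nlp = spacy.load('zh_core_web_sm')
text = '人工智能(英語:artificial intelligence,缩写为AI)亦稱機器智能,指由人製造出來的機器所表現出來的智慧。通常人工智能是指通过普通電腦程式來呈現人類智能的技術。'
doc = nlp(text)
# Accessing doc.noun_chunks raises E894 for 'zh' as of spaCy 3.6.
print(list(doc.noun_chunks))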
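
Until the iterator is implemented, calling code can at least detect the missing feature instead of crashing. The following is only a minimal sketch, assuming (as in spaCy 3.x) that E894 surfaces as a NotImplementedError. The names noun_chunks_or_none and check_model are hypothetical helpers for illustration and are not part of LMRank or spaCy:

```python
from typing import Any, List, Optional


def noun_chunks_or_none(doc: Any) -> Optional[List[Any]]:
    """Return the noun chunks of a parsed Doc as a list, or None when
    the language has no noun_chunks syntax iterator (error E894)."""
    try:
        return list(doc.noun_chunks)
    except NotImplementedError:
        # Assumption: spaCy reports the missing iterator (E894) by
        # raising NotImplementedError when doc.noun_chunks is accessed.
        return None


def check_model(model_name: str, sample_text: str) -> bool:
    """Hypothetical helper: load a spaCy pipeline and report whether
    noun chunks can be extracted from sample_text."""
    import spacy  # imported lazily; the helper above needs no spaCy import

    nlp = spacy.load(model_name)
    return noun_chunks_or_none(nlp(sample_text)) is not None
```

Calling check_model('zh_core_web_sm', text) with the text above would be expected to return False on spaCy 3.6, and could be re-run after a spaCy upgrade to see whether the iterator has been added.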

Thank you for your time.
