How do I use this code in Chinese #1

paizhongxing · 2023-08-15T17:32:44Z

Hello, are there any plans to support Chinese in this code in the future? If I want to make this code work with Chinese text, how should I change it? Thank you for your answer.

NC0DER · 2023-08-16T15:15:24Z

Greetings,

LMRank relies on syntactic dependency parsing to extract better candidate keyphrases. Specifically, it uses the noun_chunks syntax iterator of the small spaCy model for each supported language. When I tried to integrate the small Chinese spaCy model into LMRank (as of spaCy version 3.6), the following exception was raised:

[E894] The 'noun_chunks' syntax iterator is not implemented for language 'zh'.

The best way to get Chinese support into LMRank would be to open an issue in the official spaCy repository requesting that this feature be implemented. Once it is added in a future version, you could message me here so I can test it and add support for Chinese.

A minimal reproducible example that raises the exception:

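import spacy

# Requires the small Chinese model: python -m spacy download zh_core_web_sm
nlp = spacy.load('zh_core_web_sm')
text = '人工智能（英語：artificial intelligence，缩写为AI）亦稱機器智能，指由人製造出來的機器所表現出來的智慧。通常人工智能是指通过普通電腦程式來呈現人類智能的技術。'
doc = nlp(text)
# Raises NotImplementedError [E894]: noun_chunks is not implemented for 'zh'
print(list(doc.noun_chunks))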
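
Until spaCy implements noun_chunks for 'zh', one possible stopgap is to approximate noun chunks from part-of-speech tags instead of the dependency parse. The sketch below is not part of LMRank and has not been tested with it. The names pos_chunks and Tok are hypothetical, the ADJ/NOUN/PROPN pattern is an assumption, and the sample tokens are tagged by hand rather than by a model. It only reads the text and pos_ attributes, so spaCy tokens could in principle be passed in directly.

```python
from collections import namedtuple

# Hypothetical stand-in for a spaCy Token: only `text` and `pos_` are used.
Tok = namedtuple("Tok", ["text", "pos_"])

# Assumption: a candidate is a maximal run of these Universal POS tags
# that contains at least one noun or proper noun.
CHUNK_TAGS = {"ADJ", "NOUN", "PROPN"}
HEAD_TAGS = {"NOUN", "PROPN"}


def pos_chunks(tokens, joiner=""):
    """Yield maximal runs of ADJ/NOUN/PROPN tokens that contain a noun.

    `joiner` is empty by default because Chinese is written without spaces.
    """
    run = []
    for tok in tokens:
        if tok.pos_ in CHUNK_TAGS:
            run.append(tok)
            continue
        # A token outside the pattern closes the current run.
        if any(t.pos_ in HEAD_TAGS for t in run):
            yield joiner.join(t.text for t in run)
        run = []
    # Flush the final run, if any.
    if any(t.pos_ in HEAD_TAGS for t in run):
        yield joiner.join(t.text for t in run)


# Hand-tagged sample, for illustration only (not model output).
sample = [
    Tok("人工", "NOUN"), Tok("智能", "NOUN"), Tok("是", "VERB"),
    Tok("重要", "ADJ"), Tok("的", "PART"), Tok("技術", "NOUN"),
]
print(list(pos_chunks(sample)))  # ['人工智能', '技術']
```

A POS pattern like this is cruder than real noun chunks (it ignores the dependency structure entirely), so the resulting candidates would likely be noisier than those LMRank gets for its supported languages.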

Thank you for your time.

NC0DER added the enhancement label on Aug 16, 2023
