Why are tokens with underscore or hyphen ignored in YakeKeywordExtraction() annotator? #9022
Unanswered
a-kliuieva asked this question in Q&A
Replies: 0 comments
I have a Spark dataframe `input_df`:

I want to extract keywords for each `id` using the `YakeKeywordExtraction()` annotator. For this I use the following pipeline:
Results obtained:
It is obvious that the predominant tokens `solar_system` and `milky_way` are ignored (the same happens if a hyphen or space is used instead of an underscore). Why does this happen, and how can I deal with it?

Thanks a lot for any advice!
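To make the "predominant but ignored" observation easier to reproduce, here is a minimal plain-Python sketch that counts whitespace-separated tokens and lists the frequent ones that never show up in the extracted keywords. It does not use Spark NLP at all. The helper names `frequent_tokens` and `missing_from_keywords`, the sample texts, and the sample keyword list are all hypothetical stand-ins, not the original `input_df` or the actual annotator output.

```python
from collections import Counter


def frequent_tokens(texts, top_n=5):
    """Count lowercased, whitespace-separated tokens across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts.most_common(top_n)


def missing_from_keywords(texts, keywords, top_n=5):
    """Return the frequent tokens that appear in none of the extracted keywords."""
    # A keyword may be an n-gram, so compare against its individual words.
    extracted = {word for kw in keywords for word in kw.lower().split()}
    return [tok for tok, _ in frequent_tokens(texts, top_n) if tok not in extracted]


# Hypothetical sample data (assumption): stands in for the text column and
# for the keywords the annotator returned.
sample_texts = [
    "solar_system planets orbit one star",
    "solar_system lies within milky_way",
    "milky_way contains stars and solar_system planets",
]
sample_keywords = ["planets", "stars"]

print(frequent_tokens(sample_texts, top_n=3))
print(missing_from_keywords(sample_texts, sample_keywords, top_n=3))
```

Running the same check on the real text column and the real keyword output (collected to the driver first) would show whether only tokens containing `_` or `-` go missing, or whether other frequent tokens are dropped as well.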