Which spaCy components are required for optimal Lemmatizer quality? #13651
Unanswered
joshdavham asked this question in Help: Model Advice
Replies: 0 comments
I'm a frequent user of the spaCy library, and in particular, I use it to create frequency lists of lemmas on large data sets. Effectively, all I'm doing is taking text and extracting its lemmas.
However, when using the common
the processing is too slow.
After reading the docs, I then tried
which is certainly faster, but the quality of lemmatization dropped sharply.
Clearly the answer is to enable more components to get better lemmatization. Perhaps something like this?
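As a rough sketch of that idea: my assumption here (not something I've confirmed in the docs) is that in the English v3 pipelines such as `en_core_web_sm`, the rule-based lemmatizer relies on the POS tags set by `tagger` and `attribute_ruler`, which in turn use `tok2vec`, so only `parser` and `ner` get excluded. The helper names `load_lemma_pipeline` and `count_lemmas` are made up for illustration.

```python
from collections import Counter
from typing import Iterable


def load_lemma_pipeline(model: str = "en_core_web_sm"):
    """Load a spaCy pipeline without the components assumed to be unused by the lemmatizer."""
    import spacy  # imported lazily so count_lemmas() can be used without spaCy installed

    # Assumption: the lemmatizer only needs tok2vec -> tagger -> attribute_ruler,
    # so the parser and NER are left out entirely.
    return spacy.load(model, exclude=["parser", "ner"])


def count_lemmas(docs: Iterable) -> Counter:
    """Count lowercased lemmas of alphabetic tokens across an iterable of docs."""
    counts = Counter()
    for doc in docs:
        for token in doc:
            # Skip punctuation, numbers and other non-alphabetic tokens.
            if token.is_alpha:
                counts[token.lemma_.lower()] += 1
    return counts
```

Usage would be something like `count_lemmas(load_lemma_pipeline().pipe(texts, batch_size=256))`, where `texts` is any iterable of strings. Streaming through `nlp.pipe` instead of calling `nlp` on each text should also help with speed on large data sets.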
Anyways, my question is: which spaCy components are required for optimal Lemmatizer quality? I'd like to disable every component that doesn't directly contribute to better lemmatization. Unfortunately, I haven't found anything in the docs :(
Thanks!