How is the model retrained by spaCy? #4008

If you're really only using 1 or 2 data sets as TRAIN_DATA, that is where the problem lies. The NER pipeline is more than just an advanced regex, so you will need more input data to train it. The docs for "Training an additional entity type" say:

To keep the example short and simple, only a few sentences are provided as examples. In practice, you’ll need many more — a few hundred would be a good start. You will also likely need to mix in examples of other entity types, which might be obtained by running the entity recognizer over unlabelled sentences, and adding their annotations to the training set.
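The "mix in examples of other entity types" step from the docs can be sketched roughly as below. This is only an illustration, not spaCy's own training script: the helper names `annotate_with_existing_model` and `mix_training_data` are made up for this example, and the only spaCy features assumed are `nlp.pipe`, `doc.ents`, and the `start_char`, `end_char` and `label_` attributes of each entity. The output uses the `(text, {"entities": [...]})` tuple format that TRAIN_DATA follows in the v2 training examples.

```python
import random


def annotate_with_existing_model(nlp, texts):
    """Run an already-trained pipeline over unlabelled texts and return
    TRAIN_DATA-style tuples holding whatever entities it predicts.

    `nlp` is assumed to be a loaded spaCy pipeline (anything with a
    `.pipe()` method that yields docs with `.text` and `.ents` works).
    """
    examples = []
    for doc in nlp.pipe(texts):
        entities = [(ent.start_char, ent.end_char, ent.label_) for ent in doc.ents]
        examples.append((doc.text, {"entities": entities}))
    return examples


def mix_training_data(new_label_examples, revision_examples, seed=0):
    """Combine hand-labelled examples of the new entity type with the
    model-labelled "revision" examples and shuffle them together."""
    mixed = list(new_label_examples) + list(revision_examples)
    random.Random(seed).shuffle(mixed)
    return mixed
```

With spaCy installed, you would pass in something like `nlp = spacy.load("en_core_web_sm")` (assuming that model is downloaded) plus a list of raw sentences, then hand the mixed list to your training loop in place of the original TRAIN_DATA. The predicted annotations are only as good as the existing model, so it is worth spot-checking them before training on them.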

Another thing is that you don't just have to give spaCy examples of the new entities, but of…

Replies: 5 comments

Answer selected by ines
Labels
training Training and updating models
3 participants
This discussion was converted from issue #4008 on December 10, 2020 14:05.