Is it possible to train NER with custom data labeled for named entities while keeping the base model's ability to do DEP and POS tagging? #4112
Your Environment
This is not a bug or feature request per se, but I didn't post this question on Stack Overflow because it's not specifically about programming. I have a dataset labeled for named entities. Some of these entity types overlap with the entity categories in the pre-trained English models, and some are new. The dataset is not annotated for either DEP or POS, because I do not have the resources to tag them exhaustively in my data, but I need both for my work. So my question is: is it possible to fine-tune a pretrained model on a dataset that is only labeled for NER and still retain the DEP and POS tagging? Thanks!
Replies: 3 comments
Yes, this is possible by disabling all pipeline components except the NER component before starting the training. If you're using the CLI you want to look at the
@BreakBB: After training, if I would like to still use DEP and POS in conjunction with NER, what should I do? Do I just add the other pipes back in? This is the part that I am not clear about. Would the resulting model suffer from so-called "catastrophic forgetting"? Thanks for the help!
If you're disabling the pipes in code like this:

    with nlp.disable_pipes('tagger', 'parser'):
        nlp.begin_training()

then once you leave the with statement the pipes are enabled again and you can use them. So you could save your model to disk or directly run some NLP tasks with the model loaded.
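To make the reply above a bit more concrete, here is a rough sketch of what the whole loop could look like. It assumes the spaCy v2.x API (`nlp.disable_pipes`, `nlp.update(texts, annotations, ...)`) and training data in the v2 `(text, {"entities": [(start, end, label)]})` format. The helper names `pipes_to_disable` and `finetune_ner`, and the values for `n_iter` and `drop`, are made up for illustration. One deliberate difference from the snippet above: it calls `nlp.resume_training()` instead of `nlp.begin_training()`, on the assumption that you want to keep the pretrained NER weights rather than reinitialize them.

```python
import random


def pipes_to_disable(pipe_names, keep=("ner",)):
    """Return every pipeline component name that is not listed in `keep`."""
    return [name for name in pipe_names if name not in keep]


def finetune_ner(nlp, train_data, n_iter=10, drop=0.35):
    """Update only the NER component of a loaded pipeline.

    Assumes spaCy v2.x; `nlp` is an already loaded pretrained pipeline and
    `train_data` is a list of (text, {"entities": [(start, end, label)]}).
    """
    ner = nlp.get_pipe("ner")
    # Register any entity labels the pretrained model does not know yet.
    for _, annotations in train_data:
        for _start, _end, label in annotations["entities"]:
            ner.add_label(label)

    examples = list(train_data)  # copy so the caller's list is not shuffled
    with nlp.disable_pipes(*pipes_to_disable(nlp.pipe_names)):
        # Assumption: resume_training() keeps the existing weights.
        optimizer = nlp.resume_training()
        for _ in range(n_iter):
            random.shuffle(examples)
            losses = {}
            for text, annotations in examples:
                nlp.update([text], [annotations], sgd=optimizer,
                           drop=drop, losses=losses)
    # Outside the with block the tagger and parser are enabled again.
    return nlp
```

You would call it with something like `finetune_ner(spacy.load("en_core_web_sm"), TRAIN_DATA)`, where `TRAIN_DATA` is your own NER-only data. The tagger and parser receive no updates inside the `with` block, so afterwards you can save the pipeline with `nlp.to_disk(...)` or run it directly and read `token.pos_`, `token.dep_` and `doc.ents` from the same `Doc`.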