Is it possible to train NER with custom data labeled for named entities while keeping the base model's ability to do DEP and POS tagging? #4112

H20Watermelon · 2019-08-13T01:16:42Z

Your Environment

Operating System: Windows 10
Python Version Used: 3.7
spaCy Version Used: 2.1.4
Environment Information:

This is not a bug or feature request per se, but I didn't post this question on Stack Overflow because it's not specifically about programming.

I have a set of data labeled for named entities. Some of these named entities overlap with the entity categories in the pre-trained English models, and some are new entity types.

While my dataset is labeled for named entities, it is not annotated for DEP or POS, because I do not have the resources to tag them exhaustively in my data. I need both DEP and POS for my work.

So my question is: Is it possible to fine-tune a pretrained model on a dataset that is labeled only for NER and still retain the DEP and POS tagging? Thanks!

Answered by BreakBB

BreakBB · 2019-08-13T05:46:14Z

Yes, this is possible by disabling all pipeline components except NER before starting the training.

If you're using the CLI, look at the pipeline parameter; if you're training in code, have a look here.
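For the in-code route, a minimal sketch of what NER-only training could look like, assuming spaCy 2.1 or later and training data in the v2 format of (text, {"entities": [(start, end, label), ...]}) tuples. The function name finetune_ner_only and its parameters are made up for illustration. It calls nlp.resume_training() instead of nlp.begin_training() on the assumption that you start from a pretrained model and want to keep its weights; begin_training() is meant for initialising new models.

```python
import random


def finetune_ner_only(nlp, train_data, n_iter=10, batch_size=8, seed=0):
    """Update only the 'ner' component of an already loaded spaCy v2 pipeline.

    nlp:        pipeline from spacy.load(...), must contain an 'ner' component
    train_data: list of (text, {"entities": [(start, end, label), ...]})
    Returns the losses dict of the last iteration.
    """
    rng = random.Random(seed)
    ner = nlp.get_pipe("ner")

    # Register every label in the data; labels the model already knows are
    # simply passed to add_label again.
    for _text, annotations in train_data:
        for _start, _end, label in annotations.get("entities", []):
            ner.add_label(label)

    losses = {}
    # Switch off everything except 'ner' (e.g. 'tagger' and 'parser').
    other_pipes = [name for name in nlp.pipe_names if name != "ner"]
    with nlp.disable_pipes(*other_pipes):
        # Assumption: spaCy >= 2.1, where resume_training() returns an
        # optimizer without re-initialising the pretrained weights.
        optimizer = nlp.resume_training()
        for _ in range(n_iter):
            examples = list(train_data)
            rng.shuffle(examples)
            losses = {}
            for i in range(0, len(examples), batch_size):
                texts, annotations = zip(*examples[i:i + batch_size])
                nlp.update(texts, annotations, sgd=optimizer,
                           drop=0.35, losses=losses)
    # The with block has ended, so the disabled components are active again.
    return losses


# Hypothetical usage (model name and output path are placeholders):
#   import spacy
#   nlp = spacy.load("en_core_web_sm")
#   finetune_ner_only(nlp, TRAIN_DATA)
#   nlp.to_disk("my_model")
```

The dropout rate, batch size and iteration count are arbitrary starting values, not recommendations.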

H20Watermelon · 2019-08-13T05:50:25Z

@BreakBB: After training, if I would still like to use DEP and POS in conjunction with NER, what should I do? Do I just add the other pipeline components back in? This is the part that I am not clear about. Would the resulting model suffer from so-called "catastrophic forgetting"? Thanks for the help!

BreakBB · 2019-08-13T05:55:45Z

If you're using code to disable the pipes like:

with nlp.disable_pipes('tagger', 'parser'):
    nlp.begin_training()

Once you leave the with statement, the pipes are enabled again and you can use them. So you can save the model to disk or run NLP tasks directly with the model you have loaded.
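If you want to confirm for yourself that POS and DEP output is unaffected after the NER update, one option is to compare the annotations on a few held-out sentences before and after training. The sketch below is only an illustration: snapshot_annotations and changed_texts are hypothetical helper names, and train_fn stands for whatever training routine you run on the loaded nlp object.

```python
def snapshot_annotations(nlp, texts):
    """Record (token text, fine-grained tag, coarse POS, dependency label)
    for every token of every text."""
    return [
        [(tok.text, tok.tag_, tok.pos_, tok.dep_) for tok in nlp(text)]
        for text in texts
    ]


def changed_texts(nlp, texts, train_fn):
    """Run train_fn(nlp) and return the texts whose tagger/parser output
    differs afterwards. An empty list means nothing changed on this sample."""
    before = snapshot_annotations(nlp, texts)
    train_fn(nlp)  # e.g. your NER-only training loop
    after = snapshot_annotations(nlp, texts)
    return [text for text, b, a in zip(texts, before, after) if b != a]


# Hypothetical usage (my_training_loop is a placeholder for your own code):
#   sample = ["Apple is looking at buying a U.K. startup."]
#   print(changed_texts(nlp, sample, my_training_loop))
```

This only checks the tagger and parser. Whether the NER component itself forgets entity types it knew before is a separate question and depends on your training data.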
