One annotation change corrupts entire model #7168
Replies: 13 comments 3 replies
-
Could you share some sample output? Is it just the training loss that jumps so strongly, or also the accuracy on the dev set?
-
Truthfully, I need to do some work to answer that question. Once my model was mature, I streamlined some of the testing and used loss as a proxy. As my annotation set incrementally grew, the increase in accuracy was so slow as to be uninteresting. Seeing the loss jump so dramatically, and not come down even after dozens of iterations, made me think it unlikely that other metrics would be unaffected.
-
I'll go ahead and move this to the discussion forum in the meantime, as this type of discussion is perfectly suited for that. We haven't really encountered the behaviour you described before, so it would be good to get a sample output / some more background / some reproducible code snippet to be able to look into this further!
-
Last training iteration before the annotation change... loss = 12000 (seems large, but it's a huge dataset with many custom entities).
Testing on my demo sample set: these models are trained using 5k+ annotations across 1000 documents. I use the full training set for all model runs to avoid forgetting issues. My code is almost identical to the sample you posted for spaCy 2.
-
Are you adding a label to the model that wasn't seen before, and wasn't present during initialization?

I don't remember exactly which versions were affected, but it was an extended struggle to make "live" learning of labels work well. One issue that was particularly tricky was that it turned out the NER model settled into a pattern where most of the scores were fairly large negative values. When a new label was added, the initial weights for the label were zero, and that meant that it received a score of 0 --- which would be by far the best score! So as soon as the label is added, the model predicts it like crazy, even before it's seen an example of it. This would explain the sudden spike in loss.

You can avoid the issue by ensuring all the labels are added at the beginning of training. I think the issue was fixed by v2.3.
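For concreteness, a minimal sketch of that setup with the spaCy v2 API; the label names are taken from examples elsewhere in this thread, everything else is a placeholder and the training loop itself is omitted:

```python
# Minimal sketch: register every label before the model weights are initialized
# (spaCy v2 API). ALL_LABELS is a placeholder for the full label set the data uses.
import spacy

ALL_LABELS = ["fintable", "NOI", "building_count"]

nlp = spacy.blank("en")
ner = nlp.create_pipe("ner")
nlp.add_pipe(ner)

for label in ALL_LABELS:
    ner.add_label(label)          # add labels up front, before begin_training()

optimizer = nlp.begin_training()  # weights are created with all labels known
# ... the usual nlp.update() loop follows
```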
-
I am using 2.3. I posted my code and do include all labels. Since this model has a long training history, I also ensure that any new labels are added. The model vocabulary has grown without issue, except for the edge condition where a token is redefined or sub-divided. Even then, I recognize the reality of stepping back to move forward, but neither the expected forgetting nor the expected learning is taking place.

I have not seen any reported negative losses. Imagine a scenario where token-start 0 to token-end 50 is named "fintable" and trained extensively. We then decide we wish to remove the "fintable" annotation and replace it with "NOI" at tokens 12-20. NOI may or may not be a new label. This blows things up.

I've seen the same issue when consolidating token names. Suppose I have 20 entities labeled "building_count", and 5 labeled "property_structures". If I change the 5 to "building_count" and eliminate that tag, bad things happen.

I would love to get some guidance on any parameters or pipeline config I am using that might cause this. I hate the idea of keeping a training audit table and building in logic to exclude anything that used to overlap with something else.
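To make the two scenarios above concrete, a rough sketch with annotations written as (token_start, token_end, label) tuples; the offsets and labels come from the examples above, everything else is illustrative only:

```python
# Illustrative only: offsets and labels mirror the two scenarios described above.

# Scenario 1: a broad span is removed and replaced by a narrower, overlapping
# span carrying a different label.
before = [(0, 50, "fintable")]
after = [(12, 20, "NOI")]

# Scenario 2: consolidating labels. Every "property_structures" entity is folded
# into "building_count" and the old tag is retired.
relabel = {"property_structures": "building_count"}
annotations = [(3, 5, "property_structures"), (10, 11, "building_count")]
annotations = [(start, end, relabel.get(label, label))
               for start, end, label in annotations]
```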
-
In v2.3 there is still a weird bug if you add additional labels to a trained model, see #6525 (comment). The simplest workaround is to save the model to disk and reload before training.
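A rough sketch of that workaround with the spaCy v2 API; the model paths and the added label are placeholders:

```python
# Workaround sketch (spaCy v2): add the new label, save to disk, reload, then train.
# "existing_model" and "NOI" are placeholders for the actual path and label.
import spacy

nlp = spacy.load("existing_model")            # previously trained pipeline
ner = nlp.get_pipe("ner")
ner.add_label("NOI")                          # label not seen during initial training

nlp.to_disk("existing_model_with_new_label")        # save ...
nlp = spacy.load("existing_model_with_new_label")   # ... and reload before resuming training
# ... continue with the usual nlp.update() loop on the reloaded pipeline
```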
-
Are you saying that after I add the labels to the pipeline, I should save the model before nlp.begin...?

I am not sure this is the same issue. The labels I am using and moving between have been in the model since inception. I am doing a consolidation, or sometimes a drill-down.

Semantically: "Sources and Uses" precedes a simple table, and the model has been able to discover it with decent success. One of the rows in sources and uses might be "Equity" followed by a float. In the broader context, this too has good success. Perhaps this is a source of the issue, since any given span can only belong to one label?

The other example is where I am removing a label and moving a small set to another existing one.

Thanks
-
No, these actions reflect refinement of the training annotations. I am using a custom PDF viewer to select text and then translate it to token offsets. In this case, a larger block of text with label X has been deleted, and a smaller block of text contained within the previous offsets has been added with label Y.
On 2/23/2021 7:06 AM, Matthew Honnibal wrote:

> > I am using 2.3. I posted my code and do include all labels. Since this is a long training model, I also ensure that any new labels are added.
>
> Sorry, I do see that now.
>
> > Imagine a scenario where token-start 0 to token-end 50 is named "fintable" and trained extensively. We then decide we wish to remove "fintable" annotation and replace with "NOI" and token 12-20. NOI may or may not be new provision.
> >
> > I've seen the same issue when consolidating token names. Suppose I have 20 entities labeled "building_count", and 5 labeled "property_structures". If I change the 5 to "building_count" and eliminate that tag
>
> Apologies, but I just don't think I follow what you mean here. Could you provide example code? These operations aren't in your code above, right?
-
Everything is independent. The text is unchanging... other annotations are unchanging... I also have a tool that checks the annotation text against text[token_start:token_end], and this shows no issues.
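Roughly, a check along those lines (the tuple layout here is an assumption, not the actual tool):

```python
# Hypothetical sanity check: confirm the stored annotation text still matches the
# slice recovered from its offsets. The (start, end, label, expected_text) layout
# is assumed for illustration.
def find_offset_mismatches(doc_text, annotations):
    return [
        (start, end, label)
        for start, end, label, expected_text in annotations
        if doc_text[start:end] != expected_text
    ]
```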
On 2/23/2021 7:23 AM, Matthew Honnibal wrote:

> Are you sure you're not messing up the offsets somehow when you do this, for instance the offsets of other entities?
-
I'm not sure it is related, but it is something I will address. I add labels before training, so it's easy to tell if a new one is added. Would I then save the model AND null out the model and reload, or is saving sufficient?
On 2/23/2021 7:25 AM, Adriane Boyd wrote:

> It might be a separate issue, but existing model + `ner.add_label(new_label)` + immediate training leads to weird losses and bad performance. If your script always starts from `spacy.blank("en")` then I don't think this is related, but if you are starting from an existing model loaded with `spacy.load()` and `add_label` is actually adding a new label, then this could be part of what's going on.
-
I added a little code to save and reload the model if new labels were present, and then ran a single iteration of training. The results are promising: instead of losses increasing by 100x, they are only increasing by 5x. I'll need to research further, and I'm not sure it explains all the behavior, but it certainly helps.
-
More work to do, but I think Adriane's suggestion has addressed the issue. While the initial losses are much larger than I would expect, they quickly settle back down to a more reasonable level. Thanks very much for your insights into this issue. I am still not sure what combination of factors is triggering this, but the pre-training save seems to have broken the event chain. I will continue to test and will report back if I can recreate the problem outside of this solution.
-
Hello,

I have a custom named entity model with thousands of annotations that has been incrementally trained for months. I am using spaCy 2.2, Python 3, and my own training script originally based on your documentation.

If I remove a single annotation and replace it with one or more different labels that overlap with the same token start and end, my models, on incremental training paths, will show losses increased by 100-200x... 5x larger than a clean model at the start.

Since I have little control over changes that come from client text annotations, this seems like an unexpected outcome. I could see where changes and subsequent relearning would require some time, but a single change seems to corrupt everything.