This repository has been archived by the owner on Dec 14, 2020. It is now read-only.

Failed loading model: Could not find key /_1 in the model file #79

Open
danielhers opened this issue Aug 9, 2019 · 0 comments

danielhers (Owner) commented Aug 9, 2019

@OfirArviv, this is an issue we've been talking about a bit. It could arguably be called a DyNet issue, but I'll try to work around it.

In the mrp branch, whenever I train a model without BERT and then try to load it, I get this error:

```
...
[dynet] 2.1
Loading from 'test_files/models/ucca.enum'... Done (0.000s).
Loading model from 'test_files/models/ucca':   0%|          | 0/13 [00:00<?, ?param/s]Traceback (most recent call last):
  File "tupa/tupa/model.py", line 234, in load
    self.classifier.load(self.filename)
  File "tupa/tupa/classifiers/classifier.py", line 125, in load
    self.load_model(filename, d)
  File "tupa/tupa/classifiers/nn/neural_network.py", line 474, in load_model
    values = self.load_param_values(filename, d)
  File "tupa/classifiers/nn/neural_network.py", line 503, in load_param_values
    desc="Loading model from '%s'" % filename, unit="param"))
  File "tupa/lib/python3.7/site-packages/tqdm/_tqdm.py", line 1005, in __iter__
    for obj in iterable:
  File "_dynet.pyx", line 450, in load_generator
  File "_dynet.pyx", line 453, in _dynet.load_generator
  File "_dynet.pyx", line 327, in _dynet._load_one
  File "_dynet.pyx", line 1482, in _dynet.ParameterCollection.load_lookup_param
  File "_dynet.pyx", line 1497, in _dynet.ParameterCollection.load_lookup_param
RuntimeError: Could not find key /_1 in the model file

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "tupa/tupa/parse.py", line 653, in <module>
    main()
  File "tupa/tupa/parse.py", line 649, in main
    list(main_generator())
  File "tupa/tupa/parse.py", line 631, in main_generator
    yield from train_test(test=args.input, args=args)
  File "tupa/tupa/parse.py", line 557, in train_test
    yield from filter(None, parser.train(train, dev=dev, test=test is not None, iterations=args.iterations))
  File "tupa/tupa/parse.py", line 457, in train
    self.model.load()
  File "tupa/tupa/model.py", line 244, in load
    raise IOError("Failed loading model from '%s'" % self.filename) from e
OSError: Failed loading model from 'test_files/models/ucca'
```

Looking at the model's .data file, I couldn't find any problem: there is definitely a line starting with `#LookupParameter# /_1` there.
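For reference, something like this lists the keys the .data file claims to contain (a quick sketch; the path and the `#Parameter#` / `#LookupParameter#` header format are assumptions based on what the file looks like here):

```python
def list_saved_keys(path):
    """Print the parameter keys recorded in a DyNet text-format .data file."""
    with open(path) as f:
        for line in f:
            # Header lines look like "#LookupParameter# /_1 ..." in this file.
            if line.startswith("#Parameter#") or line.startswith("#LookupParameter#"):
                kind, key = line.split()[:2]
                print(kind, key)

list_saved_keys("test_files/models/ucca.data")  # assumed path of the model's .data file
```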

However, looking at the training log, I found that all updates resulted in this error:

```
Error in update(): Magnitude of gradient is bad: -nan
```

I could then reproduce the problem by adding `(loss / 0).backward()` right before this call:

```python
self.trainer.update()
```

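For anyone trying this outside TUPA, here is a minimal standalone sketch of the same failure mode in plain DyNet. The parameters and dimensions are made up, and it forces the nan with an explicit nan input instead of the division by zero, which should poison the gradients the same way:

```python
import dynet as dy

# Toy model: a single lookup parameter standing in for the real embeddings.
pc = dy.ParameterCollection()
lookup = pc.add_lookup_parameters((10, 4))
trainer = dy.SimpleSGDTrainer(pc)

dy.renew_cg()
emb = dy.lookup(lookup, 0)
loss = dy.squared_norm(emb)  # any scalar loss will do

# Poison the gradients with nan before the update.
bad_loss = dy.cmult(loss, dy.scalarInput(float("nan")))
bad_loss.backward()
try:
    trainer.update()  # should complain that the magnitude of the gradient is bad
except RuntimeError as e:  # depending on the DyNet build, this may raise instead of printing
    print("update failed:", e)

# Round-trip the collection to check whether loading is also affected.
pc.save("repro.model")
pc2 = dy.ParameterCollection()
pc2.add_lookup_parameters((10, 4))
pc2.populate("repro.model")
```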
So now the question is just why all updates result in -nan gradients in the normal code.
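One way to narrow that down is to check, right after `backward()` and before `update()`, whether the nan is already in the loss value or only shows up in particular parameter gradients. A hedged sketch (not TUPA code; it assumes `grad_as_array()` and `name()` are available on both parameter types in this DyNet version):

```python
import numpy as np

def report_nans(pc, loss):
    """Print where nan values first show up; call between backward() and update()."""
    if np.isnan(loss.scalar_value()):
        print("loss itself is nan")
    for p in pc.parameters_list():
        if np.isnan(p.grad_as_array()).any():
            print("nan gradient in parameter", p.name())
    for lp in pc.lookup_parameters_list():
        if np.isnan(lp.grad_as_array()).any():
            print("nan gradient in lookup parameter", lp.name())

# Intended usage inside the training step:
#   loss.backward()
#   report_nans(self.model, loss)  # self.model is a hypothetical handle to the ParameterCollection
#   self.trainer.update()
```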
