【MUSICBERT】 ModuleNotFoundError #183

Open
deathrc opened this issue Dec 27, 2023 · 0 comments

I was trying to reproduce the pre-training code in MusicBERT.

CUDA info: NVIDIA-SMI 515.105.01, Driver Version: 515.105.01, CUDA Version: 11.7
I used the exact package versions given in requirements.txt.


2023-12-22 11:46:44 | INFO | fairseq.trainer | loaded checkpoint checkpoints/checkpoint_last_musicbert_small.pt (epoch 2 @ 14 updates)
2023-12-22 11:46:44 | INFO | fairseq.trainer | loading train data for epoch 2
2023-12-22 11:46:44 | INFO | fairseq.data.data_utils | loaded 3785 examples from: sub_data_bin/train
2023-12-22 11:46:44 | INFO | fairseq.tasks.masked_lm | loaded 3480 blocks from: sub_data_bin/train
2023-12-22 11:46:44 | INFO | fairseq.trainer | begin training epoch 2

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/fyf/.pyenv/versions/anaconda3-5.2.0/envs/musicbert/lib/python3.6/multiprocessing/spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "/home/fyf/.pyenv/versions/anaconda3-5.2.0/envs/musicbert/lib/python3.6/multiprocessing/spawn.py", line 119, in _main
    self = reduction.pickle.load(from_parent)
ModuleNotFoundError: No module named 'fairseq_user_dir_48782'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/fyf/.pyenv/versions/anaconda3-5.2.0/envs/musicbert/lib/python3.6/multiprocessing/spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "/home/fyf/.pyenv/versions/anaconda3-5.2.0/envs/musicbert/lib/python3.6/multiprocessing/spawn.py", line 119, in _main
    self = reduction.pickle.load(from_parent)
ModuleNotFoundError: No module named 'fairseq_user_dir_79710'

The pre-training code works well on a single GPU, but in the distributed setting the processes spawned by fairseq (possibly the dataloader workers) fail with the error above. Do you have any idea what causes this?
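
For reference, here is a minimal sketch of how this kind of error can arise. It assumes (not confirmed from the fairseq source) that `fairseq_user_dir_48782` is a module alias that exists only in the parent process's `sys.modules`, so a child started with the `spawn` method cannot import it when unpickling. The sketch uses only the standard library. The names `ALIAS`, `Task`, `register_alias`, and `roundtrip_without_alias` are hypothetical and are not part of fairseq or MusicBERT.

```python
import pickle
import sys
import types

# Hypothetical alias, mimicking the module name seen in the traceback.
ALIAS = "fairseq_user_dir_demo"


def register_alias():
    """Create a module that exists only in this process's sys.modules."""
    module = types.ModuleType(ALIAS)

    class Task:
        pass

    # Make pickle record the class as "<ALIAS>.Task".
    Task.__module__ = ALIAS
    Task.__qualname__ = "Task"
    module.Task = Task
    sys.modules[ALIAS] = module
    return Task


def roundtrip_without_alias(payload):
    """Simulate a spawned child: the alias is absent while unpickling."""
    saved = sys.modules.pop(ALIAS)
    try:
        return pickle.loads(payload)
    except ModuleNotFoundError as err:
        return err
    finally:
        # Restore the alias so the "parent" state is unchanged.
        sys.modules[ALIAS] = saved


Task = register_alias()
payload = pickle.dumps(Task())  # the parent can resolve the alias
result = roundtrip_without_alias(payload)
print(type(result).__name__, result)
```

If this assumption holds, the single-GPU run would be unaffected as long as it does not start child processes with `spawn`, which matches the behaviour described above. It is only a sketch of the failure mode, not a confirmed diagnosis.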
