How to resolve “RuntimeError: CUDA error: device-side assert triggered”? #27

Open
JenniferYingyiWu2020 opened this issue Sep 15, 2021 · 0 comments

JenniferYingyiWu2020 commented Sep 15, 2021

Hi qiaoguan,
I am very interested in your GitHub project “Person-reid-GAN-pytorch”. I followed the steps in README.md and downloaded the “Market-1501” dataset. Because I have only one GPU, I modified the code at lines 100-102. When I then run the command “python train_baseline.py --use_dense”,

the following errors appear:

"12936
751
/home/jenniferwu/Documents/Python_project/Person-reid-GAN-pytorch-master/model.py:14: UserWarning: nn.init.kaiming_normal is now deprecated in favor of nn.init.kaiming_normal_.
init.kaiming_normal(m.weight.data, a=0, mode='fan_out')
/home/jenniferwu/Documents/Python_project/Person-reid-GAN-pytorch-master/model.py:15: UserWarning: nn.init.constant is now deprecated in favor of nn.init.constant_.
init.constant(m.bias.data, 0.0)
/home/jenniferwu/Documents/Python_project/Person-reid-GAN-pytorch-master/model.py:17: UserWarning: nn.init.normal is now deprecated in favor of nn.init.normal_.
init.normal(m.weight.data, 1.0, 0.02)
/home/jenniferwu/Documents/Python_project/Person-reid-GAN-pytorch-master/model.py:18: UserWarning: nn.init.constant is now deprecated in favor of nn.init.constant_.
init.constant(m.bias.data, 0.0)
/home/jenniferwu/Documents/Python_project/Person-reid-GAN-pytorch-master/model.py:23: UserWarning: nn.init.normal is now deprecated in favor of nn.init.normal_.
init.normal(m.weight.data, std=0.001)
/home/jenniferwu/Documents/Python_project/Person-reid-GAN-pytorch-master/model.py:24: UserWarning: nn.init.constant is now deprecated in favor of nn.init.constant_.
init.constant(m.bias.data, 0.0)
Epoch 0/12
/root/anaconda3/lib/python3.7/site-packages/torch/optim/lr_scheduler.py:136: UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
"https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate", UserWarning)
train_baseline.py:165: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
flos=F.log_softmax(input) # N*K? batchsize*751
train_baseline.py:167: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
logpt=F.log_softmax(input) # size: batchsize*751

Traceback (most recent call last):
File "train_baseline.py", line 349, in <module>
num_epochs=13)
File "train_baseline.py", line 251, in train_model
loss.backward()
File "/root/anaconda3/lib/python3.7/site-packages/torch/tensor.py", line 221, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/root/anaconda3/lib/python3.7/site-packages/torch/autograd/__init__.py", line 132, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: CUDA error: device-side assert triggered"

Could you please suggest how to resolve “RuntimeError: CUDA error: device-side assert triggered”? Many thanks!
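
For anyone hitting the same error, here is a minimal debugging sketch rather than a confirmed diagnosis. CUDA kernels run asynchronously, so the traceback above may point at `loss.backward()` even if the assert fired earlier; setting the `CUDA_LAUNCH_BLOCKING=1` environment variable makes launches synchronous so the traceback lands closer to the failing call. A frequent cause of this assert in classification losses is a target label outside `[0, num_classes)`, though whether that applies here is an assumption. The helper names `blocking_cuda_env` and `find_out_of_range_labels` and the sample labels are hypothetical, not part of the repository.

```python
import os


def blocking_cuda_env(base_env=None):
    """Return a copy of the environment with synchronous CUDA launches enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # CUDA_LAUNCH_BLOCKING=1 makes each kernel launch wait for completion,
    # so a device-side assert is reported at the call that caused it.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def find_out_of_range_labels(labels, num_classes):
    """Return (position, label) pairs whose label is outside [0, num_classes)."""
    return [(i, y) for i, y in enumerate(labels) if not 0 <= y < num_classes]


# Hypothetical targets for a 751-way classifier (751 is the class count
# printed in the log above); 751 and -1 are not valid class indices.
bad = find_out_of_range_labels([0, 750, 751, -1], num_classes=751)
print(bad)  # [(2, 751), (3, -1)]
```

The training script could then be rerun with something like `subprocess.run([sys.executable, "train_baseline.py", "--use_dense"], env=blocking_cuda_env())`, and the label check applied to the plain-integer targets of a batch before they are moved to the GPU. Running a single batch on the CPU is another option, since it usually raises an ordinary Python exception with a clearer message.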