When I run tools/train_net.py with 2 V100 GPUs, I encounter the following error:
  File "/mnt/cap/caijh/app/src/detectron2/detectron2/engine/train_loop.py", line 155, in train
    self.run_step()
  File "/mnt/workspace/code/ODISE/odise/engine/train_loop.py", line 297, in run_step
    grad_norm = self.grad_scaler(
  File "/mnt/workspace/code/ODISE/odise/engine/train_loop.py", line 207, in __call__
    self._scaler.scale(loss).backward(create_graph=create_graph)
  File "/mnt/cap/caijh/anaconda3/envs/odise/lib/python3.9/site-packages/torch/_tensor.py", line 488, in backward
    torch.autograd.backward(
  File "/mnt/cap/caijh/anaconda3/envs/odise/lib/python3.9/site-packages/torch/autograd/__init__.py", line 197, in backward
    Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
  File "/mnt/cap/caijh/anaconda3/envs/odise/lib/python3.9/site-packages/torch/autograd/function.py", line 267, in apply
    return user_fn(self, *args)
  File "/mnt/workspace/code/ODISE/third_party/stable-diffusion/ldm/modules/diffusionmodules/util.py", line 138, in backward
    output_tensors = ctx.run_function(*shallow_copies)
  File "/mnt/workspace/code/ODISE/third_party/stable-diffusion/ldm/modules/attention.py", line 212, in _forward
    x = self.attn1(self.norm1(x)) + x
  File "/mnt/cap/caijh/anaconda3/envs/odise/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/cap/caijh/anaconda3/envs/odise/lib/python3.9/site-packages/torch/nn/modules/normalization.py", line 190, in forward
    return F.layer_norm(
  File "/mnt/cap/caijh/anaconda3/envs/odise/lib/python3.9/site-packages/torch/nn/functional.py", line 2515, in layer_norm
    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: expected scalar type Half but found Float
Thanks for your great work!
When I run tools/train_net.py with 2 V100 GPUs, I encounter the error shown above. The arguments are:
Appreciate any idea to solve this issue, thank you.
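One possible reading of the traceback, not confirmed here: the failure happens while a gradient-checkpointing function re-runs the forward pass inside backward (the `ctx.run_function(...)` frame), and the loss goes through a grad scaler, which suggests mixed precision. Under `torch.autocast`, the autocast context is not active during backward unless it is re-entered, so a recomputed `layer_norm` could see a Half input with Float weights. The sketch below illustrates that idea only. `CheckpointWithAutocast`, `block`, and `amp` are hypothetical names, this is not the code in `ldm/modules/diffusionmodules/util.py`, and it runs on CPU with bfloat16 purely so it can be tried without a GPU.

```python
import torch


class CheckpointWithAutocast(torch.autograd.Function):
    """Hypothetical checkpoint Function that re-enters autocast when recomputing.

    Assumes every input is a floating-point tensor.
    """

    @staticmethod
    def forward(ctx, run_function, autocast_kwargs, *inputs):
        ctx.run_function = run_function
        # Autocast settings to replay later, e.g.
        # {"device_type": "cuda", "dtype": torch.float16} on a GPU.
        ctx.autocast_kwargs = autocast_kwargs
        ctx.save_for_backward(*inputs)
        # No graph is built here; activations are recomputed in backward.
        with torch.no_grad():
            return run_function(*inputs)

    @staticmethod
    def backward(ctx, *grad_outputs):
        inputs = [x.detach().requires_grad_(True) for x in ctx.saved_tensors]
        # Recompute under the same autocast settings as the original forward,
        # so recomputed activations have the dtypes the incoming grads expect.
        with torch.enable_grad(), torch.autocast(**ctx.autocast_kwargs):
            outputs = ctx.run_function(*inputs)
        if isinstance(outputs, torch.Tensor):
            outputs = (outputs,)
        # Accumulates grads into the module parameters and into `inputs`.
        torch.autograd.backward(outputs, grad_outputs)
        # No grads for run_function and autocast_kwargs.
        return (None, None) + tuple(x.grad for x in inputs)


layer = torch.nn.Linear(4, 4)
seen_dtypes = []  # output dtype of each call: forward, then recompute


def block(x):
    out = layer(x)
    seen_dtypes.append(out.dtype)
    return out


amp = {"device_type": "cpu", "dtype": torch.bfloat16}  # CPU stand-in for the demo
x = torch.randn(2, 4, requires_grad=True)
with torch.autocast(**amp):
    y = CheckpointWithAutocast.apply(block, amp, x)
# backward is called outside the autocast block, as in a typical AMP loop
y.float().sum().backward()
```

If this is indeed the cause, the equivalent change would go in the checkpoint function's `backward` in the third-party code; whether that resolves this particular error is an assumption that would need to be tested on the actual setup.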