
RuntimeError: CUDA error: invalid device ordinal (only 1 GPU in my system, how to resolve) #55

Open
Jayku88 opened this issue Sep 12, 2023 · 0 comments

Jayku88 commented Sep 12, 2023

[09/12 09:35:46 main-logger]: use SyncBN
/home/vrlabhlbs/anaconda3/envs/spheretest/lib/python3.7/multiprocessing/semaphore_tracker.py:144: UserWarning: semaphore_tracker: There appear to be 3 leaked semaphores to clean up at shutdown
  len(cache))
Traceback (most recent call last):
  File "train.py", line 902, in <module>
    main()
  File "train.py", line 90, in main
    mp.spawn(main_worker, nprocs=args.ngpus_per_node, args=(args.ngpus_per_node, args))
  File "/home/vrlabhlbs/anaconda3/envs/spheretest/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/home/vrlabhlbs/anaconda3/envs/spheretest/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
    while not context.join():
  File "/home/vrlabhlbs/anaconda3/envs/spheretest/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 150, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 1 terminated with the following error:
Traceback (most recent call last):
  File "/home/vrlabhlbs/anaconda3/envs/spheretest/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
    fn(i, *args)
  File "/home/vrlabhlbs/SphereFormer/train.py", line 156, in main_worker
    torch.cuda.set_device(gpu)
  File "/home/vrlabhlbs/anaconda3/envs/spheretest/lib/python3.7/site-packages/torch/cuda/__init__.py", line 261, in set_device
    torch._C._cuda_setDevice(device)
RuntimeError: CUDA error: invalid device ordinal
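
A possible reading of the traceback, offered as a guess rather than a confirmed diagnosis: the failure is in "Process 1", and `mp.spawn` was called with `nprocs=args.ngpus_per_node`, so more than one worker seems to have been launched. The second worker then called `torch.cuda.set_device(1)`, and with a single GPU only ordinal 0 exists. The sketch below shows one way to cap the worker count at the number of devices PyTorch can see. The helper names `resolve_worker_count` and `visible_gpu_count` are hypothetical and not part of SphereFormer, and where `args.ngpus_per_node` gets its value is not visible in the traceback, so treat this as an illustration only.

```python
def resolve_worker_count(requested: int, available: int) -> int:
    """Return how many workers to spawn, never more than the visible GPUs.

    Hypothetical helper, not part of SphereFormer or PyTorch.
    """
    if available < 1:
        raise RuntimeError("no CUDA device is visible to this process")
    return max(1, min(requested, available))


def visible_gpu_count() -> int:
    """Number of CUDA devices PyTorch can see; 0 if torch is not installed."""
    try:
        import torch
    except ImportError:
        return 0
    return torch.cuda.device_count()


if __name__ == "__main__":
    available = visible_gpu_count()
    print("visible GPUs:", available)
    if available:
        # 4 is a made-up stand-in for whatever value the config requests.
        print("workers to spawn:", resolve_worker_count(4, available))
```

Under that assumption, the result of `resolve_worker_count(args.ngpus_per_node, torch.cuda.device_count())` could be assigned back to `args.ngpus_per_node` before the `mp.spawn` call at `train.py` line 90. Setting the number of training GPUs to one in the run configuration should have the same effect, if the project exposes such an option.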
