Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! #49

Open
yuheyuan opened this issue Sep 20, 2022 · 1 comment

Comments

@yuheyuan
Copy link

yuheyuan commented Sep 20, 2022

when I run tran.py

 python train.py --name gta2citylabv2_stage1Denoise --used_save_pseudo --ema --proto_rectify --moving_prototype --path_soft Pseudo/gta2citylabv2_warmup_soft --resume_path ./pretrained/gta2citylabv2_warmup/from_gta5_to_cityscapes_on_deeplabv2_best_model.pkl --proto_consistW 10 --rce --regular_w 0.1
Traceback (most recent call last):
  File "train2.py", line 217, in <module>
    train(opt, logger)
  File "train2.py", line 88, in train
    target_lpsoft, target_image_full, target_weak_params)
  File "/media/ailab/data/yy/ProDA/models/adaptation_modelv2.py", line 209, in step
    weights = self.get_prototype_weight(ema_out['feat'], target_weak_params=target_weak_params)
  File "/media/ailab/data/yy/ProDA/models/adaptation_modelv2.py", line 350, in get_prototype_weight
    feat_proto_distance = self.feat_prototype_distance(feat)
  File "/media/ailab/data/yy/ProDA/models/adaptation_modelv2.py", line 345, in feat_prototype_distance
    feat_proto_distance[:, i, :, :] = torch.norm(self.objective_vectors[i].reshape(-1,1,1).expand(-1, H, W) - feat, 2, dim=1,)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

I only want to use two gpu, So I change the code only in here

 if opt.model_name == 'deeplabv2':
        model = adaptation_modelv2.CustomModel(opt, logger)
        model = torch.nn.DataParallel(model, device_ids=device_ids)
@yuheyuan
Copy link
Author

    def feat_prototype_distance(self, feat):
        print("enterehre!")
        N, C, H, W = feat.shape
        feat_proto_distance = -torch.ones((N, self.class_numbers, H, W)).to(feat.device)
        for i in range(self.class_numbers):
            #feat_proto_distance[:, i, :, :] = torch.norm(torch.Tensor(self.objective_vectors[i]).reshape(-1,1,1).expand(-1, H, W).to(feat.device) - feat, 2, dim=1,)
            print(feat)
            print(self.objective_vectors[i].to(feat.device))
            # print(self.objective_vectors[i])
            feat_proto_distance[:, i, :, :] = torch.norm(self.objective_vectors[i].reshape(-1,1,1).expand(-1, H, W) - feat, 2, dim=1,)  # here feat.toGPU, but self.objective_vecotrs belong to cpu
        
        return feat_proto_distance

I find it may be occur torch.norm here,one belong to GPU, one belong to CPU

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant