Allow passing in cr/cl bounds and other settings #6

Open · wants to merge 3 commits into base: main
Conversation

winston-zillow
Fix execution on CPU and GPU. Fix model loading.

@@ -22,7 +22,7 @@ We need to put the data sets in the `dataset` folder. You can specify one data s

```diff
 # trained on the tic-tac-toe data set with one GPU.
-python3 experiment.py -d tic-tac-toe -bs 32 -s 1@16 -e401 -lrde 200 -lr 0.002 -ki 0 -mp 12481 -i 0 -wd 1e-6 &
+python3 experiment.py -d tic-tac-toe -bs 32 -s 1@16 -e401 -lrde 200 -lr 0.002 -ki 0 -mp 12481 -i cuda:0 -wd 1e-6 &
```
Note: see review comment on args.py changes

```diff
@@ -51,7 +52,8 @@
     rrl_args.plot_file = os.path.join(rrl_args.folder_path, 'plot_file.pdf')
     rrl_args.log = os.path.join(rrl_args.folder_path, 'log.txt')
     rrl_args.test_res = os.path.join(rrl_args.folder_path, 'test_res.txt')
-    rrl_args.device_ids = list(map(int, rrl_args.device_ids.strip().split('@')))
+    rrl_args.device_ids = list(map(lambda id: torch.device(id), rrl_args.device_ids.strip().split('@'))) \
+        if rrl_args.device_ids else [None]
```
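The new parsing can be sketched as a standalone helper (the name `parse_device_ids` is illustrative, not the PR's exact code):

```python
import torch

def parse_device_ids(spec):
    """Parse an '@'-separated device spec (the -i argument), e.g. "cuda:0@cuda:1".

    An empty spec falls back to [None], i.e. a CPU-only run.
    """
    if not spec or not spec.strip():
        return [None]
    return [torch.device(d) for d in spec.strip().split('@')]
```

With this shape, downstream code can test `device.type == 'cuda'` instead of comparing raw integer IDs.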
Note: I found that passing in an integer device ID pins the tensors to GPU memory, but GPU compute utilization stays at 0%, as shown by nvidia-smi. After changing the device ID to the one returned by torch.device("cuda:0"), the GPU is fully utilized. I do not know why that is the case, since a simple test using a Python loop does drive GPU utilization.

Example run passing in an integer device ID:

```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.142.00   Driver Version: 450.142.00   CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla K80           On   | 00000000:00:1E.0 Off |                    0 |
| N/A   47C    P0    70W / 149W |    322MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     27173      C   ...vs/pytorch_p37/bin/python      319MiB |
+-----------------------------------------------------------------------------+
```

Example run passing in `cuda:*`:

```
Sat Dec  4 01:31:31 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.142.00   Driver Version: 450.142.00   CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla K80           On   | 00000000:00:1E.0 Off |                    0 |
| N/A   52C    P0   138W / 149W |   1739MiB / 11441MiB |    100%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     27346      C   ...vs/pytorch_p37/bin/python     1736MiB |
+-----------------------------------------------------------------------------+
```

```python
    # lower_bound: [continuous cols]
    # upper_bound: [continuous cols]
}
return settings
```

Note: I added this new settings file so that the user can pass in CR/CL bounds as well as control normalization, one-hot encoding, etc. (those are currently hard-coded).
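A minimal sketch of what such a per-dataset settings entry might look like (the function and key names are illustrative; the PR's actual schema may differ):

```python
def get_dataset_settings(name):
    # Hypothetical per-dataset settings. Bounds are per continuous
    # column; None means keep the default random initialization.
    all_settings = {
        'tic-tac-toe': {
            'one_hot_encode_features': True,   # dataset is categorical
            'impute_continuous': False,        # no missing values
            'lower_bound': None,
            'upper_bound': None,
        },
    }
    return all_settings.get(name, {})
```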

```python
if self.left is not None and self.right is not None:
    if cl is not None and cr is not None:  # bounds are specified
        cl = torch.tensor(cl).type(torch.float).t()
        cr = torch.tensor(cr).type(torch.float).t()
```
Note: here we can pass in the cl/cr bounds directly.

```python
        cl = self.left + torch.rand(self.n, self.input_dim[1]) * (self.right - self.left)
        cr = self.left + torch.rand(self.n, self.input_dim[1]) * (self.right - self.left)
    else:
        cl = 3. * (2. * torch.rand(self.n, self.input_dim[1]) - 1.)
        cr = 3. * (2. * torch.rand(self.n, self.input_dim[1]) - 1.)
assert torch.Size([self.n, self.input_dim[1]]) == cl.size()
assert torch.Size([self.n, self.input_dim[1]]) == cr.size()
```
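The bounded initialization and shape check can be exercised in isolation; the sizes below are made up for illustration:

```python
import torch

n, d = 4, 3                # hypothetical: 4 nodes, 3 continuous columns
left = torch.zeros(d)      # per-column lower bounds
right = torch.ones(d)      # per-column upper bounds

# torch.rand is uniform in [0, 1), so cl lies in [left, right) column-wise
cl = left + torch.rand(n, d) * (right - left)

assert torch.Size([n, d]) == cl.size()
assert bool(((cl >= left) & (cl < right)).all())
```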
Note: and verify the shapes are correct.


```python
if self.device_id and self.device_id.type == 'cuda':
    self.net.cuda(self.device_id)
```
Note: the condition allows the program to run in CPU mode as well.
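The guard can be sketched as a small helper (the name `move_to_device` is hypothetical); `device_id` is either `None` for a CPU-only run or a `torch.device`:

```python
import torch

def move_to_device(net, device_id):
    # Only move the network when a CUDA device was actually requested;
    # a None or CPU device leaves the model on the CPU.
    if device_id and device_id.type == 'cuda':
        net.cuda(device_id)
    return net
```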

```diff
-        self.feature_enc = preprocessing.OneHotEncoder(categories='auto', drop=drop)
-        self.imp = SimpleImputer(missing_values=np.nan, strategy='mean')
+        self.feature_enc = preprocessing.OneHotEncoder(categories='auto', drop=drop) if one_hot_encode_features else None
+        self.imp = SimpleImputer(missing_values=np.nan, strategy='mean') if impute_continuous else None
```
Note: for datasets that do not require one-hot encoding or imputation (or already have them), these steps can now be skipped.
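The conditional construction can be wrapped as a helper (the function name is hypothetical; the flag names follow the diff above):

```python
import numpy as np
from sklearn import preprocessing
from sklearn.impute import SimpleImputer

def build_preprocessors(one_hot_encode_features=True, impute_continuous=True, drop=None):
    # Either transform may be None when the dataset is already
    # one-hot encoded or has no missing continuous values.
    feature_enc = preprocessing.OneHotEncoder(categories='auto', drop=drop) if one_hot_encode_features else None
    imp = SimpleImputer(missing_values=np.nan, strategy='mean') if impute_continuous else None
    return feature_enc, imp
```

Callers then just skip any transform that is `None` instead of applying it unconditionally.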

@12wang3
Owner

12wang3 commented Dec 7, 2021

Thank you very much for the PR. I am busy on other stuff now and will check the code after Dec 9.

@ASan1527

ASan1527 commented Nov 8, 2022

[screenshot]
I can't get the device_ids, and I only have a single GPU; I don't know how to change the code. Could you please tell me how to solve it? Thank you!

@12wang3
Owner

12wang3 commented Nov 8, 2022

> [screenshot] I can't get the device_ids, and I only have a single GPU; I don't know how to change the code. Could you please tell me how to solve it? Thank you!

Could you please show the command you used? Have you set the "-i" argument? It seems you did not set device_ids, since your device_ids was None. If you only have a single GPU, you can use "-i 0" to set the device_ids. By the way, questions like this are better raised as an issue rather than in a PR.
