
[performance] slogdet is slow on GPU #5

Open · Nintorac opened this issue Sep 9, 2020 · 3 comments

@Nintorac (Contributor) commented Sep 9, 2020

Hey, great codebase, thank you!

I was looking into performance bottlenecks and found the following, which gave me almost a 2x speedup (1.76 it/s -> 2.96 it/s) on the CIFAR-10 example.

The issue is in the Conv1x1 module: the computation of torch.slogdet is much slower on GPU than on CPU.

https://github.com/didriknielsen/survae_flows/blob/master/survae/transforms/bijections/conv1x1.py#L40
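For reference, the gap can be reproduced with a minimal timing sketch along these lines (not from the original report; the 96x96 size is an arbitrary stand-in for a Conv1x1 weight):

    import time
    import torch

    def time_slogdet(w, iters=1000):
        torch.slogdet(w)  # warm-up
        if w.is_cuda:
            torch.cuda.synchronize()
        start = time.time()
        for _ in range(iters):
            torch.slogdet(w)
        if w.is_cuda:
            torch.cuda.synchronize()
        return (time.time() - start) / iters

    w = torch.randn(96, 96)  # stand-in for a Conv1x1 weight matrix
    print('cpu :', time_slogdet(w))
    if torch.cuda.is_available():
        print('cuda:', time_slogdet(w.to('cuda')))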

This is the modified, faster _logdet:

    def _logdet(self, x_shape):
        b, c, h, w = x_shape
        # Compute the log-determinant on CPU, where slogdet is much
        # faster for a matrix of this size.
        _, ldj_per_pixel = torch.slogdet(self.weight.to('cpu'))
        # The 1x1 convolution acts at every spatial position, so the
        # total log-det is the per-pixel log-det times h * w.
        ldj = ldj_per_pixel * h * w
        # Broadcast to the batch and return on the weight's device.
        return ldj.expand([b]).to(self.weight.device)
@hmdolatabadi commented Sep 10, 2020

Hi,

Related to this issue, if you try a large network (e.g. the Glow architecture for CIFAR-10), then you may encounter an error in the middle of training which says:

File "./examples/cifar10_aug_flow.py", line 102, in <module>
    loss.backward()
  File "/home/user/.conda/envs/idf/lib/python3.7/site-packages/torch/tensor.py", line 185, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/user/.conda/envs/idf/lib/python3.7/site-packages/torch/autograd/__init__.py", line 127, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: svd_cuda: the updating process of SBDSDC did not converge (error: 23)

After looking it up, it seems to me that the SVD operation used in the backward of torch.slogdet may be responsible for this. A note in the official PyTorch documentation says:

Backward through slogdet() internally uses SVD results when the input is not invertible. In this case, double backward through slogdet() will be unstable when the input doesn't have distinct singular values. See svd() for details.
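As a quick illustration of that note (a sketch, not from the thread): for a singular input, slogdet returns sign 0 and logabsdet -inf, and its backward then has to fall back on SVD results, which is where the convergence error can arise:

    import torch

    A = torch.ones(3, 3, requires_grad=True)  # rank-1, hence singular
    sign, logabsdet = torch.slogdet(A)
    print(sign.item(), logabsdet.item())      # 0.0 -inf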

I haven't tested the above solution to see whether it has an effect.

UPDATE:
After trying the above solution, the same problem happened to me at epoch 40 while training a model:

File "./examples/cifar10_aug_flow.py", line 102, in <module>
    loss.backward()
  File "/home/user/.conda/envs/idf/lib/python3.7/site-packages/torch/tensor.py", line 185, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/user/.conda/envs/idf/lib/python3.7/site-packages/torch/autograd/__init__.py", line 127, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: svd_cpu: the updating process of SBDSDC did not converge (error: 23)

@didriknielsen (Owner) commented:

The issue is in the Conv1x1 module: the computation of torch.slogdet is much slower on GPU than on CPU.

Hi,

Thanks! This is gold. I tried it on my machine and also found a ~20% speedup from running torch.slogdet on the CPU.
I've added a slogdet_cpu argument to Conv1x1 with default True.
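Usage would then look something like this (a sketch; the num_channels name and import path are assumed from the library's examples, only slogdet_cpu is new):

    from survae.transforms import Conv1x1

    # slogdet_cpu=True (the new default) evaluates the log-determinant
    # on CPU; pass slogdet_cpu=False to keep the computation on the GPU.
    conv = Conv1x1(num_channels=96, slogdet_cpu=True)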

@didriknielsen (Owner) commented Sep 10, 2020

Related to this issue, if you try a large network (e.g. the Glow architecture for CIFAR-10), then you may encounter an error in the middle of training which says: [...]

The CIFAR-10 example uses the default scale_fn=lambda s: torch.exp(s) in the AffineCouplingBijection.
This choice can lead to instability during longer training since the scales output by the coupling networks can become very large.

I would suggest using something like

  • scale_fn=lambda s: torch.exp(2. * torch.tanh(s / 2.)) or
  • scale_fn=lambda s: torch.sigmoid(s+2.)+1e-3

instead, which keep the scales bounded.

The first choice is what we used in our image experiments; the second is what was used in the Glow code.
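For concreteness, a minimal sketch of passing a bounded scale_fn (the toy coupling net and channel sizes are made up for illustration; AffineCouplingBijection and its scale_fn argument are as referenced above):

    import torch
    import torch.nn as nn
    from survae.transforms import AffineCouplingBijection

    # exp(2*tanh(s/2)) is bounded in (e^-2, e^2), unlike the default exp(s).
    bounded_scale = lambda s: torch.exp(2. * torch.tanh(s / 2.))

    # Toy coupling net for a 4-channel input: it reads one half (2 channels)
    # and predicts shift and unconstrained scale for the other half (2 + 2).
    net = nn.Conv2d(2, 4, kernel_size=3, padding=1)
    coupling = AffineCouplingBijection(net, scale_fn=bounded_scale)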
