Fix problem with gradients accumulating #3

Open · wants to merge 1 commit into base: master

Conversation

AlekseySh

The problem arises because gradients accumulate in the layers with each new transform.
Imagine the original model works well with batch_size = 128 (for example) and the GPU is fully loaded; the wrapped model with several transforms will then crash at the same batch_size = 128 and will only work at a batch size of roughly 40.
To fix this problem I added torch.no_grad().

PS. If you know about this problem and assume that this code should be added in the outer (calling) function, then you need to change the example snippet in the README, because it doesn't include no_grad(). But in my opinion, it would be better to leave this code here, inside forward.
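
For illustration, a minimal sketch of the proposed change, assuming a wrapper with hypothetical augment()/deaugment() transform methods (not the library's actual class); the key point is that every augmented forward pass runs under torch.no_grad(), so no computation graph is kept across transforms:

```python
import torch
import torch.nn as nn


class TTAWrapperSketch(nn.Module):
    # Hypothetical wrapper, not the library's actual class.
    def __init__(self, model, transforms):
        super().__init__()
        self.model = model
        self.transforms = transforms  # assumed: objects with augment()/deaugment()

    def forward(self, x):
        with torch.no_grad():  # the change proposed in this PR
            outputs = []
            for t in self.transforms:
                augmented = t.augment(x)                 # apply the augmentation
                prediction = self.model(augmented)       # forward pass, no graph retained
                outputs.append(t.deaugment(prediction))  # undo the augmentation
            return torch.stack(outputs).mean(dim=0)      # merge the predictions
```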

fix problem with gradients accumulating
@qubvel
Owner

qubvel commented Nov 14, 2019

Hi, thanks for the proposal.
In my opinion, even though it is called a "test" time augmentation wrapper, it can still be used during training. So it is better to add a note about no_grad() instead.
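
Such a README note could show something like the following caller-side sketch (the model and batch below are placeholders, not the library's example):

```python
import torch
import torch.nn as nn

model = nn.Conv2d(3, 1, kernel_size=3, padding=1)  # placeholder for the wrapped TTA model
batch = torch.randn(8, 3, 64, 64)                   # placeholder input batch

model.eval()
with torch.no_grad():  # disable gradient tracking for inference only
    predictions = model(batch)
```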
