Gradient computation fails while training 'HRNet_FeatureExtractor' due to an inplace operation #308

Open
mohammadalihumayun opened this issue Aug 6, 2024 · 0 comments

mohammadalihumayun commented Aug 6, 2024

Using the latest torch version, when I try to train `HRNet_FeatureExtractor` from `modules.feature_extraction`, I get the following error:

one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [4, 512, 4, 50]], which is output 0 of ReluBackward0, is at version 1; expected version 0 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later.
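For readers unfamiliar with this message, here is a minimal sketch of how this class of error can arise. It is not taken from `HRNet_FeatureExtractor`; the tensor shapes and function names are made up for illustration. ReLU keeps its output for the backward pass, so editing that output in place afterwards invalidates it, while the out-of-place form leaves it untouched.

```python
import torch


def inplace_after_relu_fails() -> bool:
    """Return True if editing a ReLU output in place breaks backward()."""
    x = torch.randn(4, 8, requires_grad=True)
    y = torch.relu(x)  # ReLU saves its output for the backward pass
    y += 1             # in-place edit bumps the saved tensor's version counter
    try:
        y.sum().backward()
    except RuntimeError as err:
        # Same family of message as the one reported in this issue.
        return "inplace operation" in str(err)
    return False


def out_of_place_after_relu_works() -> bool:
    """Return True if the out-of-place variant backpropagates cleanly."""
    x = torch.randn(4, 8, requires_grad=True)
    y = torch.relu(x)
    y = y + 1          # allocates a new tensor; the ReLU output is untouched
    y.sum().backward()
    return x.grad is not None
```

If the feature extractor contains a similar pattern (an in-place add or an in-place activation applied to a ReLU output), rewriting that step out of place is the usual remedy.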

The following model uses the feature extractor.
Please note that the dataset used as input is a list of tuples, each containing images as NumPy arrays and labels as text strings.
```python
import torch.nn as nn

from modules.feature_extraction import HRNet_FeatureExtractor
# BidirectionalLSTM, num_class and device are defined elsewhere in the training script.


class new_Model(nn.Module):
    def __init__(self, input_channel=3,
                 output_channel=32,
                 FeatureExtraction='HRNet',
                 SequenceModeling='DBiLSTM',
                 Prediction='CTC',
                 batch_max_length=100,
                 hidden_size=256,
                 imgH=32,
                 imgW=400):
        super(new_Model, self).__init__()
        self.stages = {'Feat': FeatureExtraction,
                       'Seq': SequenceModeling,
                       'Pred': Prediction}

        self.FeatureExtraction = HRNet_FeatureExtractor(input_channel, output_channel)
        self.FeatureExtraction_output = output_channel
        self.AdaptiveAvgPool = nn.AdaptiveAvgPool2d((None, 1))  # Transform final (imgH/16-1) -> 1
        self.SequenceModeling_output = hidden_size
        self.SequenceModeling = nn.Sequential(
            BidirectionalLSTM(self.FeatureExtraction_output, hidden_size, hidden_size),
            BidirectionalLSTM(hidden_size, hidden_size, hidden_size))
        self.Prediction = nn.Linear(self.SequenceModeling_output, num_class)

    def forward(self, input, text=None, is_train=True):
        visual_feature = self.FeatureExtraction(input)
        visual_feature = self.AdaptiveAvgPool(visual_feature.permute(0, 3, 1, 2))  # [b, c, h, w] -> [b, w, c, h]
        visual_feature = visual_feature.squeeze(3)
        contextual_feature = self.SequenceModeling(visual_feature)
        prediction = self.Prediction(contextual_feature.contiguous())
        return prediction


model = new_Model()
model = model.to(device)
model.train()
```

However, please note that the same code runs fine when I use another feature extractor, e.g. just by replacing

```python
self.FeatureExtraction = HRNet_FeatureExtractor(input_channel, output_channel)
```

with

```python
self.FeatureExtraction = DenseNet_FeatureExtractor(input_channel, output_channel)
```

within the model.
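One possible workaround, sketched under an unverified assumption: if `HRNet_FeatureExtractor` builds its activations with an `inplace=True` flag (as `nn.ReLU(inplace=True)` would), switching those flags off before training may avoid the error. The helper name `disable_inplace` is hypothetical and not part of this repository. It relies only on the `modules()` iterator that every `nn.Module` provides.

```python
def disable_inplace(model) -> int:
    """Set inplace=False on every submodule that exposes an `inplace` flag.

    `model` is expected to be a torch.nn.Module; only its .modules()
    iterator is used. Returns the number of flags that were switched off.
    """
    changed = 0
    for module in model.modules():
        # Modules such as nn.ReLU or nn.Dropout carry a boolean `inplace` attribute.
        if getattr(module, "inplace", False) is True:
            module.inplace = False
            changed += 1
    return changed
```

Calling `disable_inplace(model.FeatureExtraction)` after constructing the model would apply it to the feature extractor only. This does not help if the in-place edit comes from an operator such as `+=` or from a functional call inside `forward`; in that case the offending line itself has to be rewritten out of place.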
