Question: weird valid loss when re-scaling y #1013
First of all, I must say this project has been a fundamental part of my master's thesis, so thank you very much for that.

In the last couple of days, I've been trying to understand and fix an issue but had no success. I've managed to reduce my code to a minimal example, so I hope it's easy to understand:

```python
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
from skorch.regressor import NeuralNetRegressor
from torch.optim import Adam
from torchmetrics.regression import MeanAbsolutePercentageError


class Module(nn.Module):
    def __init__(self, input_dimensions, dropout_rate=0):
        super(Module, self).__init__()
        self.module = nn.Sequential(
            nn.Linear(input_dimensions, 256),
            nn.ReLU(),
            nn.Dropout(p=dropout_rate),
            nn.Linear(256, 128),
            nn.ReLU(),
            nn.Dropout(p=dropout_rate),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Dropout(p=dropout_rate),
            nn.Linear(64, 1),
        )

    def forward(self, X):
        if X.dtype != torch.float32:
            X = X.to(torch.float32)
        X = self.module(X)
        return X


class NeuralNet(NeuralNetRegressor):
    def fit(self, X, y):
        # Sets the input dimensions of the module according to current X
        self.set_params(module__input_dimensions=X.shape[1])
        # Check if X is a Pandas DataFrame and convert it to a numpy array
        if hasattr(X, "to_numpy"):
            X = X.to_numpy()
        # Check if y is a Pandas Series and convert it to a numpy array
        if hasattr(y, "to_numpy"):
            y = y.to_numpy()
        if X.dtype != np.float32:
            X = X.astype(np.float32)
        if y.dtype != np.float32:
            y = y.astype(np.float32)
        # Reshape y to 2D if it is 1D
        # From https://github.com/skorch-dev/skorch/issues/701#issuecomment-700943377
        if y.ndim == 1:
            y = y.reshape(-1, 1)
        return super().fit(X, y)


df = pd.read_json("08-final-dataset.json.gz")

regressor = NeuralNet(
    module=Module,
    criterion=MeanAbsolutePercentageError,
    optimizer=Adam,
    lr=0.001,  # learning rate
    max_epochs=20,
    verbose=1,
)

X = df[["area", "rooms", "bathrooms"]]
y = df["price"]

# Rescale y to [0, 1]
y = (y - y.min(axis=0)) / (y.max(axis=0) - y.min(axis=0))

regressor.fit(X, y)
```

The output below is the result of running the above code with the line that rescales `y` commented out:
However, when I uncomment that line, train loss is still okay, but the valid loss range is way higher:
I've tried PyTorch Forecasting's MAPE loss, with similar results. As a consequence, I can't use early stopping, for example, because the valid loss is simply untrustworthy.

So my question is: do you have any idea of any internal process that could be causing this? I've tried to look at the code, but I wasn't able to find anything.
Replies: 1 comment 3 replies
Happy to hear that, thanks.

I converted the issue into a discussion, I hope you don't mind.

Regarding your problem, I could reproduce it with a synthetic dataset. My first thought was that by scaling `y`, we change the order of magnitude of the loss and thus need to adjust the learning rate to prevent overfitting. But after experimenting a bit with this, I no longer believe that this is the problem (or at least not the whole problem).

Interestingly, when I used the default loss (MSE), there was no such weird behavior. One reason could be the normalization step in MAPE, which divides by …

In general, I don't think that MAPE is a good criterion; I would prefer something more stable like MSE. You can still calculate the loss as an additional score (using the …
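The point about MAPE's normalization step can be illustrated with a small NumPy sketch. It assumes the standard definition of MAPE (mean of `|pred - target| / |target|`) and a hypothetical guard constant `eps` against division by zero; the prices are made up, and the helper `mape` is written here for illustration rather than taken from skorch or torchmetrics. Min-max scaling maps the smallest target to exactly 0, so a single sample can dominate the averaged loss.

```python
import numpy as np


def mape(pred, target, eps=1e-6):
    """Standard MAPE with the denominator clamped at eps.

    eps is an assumed guard value; real libraries may use a different one.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    denom = np.maximum(np.abs(target), eps)
    return float(np.mean(np.abs(pred - target) / denom))


# Hypothetical prices, not taken from the dataset in the question
prices = np.array([100_000.0, 150_000.0, 200_000.0, 400_000.0])

# Same min-max rescaling as in the question: the minimum becomes exactly 0
scaled = (prices - prices.min()) / (prices.max() - prices.min())

# Raw scale: every prediction is 10% too high, so MAPE is about 0.1
mape_raw = mape(prices * 1.1, prices)

# Scaled: every prediction is off by only 0.01 in absolute terms, yet the
# sample whose target is 0 is divided by eps and dominates the mean
mape_scaled = mape(scaled + 0.01, scaled)

print(f"raw MAPE:    {mape_raw:.4f}")
print(f"scaled MAPE: {mape_scaled:.4f}")
```

If the sample with the zero target happens to fall into the validation split, this effect alone could inflate the valid loss while leaving the train loss looking normal. That is only one possible contributor, not a confirmed diagnosis of the behavior in the question.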