
All MAD predictions should be positive. #50

Closed
JulianHidalgo opened this issue Mar 4, 2024 · 5 comments · Fixed by #53
Assignees: M-Mouhcine
Labels: bug (Something isn't working)

Comments

JulianHidalgo commented Mar 4, 2024

Hi!

Thank you for creating Puncc. I'm trying to use LocallyAdaptiveCP as described here: https://deel-ai.github.io/puncc/regression.html#deel.puncc.regression.LocallyAdaptiveCP

import xgboost as xgb

from deel.puncc.api.prediction import MeanVarPredictor
from deel.puncc.regression import LocallyAdaptiveCP

mu_model = xgb.XGBRegressor()
sigma_model = xgb.XGBRegressor()
# Wrap models in a mean/variance predictor
mean_var_predictor = MeanVarPredictor(models=[mu_model, sigma_model])
cp = LocallyAdaptiveCP(mean_var_predictor)
cp.fit(X_fit=X_train, y_fit=y_train, X_calib=X_test, y_calib=y_test)

But I get an error: "All MAD predictions should be positive." Any idea what I'm missing?
I think the error comes from this check in the nonconformity score:

mean_absolute_deviation = absolute_difference(y_pred, y_true)
if np.any(sigma_pred < 0):
    raise RuntimeError("All MAD predictions should be positive.")
return mean_absolute_deviation / (sigma_pred + EPSILON)

But I don't know how to avoid it. Any pointers would be greatly appreciated!

M-Mouhcine (Collaborator) commented Mar 5, 2024

Hi @JulianHidalgo !

Thanks for opening this issue. I could indeed reproduce the error when using xgboost models with LocallyAdaptiveCP.

Actually, sigma_model is trained to predict the absolute residual $|y-\mu(X)|$, where $\mu$ is the trained mu_model and $X$ and $y$ are respectively a feature vector and its associated target. The output of sigma_model should be positive; otherwise it breaks the conformal prediction algorithm. In your case, however, some of these predictions are negative, which is not allowed.
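To see why this can happen even though the training targets are nonnegative: a gradient-boosted regressor's output is an unconstrained sum of tree leaf values, so nothing forces it to stay above zero. Here is a minimal self-contained sketch (synthetic data, not your setup):

import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y_abs = np.abs(rng.normal(size=500))  # nonnegative training target

model = xgb.XGBRegressor(n_estimators=100)
model.fit(X, y_abs)

# The fitted ensemble is free to overshoot below zero on some inputs.
print("min prediction:", model.predict(X).min())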

I've noticed that this behavior happens when the number of estimators n_estimators of the xgboost model is high (by default, it is 100). I've tried lower values, for example 5 or 10, and it works fine:

mu_model = xgb.XGBRegressor()
sigma_model = xgb.XGBRegressor(n_estimators=5)
# Wrap models in a mean/variance predictor
mean_var_predictor = MeanVarPredictor(
    models=[mu_model, sigma_model]
)
cp = LocallyAdaptiveCP(mean_var_predictor)
cp.fit(X_fit=X_train, y_fit=y_train, X_calib=X_test, y_calib=y_test)

Can you see if that works for you?

PS: we will look into a suitable solution to "correct" models that predict negative values. We could simply take the absolute value of sigma_model predictions, but we will explore more options and pick the least problematic.
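In the meantime, a possible stopgap on your side, in the spirit of the absolute-value idea, is to wrap the dispersion model so its predictions are clamped before puncc sees them. This wrapper is only a hypothetical sketch, not part of puncc, and it assumes MeanVarPredictor only needs fit/predict on the wrapped object:

import numpy as np
import xgboost as xgb

class AbsolutePredictor:
    # Hypothetical wrapper: forwards fit, returns |predictions|.
    def __init__(self, model):
        self.model = model

    def fit(self, X, y, **kwargs):
        self.model.fit(X, y, **kwargs)
        return self

    def predict(self, X, **kwargs):
        return np.abs(self.model.predict(X, **kwargs))

sigma_model = AbsolutePredictor(xgb.XGBRegressor())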

M-Mouhcine self-assigned this Mar 5, 2024
M-Mouhcine added the bug (Something isn't working) label Mar 5, 2024
JulianHidalgo (Author) commented

Thanks for checking this out! Reducing the number of estimators helps, but it also decreases the accuracy of the model, and it's not reliable: the same number of estimators works fine on one dataset and fails on another. I noticed LightGBM sometimes generates negative values too, but less often than XGBoost. At least now I know it's not something in the way I'm using the library or my datasets in particular. I will be watching the issue, thank you again!

jdalch (Collaborator) commented Mar 7, 2024

Hello @JulianHidalgo, thanks again for using PUNCC and for raising this issue! After discussing it with the team, we have decided to take the following steps to fix it:

  1. Increase the value of the threshold EPSILON in the scaled_ad nonconformity score.
  2. Add the threshold EPSILON to the scaled_interval prediction set.
  3. Modify the scaled_ad nonconformity score: compute residuals only for calibration points such that sigma + EPSILON > 0, and raise a warning that some calibration data is not used (see the sketch at the end of this comment).
  4. Modify the scaled_interval prediction set: return an infinite-sized prediction set if sigma + EPSILON <= 0, and raise a warning.

We hope this fixes the issue for you. We expect negative values of sigma to be rare, so the procedure should not have a big impact on the size of the prediction sets. Of course, the probabilistic guarantees given by conformal prediction will remain valid after this modification.
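Roughly, the guarded score and interval would look like the sketch below; the function signatures, the quantile argument, and the EPSILON value are illustrative placeholders rather than the final implementation:

import warnings
import numpy as np

EPSILON = 1e-12  # placeholder value; step 1 increases this threshold

def scaled_ad(y_pred, sigma_pred, y_true):
    # Step 3: keep only calibration points with sigma + EPSILON > 0
    # and warn that the rest are discarded.
    mean_absolute_deviation = np.abs(y_true - y_pred)
    valid = sigma_pred + EPSILON > 0
    if not np.all(valid):
        warnings.warn(
            f"{np.count_nonzero(~valid)} calibration points with "
            "nonpositive sigma predictions were discarded."
        )
    return mean_absolute_deviation[valid] / (sigma_pred[valid] + EPSILON)

def scaled_interval(y_pred, sigma_pred, quantile):
    # Steps 2 and 4: half-widths use sigma + EPSILON; where it is still
    # nonpositive, fall back to an infinite prediction set and warn.
    half_width = quantile * (sigma_pred + EPSILON)
    invalid = sigma_pred + EPSILON <= 0
    if np.any(invalid):
        warnings.warn("Returning infinite prediction sets where sigma + EPSILON <= 0.")
        half_width = np.where(invalid, np.inf, half_width)
    return y_pred - half_width, y_pred + half_width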

JulianHidalgo (Author) commented

Hey @jdalch!
Thank you so much to you and the team for designing a solution 😊.

jdalch linked a pull request Mar 14, 2024 that will close this issue
M-Mouhcine (Collaborator) commented

Hey @JulianHidalgo,

@jdalch has implemented his solution to address the problem. Could you please test it and let us know if it works?
