Is your feature request related to a problem? Please describe.
The value of n_splits used in the crossfit determines how many observations are covered when calculating SHAP values. With low coverage, the consolidated SHAP matrix has fewer rows than there are observations.
Describe the solution you'd like
The ideal solution has a few elements:
A warning should appear when the number of splits is low, together with a message stating the coverage of observations used for the SHAP value calculation.
The inspector should produce all inputs required by the existing SHAP plotting functions. It should automatically create a sample containing only the observations that have been explained, so that the sample is aligned with the SHAP outputs.
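A rough sketch of the two elements above, to make the request concrete. It is not the inspector's actual API: the function name `explained_sample`, the `min_coverage` threshold, and the idea of passing in the per-split test indices are all assumptions made for illustration.

```python
import warnings

import numpy as np


def explained_sample(X, test_indices_per_split, min_coverage=0.9):
    """Hypothetical helper: keep only the rows of X that were explained.

    X is a 2-D NumPy array of observations. test_indices_per_split is an
    iterable of integer index arrays, one per crossfit split, assumed to
    be the rows for which SHAP values were calculated.
    """
    # Union of all rows that appear in at least one test split, sorted
    explained = np.unique(np.concatenate(list(test_indices_per_split)))
    coverage = len(explained) / len(X)

    # Element 1: warn when coverage is low and report the coverage
    if coverage < min_coverage:
        warnings.warn(
            f"SHAP values cover only {coverage:.0%} of observations; "
            "consider increasing n_splits.",
            UserWarning,
        )

    # Element 2: a sample aligned row-for-row with the SHAP matrix
    return X[explained], explained, coverage
```

The returned sample and index array could then be handed to the existing SHAP plotting functions together with the consolidated SHAP matrix. For a pandas data frame the row selection would use `.iloc` instead of plain indexing.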
Describe alternatives you've considered
None - the above solution is the minimum requirement.
Additional context
As an example, using 500 simulated data points, in the extreme case of n_splits = 1 the SHAP analysis covers only 40% of observations.
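The effect can be reproduced with a small simulation. The sketch below assumes that each split draws a random 40% of the rows as its test set, which is a guess chosen to be consistent with the 40% figure above and not a confirmed detail of the crossfit. The function name `shap_coverage` is hypothetical.

```python
import numpy as np


def shap_coverage(n_obs, n_splits, test_fraction=0.4, seed=0):
    """Fraction of rows that land in at least one test split.

    Assumption: every split is an independent random shuffle that holds
    out test_fraction of the rows, and SHAP values are only calculated
    for held-out rows.
    """
    rng = np.random.default_rng(seed)
    n_test = int(round(test_fraction * n_obs))
    seen = np.zeros(n_obs, dtype=bool)
    for _ in range(n_splits):
        # Draw the test rows for this split without replacement
        test_idx = rng.permutation(n_obs)[:n_test]
        seen[test_idx] = True
    return seen.mean()


for n_splits in (1, 5, 20):
    print(n_splits, f"{shap_coverage(500, n_splits):.0%}")
```

With a single split the coverage equals the test fraction by construction, and it rises towards 100% as n_splits increases.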