Hi, I get a shape error when I try to make predictions on new, unseen data that lacks the target variable the model(s) were trained on, so I substitute placeholder values for the missing target. However, when I use placeholders such as np.zeros, previous values, or averages, my prediction error goes from <1% to over 8% :(
# Old data
X_train, X_test, y_train, y_test = train_test_split(X, y)
def tell_me_about_the_data(*arrays):
.....
tell_me_about_the_data(X_train, X_test, y_train, y_test)
# these datasets are np.arrays of shape (1999, 42). here we will be training with 42 features
# import, train, fine tune and fit the model(s)
........
# New data
tell_me_about_the_data(new_data)
# this dataset is an np.array of shape (30, 41), i.e. only 41 features, while the best models were trained and fit on 42 features. A single forward-pass prediction on a dataset whose target variable is unknown or missing therefore raises a shape error
make_predictions = model.predict(new_data)
# ValueError (or whatever error corresponds to a shape mismatch): got shape (x, 41) but the model expected shape (x, 42)
# Using placeholders for the target_variable feature to fix the shape error creates poor predictions and reduces accuracy by >= 8%
new_data['target_variable'] = np.zeros(len(new_data))  # or the average of old_data, or some other filler
make_predictions = model.predict(new_data)
# MSE = 25%
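# .to_numpy() drops the old index so the 30 values are assigned by position rather than by index label
last_known_target_features = old_data['target_variable'].tail(30).to_numpy()
new_data['target_variable'] = last_known_target_features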
make_predictions = model.predict(new_data)
# MSE = 8%
# The original models' MSE on the held-out y_test set is < 1%. I want to get close to that < 1% error on the new data as well
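One possible reading of the shapes above is that the 42 training columns include the target itself. If so, any placeholder just feeds the model a wrong value for its most informative input. Below is a minimal sketch of the alternative, assuming that reading is correct: keep the target out of X, so the model is fit on 41 columns and new data of shape (30, 41) can go straight to predict. The data is synthetic, and the least-squares fit, `add_intercept`, and `predict` are hypothetical stand-ins for the real models, not the code from this issue.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the old data: 41 feature columns plus a separate target.
n_rows, n_features = 1999, 41
features = rng.normal(size=(n_rows, n_features))
true_weights = rng.normal(size=n_features)
target = features @ true_weights + rng.normal(scale=0.1, size=n_rows)

# Keep the target out of X, so X has 41 columns, the same as the new data.
X, y = features, target

# Simple random split (a stand-in for train_test_split).
order = rng.permutation(n_rows)
test_size = n_rows // 4
test_idx, train_idx = order[:test_size], order[test_size:]
X_train, X_test = X[train_idx], X[test_idx]
y_train, y_test = y[train_idx], y[test_idx]


def add_intercept(a):
    """Prepend a column of ones so the fit includes an intercept term."""
    return np.column_stack([np.ones(len(a)), a])


# Ordinary least squares as a placeholder for the real model(s).
coef, *_ = np.linalg.lstsq(add_intercept(X_train), y_train, rcond=None)


def predict(a):
    """Predict the target from a (n, 41) feature array."""
    return add_intercept(a) @ coef


test_mse = float(np.mean((predict(X_test) - y_test) ** 2))

# New data arrives without the target: shape (30, 41), no placeholder column needed.
new_data = rng.normal(size=(30, n_features))
new_predictions = predict(new_data)
print(X_train.shape, new_data.shape, new_predictions.shape)
```

If the target really is among the training features, the <1% test error would largely reflect the model reading the answer from its input. The error after retraining on the 41 features alone would then be the more realistic baseline to compare against.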