Effectively, StackingEstimator takes the predictions of its inner model and appends them to the left of the input data X. If the estimator is a classifier with predict_proba, all class probabilities are included as well. With a binary target, that means two additional columns, one for each class.
So in your case trans_x_t is [model 1 predicted labels, model 1 probability for class 0, model 1 probability for class 1, <original X>]
and similarly
trans_x_t1 is [model 2 predicted labels, model 2 probability for class 0, model 2 probability for class 1, <trans_x_t>]
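That column layout can be sketched with plain scikit-learn, using LogisticRegression as a stand-in for the stacked model (a minimal illustration of the hstack layout described above, not TPOT's actual implementation):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data: 10 samples, 4 original features, binary target.
rng = np.random.RandomState(0)
X = rng.rand(10, 4)
y = np.array([0, 1] * 5)

model = LogisticRegression().fit(X, y)

# Replicate the column layout described above:
# [predicted labels, class-0 probability, class-1 probability, original X]
trans_X = np.hstack([
    model.predict(X).reshape(-1, 1),  # 1 column of predicted labels
    model.predict_proba(X),           # 2 columns, one per class
    X,                                # the original 4 features, unchanged
])
print(trans_X.shape)  # (10, 7): 4 original features + 3 synthetic columns
```

With 581 original features this is exactly the 581 -> 584 jump in the first step, and stacking a second estimator on top repeats it for 584 -> 587.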
Initial number of features is 581, but the feature importances of the final pipeline have 587 features.
It looks like at the first 2 of the 3 steps of the pipeline, the number of features increased: 581 -> 584 -> 587.
Is there a way to map the 587 features at the end of the pipeline back to the original 581 features?
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline, make_union
from tpot.builtins import StackingEstimator
from xgboost import XGBClassifier
exported_pipeline = make_pipeline(
StackingEstimator(estimator=XGBClassifier(learning_rate=0.01, max_depth=4, min_child_weight=6, n_estimators=100, n_jobs=1, subsample=0.15000000000000002, verbosity=0)),
StackingEstimator(estimator=GaussianNB()),
XGBClassifier(learning_rate=0.5, max_depth=2, min_child_weight=20, n_estimators=100, n_jobs=1, subsample=0.9000000000000001, verbosity=0)
)
exported_pipeline.fit(x_v, y_v)
trans_x_t = exported_pipeline[0].transform(x_t)
trans_x_t1 = exported_pipeline[1].transform(trans_x_t)
print(x_t.shape)
(677279, 581)
print(trans_x_t.shape)
(677279, 584)
print(trans_x_t1.shape)
(677279, 587)
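As an aside, the two manual transform calls above can be collapsed by slicing the pipeline, which scikit-learn supports since 0.21; a sketch with ordinary transformers standing in for the StackingEstimator steps:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X, y = make_classification(n_samples=50, n_features=5, random_state=0)
pipe = make_pipeline(StandardScaler(), MinMaxScaler(), LogisticRegression())
pipe.fit(X, y)

# Slicing a fitted Pipeline yields a fitted sub-pipeline, so this one call...
one_call = pipe[:-1].transform(X)
# ...equals applying each step by hand, as in the snippet above.
step0 = pipe[0].transform(X)
step1 = pipe[1].transform(step0)
assert (one_call == step1).all()
```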
exported_pipeline[-1].feature_importances_.shape
(587,)
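To the mapping question: each StackingEstimator step prepends 3 synthetic columns here (1 predicted-label column plus 2 class-probability columns for a binary target), and the most recent step's columns sit leftmost, so the last 581 of the 587 columns are the original features in their original order. A sketch of the index mapping (the step*/orig_* names are made up for illustration):

```python
n_original = 581   # feature count going into the pipeline
n_steps = 2        # two StackingEstimator steps before the final classifier
cols_per_step = 3  # 1 predicted label + 2 class probabilities (binary target)

# Build a name for each of the 587 columns the final classifier sees.
# Synthetic columns are prepended, so the latest step's columns come first.
names = []
for step in (2, 1):  # step 2's columns sit leftmost
    names += [f"step{step}_pred", f"step{step}_proba_0", f"step{step}_proba_1"]
names += [f"orig_{i}" for i in range(n_original)]  # original features, in order

assert len(names) == n_original + n_steps * cols_per_step  # 587
# importances = exported_pipeline[-1].feature_importances_
# dict(zip(names, importances)) would then label each of the 587 importances
# as either a synthetic column or one of the original 581 features.
```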