fit vs transform

Tansu Dasli edited this page Sep 22, 2023 · 4 revisions

There are 3 critical distinctions for these concepts:

  1. transformer (preprocessing) vs estimator (model) phase
  2. pipeline (fit, predict) vs normal usage (fit, transform and fit_transform)
  3. train vs test data

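So, with test data, there is no fitting at the transformer step: no new parameters are calculated! The test data is transformed with the parameters learned from the training data, which keeps test information from leaking into preprocessing and gives a more honest check for overfitting.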

  • If pipeline.fit(X_train, ...) is used:
    • at the preprocessing step, fit and transform are applied,
    • at the estimator step, only fit is applied.
  • In pipeline.predict(X_test, ...), only transform (preprocessing step) and predict (estimator step) are applied.
  data                   |  transformer (preprocessing)              |  model (estimator)
  -----------------------|-------------------------------------------|-------------------------
  train                  |  fit (calculates params), then transform  |  fit (trains the model)
  test (predict phase)   |  transform                                |  predict
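
The table can be sketched by hand with a single transformer. This is only an illustration: the toy arrays and the choice of StandardScaler are assumptions, not something the page prescribes. It shows that the parameters come from the training data and are reused unchanged on the test data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data, made up for illustration only.
X_train = np.array([[1.0], [2.0], [3.0]])
X_test = np.array([[10.0], [20.0]])

scaler = StandardScaler()
scaler.fit(X_train)                          # train: calculates mean_ and scale_
X_train_scaled = scaler.transform(X_train)   # train: transform with those params
X_test_scaled = scaler.transform(X_test)     # test: transform only, no new fit

# fit_transform is a shortcut for fit followed by transform on the same data.
X_train_scaled_2 = StandardScaler().fit_transform(X_train)

print(scaler.mean_)      # the mean learned from X_train
print(X_test_scaled)     # X_test scaled with the training mean and scale
```

Because the scaler never sees X_test during fit, the scaled test values are not centred on zero; they are expressed relative to the training distribution, which is the point.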

Go with pipelines!

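p.fit(X_train, y_train)    # preprocessing: fit + transform, estimator: fit
p.predict(X_test)          # preprocessing: transform only, estimator: predict
p.score(X_test, y_test)    # same as predict, then scores against y_test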
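
For a version of these calls that runs end to end, here is a minimal sketch. The iris dataset, the step names "scaler" and "model", and the StandardScaler plus LogisticRegression combination are assumptions chosen for illustration; any transformer and estimator pair would follow the same pattern.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Example data (assumption): the iris dataset bundled with scikit-learn.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Step names "scaler" and "model" are arbitrary labels picked for this sketch.
p = Pipeline([
    ("scaler", StandardScaler()),                  # transformer (preprocessing)
    ("model", LogisticRegression(max_iter=1000)),  # estimator (model)
])

p.fit(X_train, y_train)        # scaler: fit + transform, model: fit
y_pred = p.predict(X_test)     # scaler: transform only, model: predict
acc = p.score(X_test, y_test)  # mean accuracy for a classifier

# The same prediction done by hand, reusing the already fitted steps.
manual_pred = p.named_steps["model"].predict(
    p.named_steps["scaler"].transform(X_test)
)
print(acc, (y_pred == manual_pred).all())
```

The manual version at the end only calls transform on the fitted scaler, so it matches what p.predict does internally.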