I am trying to optimize a PatchTST model for multivariate forecasting on a dataset with ~27,000 samples and 8 columns. The splits are train=70%, valid=20%, and test=10%. I implemented the tutorial notebook on PatchTST with my data and used the same scaling method as in the notebook. The result is that, no matter what I do, there is almost always a gap between training and validation loss (overfitting). So I tried the sklearn StandardScaler, and the overfitting was reduced considerably; however, the training MAE now stands at ~4.019 while the validation MAE stands at ~4.703. Comparatively speaking, using TSStandardScaler leads to an MAE of ~0.57 and ~0.67 for training and validation respectively. To close this gap, I tried dropout, decreasing or increasing the complexity of the model architecture, etc., but with no luck. So, my question is: how can I deal with this?
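For context, a minimal sketch (with made-up numbers, not my real data) of why the MAE under the two scalers is not directly comparable: the loss is computed in scaled units, so an MAE from one scaling can only be compared to another after mapping both back to original units with the scaler's std.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical targets and forecasts in original units (std ~ 7)
y_true = rng.normal(loc=50.0, scale=7.0, size=1000)
y_pred = y_true + rng.normal(scale=2.0, size=1000)  # imperfect forecasts

mu, sigma = y_true.mean(), y_true.std()

# MAE in original units vs. MAE after standard scaling
mae_original = np.abs(y_true - y_pred).mean()
mae_scaled = np.abs((y_true - mu) / sigma - (y_pred - mu) / sigma).mean()

# The scaled MAE is exactly the original MAE divided by the scaling std,
# so losses reported under different scalers live on different scales.
print(mae_original, mae_scaled, mae_scaled * sigma)
```

So the jump from ~0.57 to ~4.019 MAE between scalers does not by itself mean the model got worse; the units changed.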
Note: After getting the splits, I fitted the sklearn StandardScaler on the training data only and then transformed both the validation and test datasets. Also, the optimal learning rate was found using lr_find.
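To make the setup concrete, here is a sketch of the split-then-scale procedure described above, using a placeholder array in place of my real dataset (the manual mean/std is equivalent to sklearn's StandardScaler fit/transform):

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(size=(27000, 8))  # placeholder for the real dataset

# Chronological 70/20/10 split (no shuffling for time series)
n = len(data)
n_train, n_valid = int(0.7 * n), int(0.2 * n)
train = data[:n_train]
valid = data[n_train:n_train + n_valid]
test = data[n_train + n_valid:]

# Fit the scaler statistics on the training split only, then apply
# the same statistics to validation and test to avoid leakage
mu = train.mean(axis=0)
sigma = train.std(axis=0)
train_s, valid_s, test_s = [(x - mu) / sigma for x in (train, valid, test)]
```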
Model Architecture:
PatchTST (Input shape: 256 x 8 x 192)
============================================================================
Layer (type) Output Shape Param # Trainable
============================================================================
256 x 8 x 2
RevIN 16 True
____________________________________________________________________________
256 x 8 x 196
ReplicationPad1d
____________________________________________________________________________
256 x 8 x 48
Unfold
____________________________________________________________________________
256 x 8 x 48 x 256
Linear 2304 True
Dropout
Linear 65792 True
Linear 65792 True
Linear 65792 True
Dropout
Linear 65792 True
Dropout
Dropout
____________________________________________________________________________
256 x 256 x 48
Transpose
BatchNorm1d 512 True
____________________________________________________________________________
256 x 48 x 256
Transpose
____________________________________________________________________________
256 x 48 x 132
Linear 33924 True
GELU
Dropout
____________________________________________________________________________
256 x 48 x 256
Linear 34048 True
Dropout
____________________________________________________________________________
256 x 256 x 48
Transpose
BatchNorm1d 512 True
____________________________________________________________________________
256 x 48 x 256
Transpose
Linear 65792 True
Linear 65792 True
Linear 65792 True
Dropout
Linear 65792 True
Dropout
Dropout
____________________________________________________________________________
256 x 256 x 48
Transpose
BatchNorm1d 512 True
____________________________________________________________________________
256 x 48 x 256
Transpose
____________________________________________________________________________
256 x 48 x 132
Linear 33924 True
GELU
Dropout
____________________________________________________________________________
256 x 48 x 256
Linear 34048 True
Dropout
____________________________________________________________________________
256 x 256 x 48
Transpose
BatchNorm1d 512 True
____________________________________________________________________________
256 x 48 x 256
Transpose
____________________________________________________________________________
256 x 8 x 12288
Flatten
____________________________________________________________________________
256 x 8 x 2
Linear 24578 True
____________________________________________________________________________
Total params: 691,226
Total trainable params: 691,226
Total non-trainable params: 0
Optimizer used: <function Adam at 0x7fce117e55a0>
Loss function: <function mae at 0x7fcdeccf8c10>
Callbacks:
- TrainEvalCallback
- CastToTensor
- Recorder
- ProgressCallback
- ShowGraph
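One regularizer I have not listed above is early stopping on the validation loss (fastai ships an EarlyStoppingCallback that can be passed via cbs; the sketch below is a framework-agnostic version of the same logic, with hypothetical names):

```python
class EarlyStopping:
    """Stop training when the validation loss has not improved
    by at least `min_delta` for `patience` consecutive epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.wait = 0

    def step(self, valid_loss):
        """Record one epoch's validation loss; return True to stop."""
        if valid_loss < self.best - self.min_delta:
            self.best = valid_loss
            self.wait = 0
        else:
            self.wait += 1
        return self.wait >= self.patience


# Usage sketch: feed the validation loss after each epoch
stopper = EarlyStopping(patience=3)
losses = [0.70, 0.67, 0.66, 0.68, 0.69, 0.71, 0.72]
for epoch, vl in enumerate(losses):
    if stopper.step(vl):
        break
```

With the losses above, training halts three epochs after the best validation loss (0.66), keeping the model from drifting further into the overfitting regime.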
Loss curves using TSStandardScaler: (image)
Loss curves using sklearn StandardScaler: (image)