
RuntimeError: einsum(): operands do not broadcast with remapped shapes [original->remapped] #73

Open
asdfzlcl opened this issue Dec 7, 2023 · 7 comments

Comments

@asdfzlcl

asdfzlcl commented Dec 7, 2023

While running experiments on my own dataset, the code raised the following error:
(screenshot: RuntimeError: einsum(): operands do not broadcast with remapped shapes)

The parameters I used are:
export CUDA_VISIBLE_DEVICES=1

for model in FEDformer
do

for preLen in 1
do

python -u run.py \
  --is_training 1 \
  --root_path ./dataCSV/ \
  --data_path niuyueu0.csv \
  --task_id niuyueu0 \
  --model $model \
  --data ETTh1 \
  --features M \
  --freq 'h' \
  --seq_len 12 \
  --label_len 12 \
  --pred_len $preLen \
  --e_layers 2 \
  --d_layers 1 \
  --factor 3 \
  --enc_in 14 \
  --dec_in 14 \
  --n_heads 15 \
  --c_out 14 \
  --des 'Exp' \
  --d_model 128 \
  --itr 3

done

done
The input has 14 columns including the date, and the output is a single variable. I would like to know whether one of my parameter settings is wrong, or what else might be causing the error.
Thanks in advance.

@yhy13344

I ran into the same problem. Is there a fix for it?

@tianzhou2011
Collaborator

The setting you are trying to run is the MS setting: using multivariate input to predict a univariate target. We never ran any MS experiments; that setting was put into the code early on as something we thought we might do, but the corresponding part was never written. If you want to run MS experiments, you will need to modify that part of the code yourself and change the output dimension of the MLP projection.
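As a rough illustration of the change described above (the names and shapes here are hypothetical, not the actual FEDformer code; numpy matmuls stand in for the model's `nn.Linear` projection): an MS run should project `d_model` down to `c_out = 1` instead of the full channel count.

```python
import numpy as np

# Hypothetical sketch of the final projection step (not the actual
# FEDformer code): an MS run should map d_model -> c_out = 1
# rather than d_model -> enc_in (the full channel count).
d_model, enc_in, c_out = 128, 14, 1

W_m = np.random.randn(d_model, enc_in)   # projection an M run would use
W_ms = np.random.randn(d_model, c_out)   # projection the MS run needs

x = np.random.randn(32, 96, d_model)     # (batch, pred_len, d_model)
print((x @ W_m).shape)                   # (32, 96, 14) -- all channels
print((x @ W_ms).shape)                  # (32, 96, 1)  -- target channel only
```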

@jannichorst

Maybe try setting n_heads = 8. All of my attempts to change that parameter to anything other than 8 failed.

@Hadar933

Hadar933 commented Jan 28, 2024

The problem lies in how the weights argument is defined: its first dimension is hard-coded to 8, regardless of the number of heads. One could set this value to self.n_heads, but that will not solve the problem completely, since later steps of the forward method hit another dimension incompatibility.
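A minimal reproduction of this kind of failure, with illustrative shapes (NumPy raises a ValueError here where PyTorch raises the RuntimeError in the title): an einsum whose weights tensor has a hard-coded head dimension of 8 cannot broadcast against queries built with n_heads = 15.

```python
import numpy as np

n_heads, seq_len, d_head = 15, 96, 64
weights = np.random.randn(8, d_head)                 # head dim hard-coded to 8
queries = np.random.randn(32, n_heads, seq_len, d_head)

try:
    np.einsum('hd,bhld->bhl', weights, queries)      # 8 vs 15: cannot broadcast
except ValueError as e:
    print('broadcast error:', e)

# Sizing the weights by n_heads lets the contraction go through.
out = np.einsum('hd,bhld->bhl', np.random.randn(n_heads, d_head), queries)
print(out.shape)                                     # (32, 15, 96)
```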

@efg001

efg001 commented Jul 7, 2024

@tianzhou2011 Are you sure code changes are needed?
The encoder, decoder, and model output sizes are all configurable.

The following script works for me for the MS setting.
I think Informer did something similar, and FEDformer shares a similar architecture.
https://github.com/zhouhaoyi/Informer2020/blob/0ac81e04d4095ecb97a3a78c7b49c936d8aa9933/main_informer.py#L84

Let me know if I am missing something here : )

export CUDA_VISIBLE_DEVICES=0

#cd ..

for model in FEDformer Autoformer Informer Transformer
do

for preLen in 96
do

# ETTh1
python -u run.py \
  --is_training 1 \
  --root_path ./dataset/ETT-small/ \
  --data_path ETTh1.csv \
  --task_id ETTh1 \
  --model $model \
  --data ETTh1 \
  --features MS \
  --seq_len 96 \
  --label_len 48 \
  --pred_len $preLen \
  --e_layers 2 \
  --d_layers 1 \
  --factor 3 \
  --enc_in 7 \
  --dec_in 7 \
  --c_out 1 \
  --des 'Exp' \
  --d_model 512 \
  --patience 11 \
  --itr 1 &

done

done

>>>>>>>testing : ETTh1_Autoformer_random_modes64_ETTh1_ftMS_sl96_ll48_pl96_dm512_nh8_el2_dl1_df2048_fc3_ebtimeF_dtTrue_Exp_0<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
test 2785
Epoch: 9, Steps: 264 | Train Loss: 0.5382350 Vali Loss: 1.0745947 Test Loss: 1.9351215
EarlyStopping counter: 8 out of 11
Updating learning rate to 3.90625e-07
test shape: (87, 32, 96, 7) (87, 32, 96, 1)
test shape: (2784, 96, 7) (2784, 96, 1)
mse:2.252048969268799, mae:1.3310285806655884
>>>>>>>testing : ETTh1_FEDformer_random_modes64_ETTh1_ftMS_sl96_ll48_pl96_dm512_nh8_el2_dl1_df2048_fc3_ebtimeF_dtTrue_Exp_0<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
test 2785
test shape: (87, 32, 96, 7) (87, 32, 96, 1)
test shape: (2784, 96, 7) (2784, 96, 1)
mse:2.6500980854034424, mae:1.4623867273330688
>>>>>>>testing : ETTh1_Informer_random_modes64_ETTh1_ftMS_sl96_ll48_pl96_dm512_nh8_el2_dl1_df2048_fc3_ebtimeF_dtTrue_Exp_0<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
test 2785
Epoch: 9, Steps: 264 | Train Loss: 0.4136689 Vali Loss: 0.9972279 Test Loss: 2.2299614
EarlyStopping counter: 7 out of 11
Updating learning rate to 3.90625e-07
test shape: (87, 32, 96, 1) (87, 32, 96, 1)
test shape: (2784, 96, 1) (2784, 96, 1)
mse:1.0076711177825928, mae:0.9120587110519409
>>>>>>>testing : ETTh1_Transformer_random_modes64_ETTh1_ftMS_sl96_ll48_pl96_dm512_nh8_el2_dl1_df2048_fc3_ebtimeF_dtTrue_Exp_0<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
test 2785
test shape: (87, 32, 96, 1) (87, 32, 96, 1)
test shape: (2784, 96, 1) (2784, 96, 1)
mse:0.9715445041656494, mae:0.9443475604057312

@tianzhou2011
Collaborator

It is configurable. However, I don't believe the default approach, as in Informer, is optimal: the historical data for the target channel carries significantly more signal than the other channels. I would definitely make some design changes to the model for this MS task. But that's a completely different story, and it's one of the reasons we didn't pursue this task at the time.

@efg001

efg001 commented Jul 7, 2024

Makes sense; that would explain the degraded performance when switching to MS mode.
Most interesting/practical time series forecasting problems, though, are multivariate-to-univariate.
I will do some reading on this and see if I can find something. Thanks.
