parallel computing experiment #3

Open
QianqianHan96 opened this issue Jun 20, 2023 · 3 comments

QianqianHan96 commented Jun 20, 2023

I did the three runs below, each over a different number of timesteps. The 100-timestep and 1-month runs succeeded, but the 1-year run still fails with an error.

  1. 1 spatial unit (5 degrees * 5 degrees), 7 variables, 100 timesteps, on 1 core. The parameter for parallel execution is Number of years = 3.
    image

  2. 1 compute block (1 spatial unit, 7 variables, 1 month) on 1 core. The parameter for parallel execution is Number of years = 3.
    image

  3. 1 compute block (1 spatial unit, 7 variables, 1 year) on 1 core. The parameter for parallel execution is Number of years = 3.

However, the 1-year run did not produce a result; it threw an error. Do you know a possible reason? The log file is at /projects/0/einf2480/global_data_Qianqian/slurm-2933096.out

image
image

QianqianHan96 commented Jun 21, 2023

I tried running November and December; both failed with the same recursion error. The only difference between January and November/December is that the January result has values, whereas the November and December results are all NaN, because two input variables are all NaN except in January.
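A quick way to confirm which inputs are entirely NaN for a given month is sketched below. This is only a minimal sketch: the helper `all_nan_variables` and the toy dictionary are hypothetical, the names `Rin` and `Rli` are taken from a later comment in this thread, `Ta` is a made-up third variable, and the real inputs are assumed to be convertible to NumPy arrays.

```python
import numpy as np


def all_nan_variables(inputs):
    """Return the names of the variables whose values are all NaN.

    `inputs` maps a variable name to an array-like of numbers.
    """
    return [
        name
        for name, values in inputs.items()
        if np.isnan(np.asarray(values, dtype=float)).all()
    ]


# Hypothetical toy data standing in for one month of inputs.
example = {
    "Rin": np.full(4, np.nan),                       # entirely missing
    "Rli": np.full(4, np.nan),                       # entirely missing
    "Ta": np.array([280.0, 281.5, np.nan, 279.0]),   # partly missing
}

print(all_nan_variables(example))  # ['Rin', 'Rli']
```

Running such a check before the prediction loop would show up front whether a month can produce anything other than an all-NaN result.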

QianqianHan96 commented Jun 22, 2023

I also tried February, which failed as well: two of the February inputs are all NaN, so the predicted result is all NaN too. After I replaced the two inputs with actual values, the February run succeeded. The error occurs when I call result_LE.values after the prediction loop finishes. Later I tried to track this error down in 2read10kminput-halfhourly-0608py.ipynb; see the README.md in 1 computationBlockTest.
image

QianqianHan96 commented Jun 22, 2023

So the 32-hour figure for 1 computation block was based on a run in which only January has data for Rin and Rli; in all other months Rin and Rli are all NaN. Line 151 only loops over range(745), which covers January, whereas 1 year should be 17520 timesteps. After I changed it to 17520, the run takes longer: each prediction timestep now takes 20 seconds, compared with 4 seconds with range(745). So I am now trying a 6-month run to make sure it does not exceed the time limit (Job 2957047).
image
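As a rough back-of-the-envelope check of the prediction loop alone, the timings above can be turned into wall-clock estimates. This sketch assumes the per-timestep cost stays constant over the whole run and ignores I/O and startup time; the helper `estimated_hours` is hypothetical, and half-hourly data is assumed because 365 * 48 = 17520.

```python
def estimated_hours(n_timesteps, seconds_per_timestep):
    """Estimate loop wall time in hours, assuming a constant cost per timestep."""
    return n_timesteps * seconds_per_timestep / 3600.0


STEPS_PER_DAY = 48                  # assumption: half-hourly timesteps
full_year = 365 * STEPS_PER_DAY     # 17520, matching the figure quoted above
half_year = full_year // 2          # 8760, roughly the 6-month run

print(f"1 year at 20 s/step:   {estimated_hours(full_year, 20):.1f} h")   # 97.3 h
print(f"6 months at 20 s/step: {estimated_hours(half_year, 20):.1f} h")   # 48.7 h
print(f"range(745) at 4 s/step: {estimated_hours(745, 4):.1f} h")         # 0.8 h
```

Under these assumptions the loop for a full year would need roughly four days at 20 seconds per timestep, which is worth comparing against the job's time limit before submitting.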
