
global input data download #1

Open
QianqianHan96 opened this issue Jun 15, 2023 · 7 comments

QianqianHan96 (Collaborator) commented Jun 15, 2023

Hi, Sarah

Do you know how to check the available project storage space on Snellius? I can only see the size of the whole work1 folder, but not the available space for einf2480; I can see how much we have used. The total size of ERA5-Land for 20 years will be around 18 TB. Maybe it is better to download 5 years and test on 5 cores, what do you think?
@SarahAlidoost

SarahAlidoost (Member) commented
> Hi, Sarah
>
> Do you know how to check the available project storage space on Snellius?

The project space is 20 TB, and about 52% of it is already used, so roughly 10 TB is left. I can remove some directories to free up space, but this won't be enough. Can you run your model on a few years of input data and copy your data to CRIB or somewhere else?

To get the limits and current usage of the relevant disk, you can use the `myquota` command. For example, the command `prjspc-quota /projects/0/einf2480/` gives you the quota on the project space.

SarahAlidoost (Member) commented

@QianqianHan96 I am not sure whether you are familiar with Snellius usage and accounting. When you submit a job script, the project is charged in SBU (system billing units). One hour of a full thin node costs 128 SBU; see the accounting documentation.

You can get the SBU information with the `accinfo` command. For example, the SBU information for the ecoextreml project is currently:

Initial budget       : 100000:00
Used budget          : 60009:24
Remaining budget     : 39990:35

The minimum amount of SBU that can be charged for a job is 32 per hour: even if your job uses fewer than 32 cores, the project is still charged for 32. So it is important to use the resources efficiently.
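As a rough sanity check, the billing rule above can be sketched in a few lines of Python. This is a back-of-the-envelope estimate using only the numbers quoted in this thread (128 SBU per hour for a full 128-core thin node, i.e. 1 SBU per core-hour, with a 32-core minimum charge); `sbu_charge` is a hypothetical helper, not an official SURF tool.

```python
def sbu_charge(cores: int, hours: float) -> float:
    """Estimated SBU charged: 1 SBU per core-hour, 32-core minimum.

    Assumes the thin-node rate quoted in this thread (128 SBU/hour
    for a full 128-core node) and the 32-core minimum allocation.
    """
    return max(cores, 32) * hours

# A full thin node for 1 hour costs 128 SBU, matching the rate above.
print(sbu_charge(128, 1))   # 128
# A 4-core job for 10 hours is still billed as 32 cores: 320 SBU.
print(sbu_charge(4, 10))    # 320
```

This makes the efficiency point concrete: a small job that does not fill 32 cores still burns the same budget as one that does.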

SarahAlidoost (Member) commented

@Crystal-szj the information in this issue might be helpful for you too.

QianqianHan96 (Collaborator, Author) commented

> The project space is 20 TB, and about 52% of it is already used, so roughly 10 TB is left. I can remove some directories to free up space, but this won't be enough. Can you run your model on a few years of input data and copy your data to CRIB or somewhere else?
>
> To get the limits and current usage of the relevant disk, you can use the `myquota` command. For example, the command `prjspc-quota /projects/0/einf2480/` gives you the quota on the project space.

Thanks for the information, Sarah. "global_data_Qianqian" is my directory. I need 18 TB for the ERA5-Land data (20 years), the other variables may take several more TB, and the results will also be around 20 TB. What do you think of running 5 years on Snellius for now to scale up the parallel computing? At the moment I am running 1 computation block. I would then copy the input and output elsewhere and continue running the other 15 years.

QianqianHan96 (Collaborator, Author) commented

> @QianqianHan96 I am not sure whether you are familiar with Snellius usage and accounting. When you submit a job script, the project is charged in SBU (system billing units). One hour of a full thin node costs 128 SBU; see the accounting documentation.
>
> You can get the SBU information with the `accinfo` command. For example, the SBU information for the ecoextreml project is currently:
>
>     Initial budget       : 100000:00
>     Used budget          : 60009:24
>     Remaining budget     : 39990:35
>
> The minimum amount of SBU that can be charged for a job is 32 per hour: even if your job uses fewer than 32 cores, the project is still charged for 32. So it is important to use the resources efficiently.

Thanks for the reminder, Sarah. I saw this information on the Snellius website; I will be careful when using the resources.

SarahAlidoost (Member) commented

> Thanks for the information, Sarah. "global_data_Qianqian" is my directory. I need 18 TB for the ERA5-Land data (20 years), the other variables may take several more TB, and the results will also be around 20 TB. What do you think of running 5 years on Snellius for now to scale up the parallel computing? At the moment I am running 1 computation block.

It is better to first estimate how much space is needed for all input/output of 1 year, then check the memory usage for 1 computing block. The parallel dimensions can be the number of years and the number of spatial units.
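Parallelizing over years and spatial units, as suggested above, amounts to enumerating independent (year, spatial block) work units. A minimal sketch, where the ranges are hypothetical placeholders rather than the project's actual configuration:

```python
from itertools import product

# Hypothetical example: a 5-year test run split into 10 spatial units.
years = range(2000, 2005)
spatial_blocks = range(10)

# Each (year, block) pair is an independent task that could be
# submitted as its own job or processed by its own worker.
tasks = list(product(years, spatial_blocks))
print(len(tasks))  # 50 independent (year, block) tasks
```

Measuring the runtime and memory of one such task (one computing block, one year) then gives a basis for estimating the full run.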

QianqianHan96 (Collaborator, Author) commented Jun 19, 2023

> It is better to first estimate how much space is needed for all input/output of 1 year, then check the memory usage for 1 computing block. The parallel dimensions can be the number of years and the number of spatial units.

All input/output of 1 year at global scale is around 2.5-3 TB. I will let you know the running time and memory usage when the 1 computing block finishes. It has now been running for 17 hours.
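Multiplying out the per-year figure above (a quick check using only the numbers quoted in this thread) shows why the full 20-year run cannot stay on the 20 TB project space:

```python
# ~2.5-3 TB of input/output per year at global scale, 20 years total.
per_year_tb_low, per_year_tb_high = 2.5, 3.0
years = 20

total_low = per_year_tb_low * years
total_high = per_year_tb_high * years
print(f"{total_low:.0f}-{total_high:.0f} TB total")  # 50-60 TB
```

At 50-60 TB total, the data would exceed the 20 TB project space several times over, which supports the plan of running a few years at a time and copying results to CRIB or elsewhere.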
