questions about custom grid #992
-
I'm trying to set up a custom grid (over the NE US) on Derecho. I have one successful run (/glade/work/clu/ufs/expt_dirs/RRFS_nys_12km_ver1). When I tweak the WRTCMP parameters in config.yaml, the run fails at the post step (/glade/work/clu/ufs/expt_dirs/RRFS_nys_12km). I also have one run that failed at make_ics and make_lbcs with slightly modified WRTCMP parameters. While these WRTCMP parameters are described in the documentation, it is not clear how to make them consistent with the configuration parameters in task_make_grid. I could use some guidance on setting up grid points for the write-component grid. Thanks!
-
Hi @SarahLu-NOAA, Just wanted to let you know that I passed this question along to one of our subject matter experts, and he should be responding to you soon! Best,
-
Sarah, in this case the WRTCMP change causing MPI issues in UPP is a bit of a red herring: you've just happened to shrink your domain enough that over-decomposition is becoming a problem (too many processors for the domain size). You will need to assign fewer MPI tasks to run_post; you can do this by updating the run_post task resources in the rocoto section of your config.yaml.
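A minimal sketch of that kind of override, assuming the rocoto: tasks: override mechanism in config.yaml (the exact post task name and its nesting below are assumptions on my part; check the default workflow task definitions shipped with your SRW App version):

```yaml
# Illustrative sketch only: the task name and nesting are assumptions,
# not verified defaults for any particular SRW App version.
rocoto:
  tasks:
    metatask_run_ensemble:
      task_run_post_mem#mem#_f#fhr#:
        nnodes: 1   # one Derecho node
        ppn: 12     # 12 MPI ranks total for UPP
```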
This will give UPP 12 processors instead of 48, which should be enough to avoid the over-decomposition problem.

I would also recommend reducing the MPI processors assigned to the forecast step if you will be running such a small domain (in number of grid points): even though the larger nodes of Derecho allow for the use of large numbers of processors, for very small domains you will actually see reduced performance from using all of them, with much longer runtimes due to the additional halo communication needed for very small MPI patches. I would recommend reducing LAYOUT_X and LAYOUT_Y by half; honestly, going even smaller than that might be called for with such a small domain.

Furthermore, note the difference between the write component and the compute grid. The WRTCMP settings apply only to the write component (the model output files); the compute grid (where the atmospheric integration takes place) stays the same, so reducing the write-component dimensions should typically be accompanied by reducing ESGgrid_NX and ESGgrid_NY by a similar amount. Unfortunately, there are no automated tools for comparing the compute grid to the write grid at this time, so matching the write grid to the compute grid may require some trial and error. Please let me know if you have further questions!
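For concreteness, a hedged config.yaml sketch (these are the standard task_make_grid / task_run_fcst parameter names, but every numeric value below is an illustrative placeholder, not a recommendation for this particular domain):

```yaml
# Illustrative values only -- substitute numbers appropriate to your own domain.
task_make_grid:
  GRID_GEN_METHOD: "ESGgrid"
  ESGgrid_NX: 220               # compute-grid points in x
  ESGgrid_NY: 200               # compute-grid points in y

task_run_fcst:
  LAYOUT_X: 5                   # halved from an assumed original of 10
  LAYOUT_Y: 4                   # halved from an assumed original of 8
  WRTCMP_write_groups: 1
  WRTCMP_write_tasks_per_group: 4
  WRTCMP_nx: 210                # keep the write grid at or just inside the compute grid
  WRTCMP_ny: 190
```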
BLOCKSIZE generally shouldn't be changed by a user on a known platform. I can't remember the exact specifics, but it's a setting for the model only, and I believe it's related to how memory is chunked at the processor level.
I assume any failures in make_ics or make_lbcs would also be related to over-decomposition in this case; you can change those with similar settings in the rocoto section under metatask_run_ensemble:. You can set …
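Something along these lines (again a hedged sketch: the member task names and the nnodes/ppn values are assumptions, so check the default workflow task definitions for your SRW App version):

```yaml
# Illustrative only -- task names and resource values are assumptions, not verified defaults.
rocoto:
  tasks:
    metatask_run_ensemble:
      task_make_ics_mem#mem#:
        nnodes: 1
        ppn: 12
      task_make_lbcs_mem#mem#:
        nnodes: 1
        ppn: 12
```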