Problem with information transfer between SWAN and ROMS #328
Comments
When the simulation starts, one of the first things it does is a coupling exchange. Did this happen? Did you see the DISBOT exchange, something like |
Hi John, I am attaching the log file for one of the failed runs. |
Can you set NINFO = 1 and rerun that? |
Yes, I've set NINFO = 1, and also changed TI_OCN2WAV in coupling.in to make
it an exact multiple of the ROMS time step DT (just in case...) and rerun
the case. I'm attaching the new log file.
The error I mention is not printed to the run log file, but is printed to
the screen or in the slurm out file, which I also attach.
On the other hand, there is no error file generated by SWAN; I've checked
the SWAN PRINT files and they all look fine to me, without any error
messages.
Message from john warner ***@***.***> on Thu, 24 Oct 2024 at 14:40:
… can you set NINFO =1 and rerun that?
there is a lot of info that is not being printed to that file.
is there also an error out file?
the error you report is not in that file.
i really think that roms may have blown up, and you are not seeing that
written to the screen.
|
Oh, yes, you need the ROMS DT to divide evenly into the coupling interval. Also, the log file was not attached. |
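The divisibility requirement mentioned above can be checked before launching a run. The following is a minimal Python sketch, not part of COAWST: the function name `steps_per_exchange` and the numeric values are hypothetical, and the real values would come from DT in the ROMS input file and from the coupling interval (for example TI_OCN2WAV) in coupling.in.

```python
from decimal import Decimal


def steps_per_exchange(coupling_interval_s, dt_s):
    """Return how many ROMS steps fit in one coupling interval.

    Raises ValueError if the interval is not a whole multiple of DT.
    Decimal is used so that fractional time steps (e.g. 0.1 s) are
    compared exactly rather than with binary floating point.
    """
    interval = Decimal(str(coupling_interval_s))
    dt = Decimal(str(dt_s))
    if dt <= 0:
        raise ValueError("DT must be positive")
    if interval % dt != 0:
        raise ValueError(
            f"coupling interval {interval} s is not a multiple of DT {dt} s"
        )
    return int(interval / dt)


# Hypothetical values for illustration only: a 30-minute coupling
# interval and an 18-second ROMS time step.
print(steps_per_exchange(1800, 18))
```

If the call raises, either DT or the coupling interval would need adjusting so that one divides evenly into the other.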
True, sorry. I attach it now. |
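This is strange. At the beginning all the models exchange: == SWAN grid 1 sent wave data to ROMS grid 1. Then ROMS goes to 30 minutes, and then SWAN to 30 minutes. Then you get that error, but DISBOT already existed. Can I see your swan.in? I am not sure why you have so many iterations per step. When you run it, try mpirun -np 30 coawstM coupling.in &> test.log. Can you look in the ROMS his file? Can you change the coupling to be every 10 min? Does it always stop at the first coupling exchange (after init)? Can I see your |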
John,
Apologies for the late reply. I am attaching the SWAN .in files for both
domains, together with the log of the new run using your suggestion (mpirun
-np 30 coawstM coupling.in &> test.log). To answer your questions, the data
in the history files make sense, and yes, the run always crashes the
second time it tries to exchange information between models.
In the meantime, I have run the exact same simulation with the same version
of COAWST on a different cluster, and it has worked properly, so it seems
that the problem is not the input files, but the machine itself or
some compilation option. I would rule out the latter, since the
build_coawst.bash I use is the same as the one for the Inlet Test Refined
case, which runs fine.
M
Message from john warner ***@***.***> on Mon, 28 Oct 2024 at 15:41:
… this is strange. at the beginning all the models exchange:
== SWAN grid 1 sent wave data to ROMS grid 1
** ROMS grid 1 recv data from SWAN grid 1
SWANtoROMS Min/Max DISBOT (Wm-2): 0.000000E+00 0.000000E+00
SWANtoROMS Min/Max DISSURF (Wm-2): 0.000000E+00 0.000000E+00
...
then
roms goes to 30 minutes
100 2022-01-01 00:30:00.00 2.199223E-03 3.167227E+02 3.167249E+02 8.086213E+10 01
(081,082,20) 0.000000E+00 2.834716E-03 2.372127E+00 1.497969E-01
and then swan to 30 minutes
+time 20220101.003000 , step 3; iteration 12; sweep 4 grid 2
== SWAN grid 1 sent wave data to ROMS grid 1
then you get that error
MCT::m_AttrVect::indexRA_:: FATAL--attribute not found: "DISBOT" Traceback:
|X|MCT::m_AttrVect::indexRA_
01B.MCT(MPEU)::die.: from MCT::m_AttrVect::indexRA_()
[gs30r3b04:3453727:0:3453727] Caught signal 11 (Segmentation fault: Sent by the kernel at address (nil))
but disbot already existed.
Can i see your swan.in? i am not sure why you have so many iterations
per step.
when you run it, try
mpirun -np 30 coawstM coupling.in &> test.log
can you look in the roms his file?
can you change the coupling to be every 10 min?
does it always stop at the first coupling exchange (after init)?
-j
can i see your
|
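When comparing logs from the failing and the working cluster, a small script can count the coupling exchanges and flag the MCT error. The following is a rough Python sketch, not a COAWST tool: it assumes the log lines look like the excerpts quoted in this thread, and the function name `summarize_log` and the returned dictionary layout are hypothetical.

```python
import re

# Patterns modeled on the log excerpts quoted in this thread.
SENT = re.compile(r"==\s*SWAN grid\s+\d+ sent wave data to ROMS grid\s+\d+")
MINMAX = re.compile(r"SWANtoROMS Min/Max\s+(\w+)")
FATAL = re.compile(r'FATAL--attribute not found:\s*"(\w+)"')


def summarize_log(lines):
    """Summarize SWAN-to-ROMS exchanges and MCT attribute errors.

    `lines` is any iterable of strings, e.g. an open log file.
    """
    summary = {"sent": 0, "fields_received": [], "missing_attributes": []}
    for line in lines:
        # Count every "sent wave data" message from SWAN.
        if SENT.search(line):
            summary["sent"] += 1
        # Record each field name reported in a Min/Max line, once.
        match = MINMAX.search(line)
        if match and match.group(1) not in summary["fields_received"]:
            summary["fields_received"].append(match.group(1))
        # Record attributes that MCT reported as not found.
        match = FATAL.search(line)
        if match:
            summary["missing_attributes"].append(match.group(1))
    return summary


# Lines copied from the thread, used here as a tiny sample.
sample = [
    "== SWAN grid 1 sent wave data to ROMS grid 1",
    "SWANtoROMS Min/Max DISBOT (Wm-2): 0.000000E+00 0.000000E+00",
    'MCT::m_AttrVect::indexRA_:: FATAL--attribute not found: "DISBOT" Traceback:',
]
print(summarize_log(sample))
```

For a real run, the same function could be applied to an open file, for example `with open("test.log", errors="replace") as fh: print(summarize_log(fh))`, since the `&>` redirect in the mpirun command above writes both stdout and stderr to test.log.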
Here come the files, with the SWAN *.in files renamed to *.txt |
Dear all,
I am using the 3.8 version to try to replicate a case that worked fine with the previous one (3.7), but I am running into an unexpected problem.
I have a smaller domain B nested into a larger domain A, and I want to run a coupled ROMS+SWAN two-way nesting simulation. All the input files are from the successful v3.7 run, so I assume they are OK. However, when I run the v3.8 model it works for a short while and then crashes due to a segmentation fault. I presume this happens when SWAN is trying to send wave data to ROMS, because I get the following on-screen message:
MCT::m_AttrVect::indexRA_:: FATAL--attribute not found: "DISBOT" Traceback:
|X|MCT::m_AttrVect::indexRA_
When I run the coupled models in each domain separately, i.e., ROMS+SWAN in domain A, and the same in domain B, both simulations work well, so the issue appears only when I combine both models and both domains. I have regenerated the connectivity and SCRIP files several times, but the problem is still there. Any idea why this could happen?
The cluster I used for the v3.7 is not the same as the one I am using now for v3.8, but on the latter the Inlet_test/Refined case runs fine so I presume the problem is not related to the COAWST installation.
Thanks