Run fails with seg fault (invalid memory reference) #528

Closed
tovogt opened this issue Oct 11, 2021 · 34 comments

@tovogt
Contributor

tovogt commented Oct 11, 2021

While tracking down issue #525, I found a GeoClaw setup that crashes due to a segmentation fault on our cluster computer with 16 OpenMP threads, but doesn't crash on my local desktop computer. I run GeoClaw a lot on our cluster and have never had a segmentation fault before. Here is the terminal output: stdout.txt

Have you ever seen something like this?

Maybe it's related to the following AMR settings:

7                    =: amr_levels_max      
2 2 2 2 2 4          =: refinement_ratios_x 

I was adding the seventh AMR level when the error occurred for the first time. When using only 6 levels, the run does not fail.
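
For reference, these values would be set in setrun.py; a minimal sketch of the corresponding lines (the _y and _t ratios are my assumption, since only the x ratios are quoted above):

rundata.amrdata.amr_levels_max = 7
rundata.amrdata.refinement_ratios_x = [2, 2, 2, 2, 2, 4]
rundata.amrdata.refinement_ratios_y = [2, 2, 2, 2, 2, 4]  # assumed equal to x
rundata.amrdata.refinement_ratios_t = [2, 2, 2, 2, 2, 4]  # assumed equal to x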

@mjberger
Contributor

mjberger commented Oct 11, 2021 via email

@mandli
Member

mandli commented Oct 11, 2021

I was thinking that may be a thing to check as well. There's some info in the clawpack docs, although I am not sure that's up to date anymore.

At the very least, if you have the time, I would try a few things:

  1. Using fewer threads. This can test the stack size idea.
  2. Compiling with debugging and stack-tracing turned on. I commonly use -O0 -W -Wall -fbounds-check -fcheck=all -Wunderflow -fbacktrace -ffpe-trap=invalid,zero,overflow -g. You can also keep using OpenMP with this, although the stack trace may get confusing.
  3. Send us the fort.amr file from the _output directory, as it will have some statistics regarding the grids that might be helpful. Also, turning on more verbosity in the refinement and output routines will give you more info about the number of grids and refinement characteristics.

@tovogt
Contributor Author

tovogt commented Oct 13, 2021

As always, thanks a lot for your quick responses! 🙏

  • Reducing the number of threads doesn't change the behavior, even when setting the number to 1. The code fails at the exact same step in the computation, and in this case I get the stack trace in the logs for a single thread. The value of ulimit -s has been unlimited anyway, and increasing the stack size for OpenMP to 32M or even 1G doesn't change the behavior either.
  • Here is the log output with increased verbosity, 8 threads, stacksize set to 1G: stdout.txt fort.amr.txt
  • With debugging and stack-tracing turned on (and only one OpenMP thread to keep the stack trace clean): stdout.txt The error message is now much clearer:
At line 39 of file $CLAW/amrclaw/src/2d/fluxsv.f
Fortran runtime error: Index '0' of dimension 2 of array 'node' below lower bound of 1

Error termination. Backtrace:
#0  0x4955c9 in fluxsv_
	at $CLAW/amrclaw/src/2d/fluxsv.f:39
#1  0x53aa6c in par_advanc_
	at $CLAW/geoclaw/src/2d/shallow/advanc.f:260
#2  0x53cd6c in advanc_._omp_fn.1
	at $CLAW/geoclaw/src/2d/shallow/advanc.f:123
#3  0x2b967165eaae in GOMP_parallel
	at ../.././libgomp/parallel.c:168
#4  0x53b90e in advanc_
	at $CLAW/geoclaw/src/2d/shallow/advanc.f:124
#5  0x52b741 in tick_
	at $CLAW/geoclaw/src/2d/shallow/tick.f:303
#6  0x5440a8 in amr2
	at $CLAW/geoclaw/src/2d/shallow/amr2.f90:646
#7  0x5477aa in main
	at $CLAW/geoclaw/src/2d/shallow/amr2.f90:59

@mjberger
Contributor

mjberger commented Oct 13, 2021 via email

@mandli
Member

mandli commented Oct 13, 2021

That is interesting. At least it's not a parallel bug. Hopefully the Intel compiler can reproduce it as well. I don't suppose you have that compiler handy, @tovogt?

@tovogt
Contributor Author

tovogt commented Oct 13, 2021

This is with a clean checkout of the clawpack master branch and its submodules, commit clawpack/clawpack@bf55916, which is basically version 5.8.0. Only for the geoclaw submodule do I use the most recent master (e8d0bf3) instead of the submodule specification from clawpack (which would be 01c9a8e).

Furthermore, the runs on our cluster are with gfortran 7.3.0 (OpenMP 4.5/201511) and Linux kernel version 4.4. But I have also reproduced this with gfortran 10.2.0 (same version of OpenMP).

Here are the rundata files: gc_segfault_rundata.zip Before running make .output on your machine, make sure to specify the absolute path to your work directory in topo.data and surge.data.

Regarding your question about an Intel compiler: we have ifort version 17.0.0 on our cluster and will try that later.

@tovogt
Contributor Author

tovogt commented Oct 13, 2021

In fact, I can't reproduce this error with the intel compiler (ifort 17.0.0)!

@mandli
Member

mandli commented Oct 13, 2021

Hmmm. Well, we will have to see what might be going on with gfortran then.

@tovogt
Contributor Author

tovogt commented Nov 2, 2021

Since the last activity in this issue, I have mostly continued working with gfortran. But yesterday, I started a whole bunch of runs with Intel's ifort compiler (tested both versions 17 and 19, with ulimit -s unlimited and OMP_STACKSIZE=500M) and found that a lot of those runs were failing due to segmentation faults. Output with the -traceback compiler flag:

forrtl: severe (408): fort: (3): Subscript #2 of the array NODE has value -2 which is less than the lower bound of 1

Apparently, the ifort compiler is also affected by this issue, but it fails at a different point and for different input data. By the way, the same runs fail at a different point in the simulation when run with gfortran (same backtrace as in my post above).

To be honest, I'm a bit puzzled by this: The topography I provide to GeoClaw has a resolution of 30 arc-seconds (~1 km), and the mesh resolution at maximum refinement is 7 arc-seconds (200 m). In the TC literature, in the TC examples, and from what I hear from other GeoClaw users (in a TC context), a lot of them use much higher resolution. How is it possible that I still run into these problems while it apparently works for other people? I even talked to people that use refinement_ratios_x = [2, 2, 2, 6, 8, 8] (which is orders of magnitude higher resolution than what I use) on the same hardware (16 cores with 4 GB memory each). Nobody ever mentioned any issues like the one described here or in #525.
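
For concreteness, the overall linear refinement factor is just the product of the per-level ratios; a quick check in plain Python, using the two ratio lists mentioned above:

from math import prod  # Python 3.8+

mine = prod([2, 2, 2, 2, 2, 4])    # = 128
theirs = prod([2, 2, 2, 6, 8, 8])  # = 3072
print(theirs / mine)               # = 24.0, i.e. 24x finer linearly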

@mandli
Member

mandli commented Nov 2, 2021

My guess is that this is not a problem with topography resolution but perhaps a bug, although the fact that you see it this often is a bit puzzling to me as well. My best guess, without looking into it further with @mjberger, is that there is something about your location that is causing the problem, but who knows. If you can send us your setup as @mjberger suggested, we can hopefully figure out what's going wrong.

@tovogt
Contributor Author

tovogt commented Nov 3, 2021

I described the setup in #528 (comment) and provided the rundata files. Do you need any other information from my side?

@mandli
Member

mandli commented Nov 3, 2021

Sorry, forgot about that. We will try and take a look and see if we can reproduce and debug the problem.

@mjberger
Contributor

mjberger commented Nov 6, 2021 via email

mandli added a commit to mandli/amrclaw that referenced this issue Nov 9, 2021
Added back zeroing out of cfluxptr on coarse grids that was removed in PR clawpack#268. Also correctly typed old_memsize as integer.
@tovogt
Contributor Author

tovogt commented Nov 12, 2021

After stress testing this with a whole batch of jobs, I only have a single run that fails with a segfault (with gfortran). But the error message changed, so it might be completely unrelated to this issue:

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:
#0  0x2ae8cc720fdf in ???
#1  0x5173ca in gfixup_
	at $CLAW/geoclaw/src/2d/shallow/gfixup.f:234
#2  0x49690d in regrid_
	at $CLAW/amrclaw/src/2d/regrid.f:71
#3  0x50f081 in tick_
	at $CLAW/geoclaw/src/2d/shallow/tick.f:226
#4  0x525304 in amr2
	at $CLAW/geoclaw/src/2d/shallow/amr2.f90:646
#5  0x52873c in main
	at $CLAW/geoclaw/src/2d/shallow/amr2.f90:59
Traceback (most recent call last):
  File "$CLAW/clawutil/src/python/clawutil/runclaw.py", line 249, in runclaw
    proc = subprocess.check_call(cmd_split,
  File "$CONDA_PREFIX/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['$WORKDIR/xgeoclaw']' died with <Signals.SIGFPE: 8>.

Here is the setup: gc_segfpe.zip

Would you prefer to have a separate issue for this or continue to discuss this here?

@mjberger
Contributor

mjberger commented Nov 12, 2021 via email

@tovogt
Contributor Author

tovogt commented Nov 12, 2021

The above error output is with gfortran 7.3.0 and the following flags:

FFLAGS='-fopenmp -O0 -W -Wall -fbounds-check -fcheck=all -Wunderflow -fbacktrace -ffpe-trap=invalid,zero,overflow -g'

I am currently trying to reproduce this with ifort. The process has been running for 7 hours without failing, but it's not finished yet ...

@mandli
Member

mandli commented Nov 12, 2021

I am running it through my setup with checkpoints turned on. I will report back if I can reproduce the problem.

@mandli
Member

mandli commented Nov 12, 2021

Oh, and given the location of the signal, I am somewhat suspicious that this may be a related problem.

@tovogt
Contributor Author

tovogt commented Nov 15, 2021

Unfortunately, I still can't tell whether this also applies to the Intel compiler because the process was cancelled by our cluster after 24 hours due to a time limit. I had to restart with a longer time limit, and that will take a while now...

Note that the original gfortran run took 65 minutes to fail, and 3 hours 45 minutes with debugging flags. I just started another Intel compiler run without debugging flags so that we can see more quickly whether it fails at all.

@mjberger
Contributor

mjberger commented Nov 15, 2021 via email

@mandli
Member

mandli commented Nov 15, 2021

I did finish the run on my laptop; it took nearly 48 hours on an 8-core i9. Looking at the output, it looks like the run is refining everywhere, even after the storm is well inland. One thing that I am somewhat concerned about is that the size of the domain relative to the storm is quite small, which can cause boundary condition problems.

@tovogt
Contributor Author

tovogt commented Nov 15, 2021

When you say "the size of the domain relative to the storm is quite small", what do you take as a rule of thumb to determine an appropriate domain size?

Currently, I select the domain to accommodate the storm_radius, which is orders of magnitude larger than the max_wind_radius storm variable. I assumed that most surge dynamics would be concentrated within that area. Would you go for twice the storm_radius? Or three times, or 10 times, ...?

My experience with larger domain sizes is that they dramatically increase runtimes in most cases, don't change the results in the landfall area of the storm significantly, and sometimes cause GeoClaw to spend a lot of time computing complex interactions of the dynamics with topographic features that are at a large distance from the region of interest.

@mandli
Member

mandli commented Nov 15, 2021

I often run with large domains where the storm is at least 2-3 storm_radius interior to the domain, but lately I have also had storm_radius set to a larger number, as that parameter is often not provided. The key here, though, is to restrict the resolution in most of the domain to be low while keeping the refinement criteria free to refine either only near shore or, even better, only along the track of the storm.
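
For illustration, a minimal sketch of what this can look like in setrun.py using the standard AMR region mechanism; the coordinates, times, and level caps are placeholders, not values from this thread:

# format of each region: [minlevel, maxlevel, t1, t2, x1, x2, y1, y2]
clawdata = rundata.clawdata
regions = rundata.regiondata.regions

# whole domain: cap refinement at a coarse level for all times
regions.append([1, 2, 0.0, 1.0e10,
                clawdata.lower[0], clawdata.upper[0],
                clawdata.lower[1], clawdata.upper[1]])

# narrow window around the storm track / landfall (placeholder box):
# only here is refinement allowed up to the maximum level
regions.append([1, rundata.amrdata.amr_levels_max, 0.0, 1.0e10,
                -75.0, -70.0, 38.0, 42.0])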

@tovogt
Contributor Author

tovogt commented Nov 15, 2021

Thanks, Kyle, this is really very helpful!

In fact, in this case, enlarging the domain to 2.5 storm_radius and enforcing a low resolution outside of a quite narrow window around the storm does help: there is no segmentation fault, and the whole simulation finishes on our cluster in less than 25 minutes (previously, it ran for over an hour until the segmentation fault happened).

Still, it's a bit puzzling that you don't seem to be able to reproduce the latest segfault I reported. 😞

@mandli
Member

mandli commented Nov 15, 2021

Yes, that's still worrying, but I now wonder if it's somehow related to either the length of the run or the amount of memory that the process is requesting.

@mjberger
Contributor

mjberger commented Nov 15, 2021 via email

@tovogt
Contributor Author

tovogt commented Nov 15, 2021

Currently, I use the following settings:

export OMP_STACKSIZE=500M
export GOMP_STACKSIZE=$OMP_STACKSIZE
export KMP_STACKSIZE=$OMP_STACKSIZE

ulimit -t unlimited              # cputime
ulimit -f unlimited              # filesize
ulimit -d unlimited              # datasize
ulimit -s unlimited              # stacksize
ulimit -c unlimited              # coredumpsize
ulimit -v unlimited              # vmemoryuse
ulimit -l unlimited              # memorylocked

I will rerun with gfortran and a higher OMP_STACKSIZE and see whether it still fails or whether the time of failure changes.

@mandli
Member

mandli commented Nov 15, 2021

I definitely had upwards of 4-5 GB of memory dedicated to the run, so the stack size may well be the issue. I have seen the variables GOMP_STACKSIZE and KMP_STACKSIZE override the ulimit settings, so I am now wondering if that is the real problem.

@tovogt
Contributor Author

tovogt commented Nov 15, 2021

Sure, in my setup, the threads also have 4 GB of memory each. But isn't the stack size a different thing, which will typically not be larger than a few megabytes? If anything, I thought that a larger stack size reduces the amount of available "heap" memory (because memory is reserved for the stack), so that it's not advisable to increase the stack size indefinitely either.

@tovogt
Contributor Author

tovogt commented Nov 16, 2021

I can confirm that this runs through with the Intel compiler (even with STACKSIZE set to 500M, as above) after almost 19 hours (!).

Furthermore, when running with gfortran and a larger STACKSIZE (1000M), there is no segmentation fault. Still, the run fails after 68 minutes with

 **** Too many dt reductions ****
 **** Stopping calculation   ****

How is it possible that the behavior of GeoClaw depends on the compiler in such a fundamental way? It's not only the performance that differs; the numerical stability seems to be different as well.

@mjberger
Contributor

mjberger commented Nov 16, 2021 via email

@mandli
Member

mandli commented Nov 16, 2021

Optimization flags between compilers can be very different and at times can lead to incorrect arithmetic being produced, enough so that it impacts us. It's unfortunately a bit of a moving target, as it changes between compiler versions. My best guess as to why the stack size thing is happening is that gfortran handles stack memory differently (and less efficiently) than ifort.

As for the dt reductions, I would also hazard that the run is blowing up. Not sure if you can plot the results to see.

@tovogt
Contributor Author

tovogt commented Nov 17, 2021

Yes, maybe let's agree that this issue can be closed once clawpack/amrclaw#272 is merged (thanks again for that!).

As for the "too many dt reductions" failure, your remark about enlarging the domain while capping the refinement level far away from the storm has already helped a lot.

@mandli
Member

mandli commented Nov 17, 2021

Great to hear! As you suggested, once clawpack/amrclaw#272 is merged let's close this and open up something else if needed.

rjleveque added a commit to clawpack/amrclaw that referenced this issue Nov 24, 2021
@tovogt tovogt closed this as completed Dec 16, 2021