numerical differences in gfortran vs. ifort and release vs. debug #154

BenjaminTJohnson · 2024-07-26T18:19:24Z

This issue documents a longstanding (and generally ignored) problem with CRTM: some ctest results differ between release and debug builds. I don't know that there's a clear solution, but the fact that it only affects certain ctests suggests that there might be a fix.

The largest difference is on the order of 1e-11, so in no way would this impact anything useful.

ifort, Release:

-- Project version : 3.1.0
-- Fortran compiler : /opt/intel/oneapi/2022.1/compiler/2022.0.1/linux/bin/intel64/ifort
-- Fortran compiler flags :  -assume byterecl -fPIC
-- Build type : Release
-- Fortran compiler flags for release : -O3 -ip -unroll -inline -no-heap-arrays

ifort, Debug:

-- Project version : 3.1.0
-- Fortran compiler : /opt/intel/oneapi/2022.1/compiler/2022.0.1/linux/bin/intel64/ifort
-- Fortran compiler flags :  -assume byterecl -fPIC
-- Build type : DEBUG
-- Fortran compiler flags for debug : -O0 -g -check bounds -traceback -warn -heap-arrays -fpe-all=0 -fpe:0 -ftz -check all

gfortran, Release:

-- Project version : 3.1.0
-- Fortran compiler : /home/bjohnson/spack/opt/spack/linux-centos7-skylake_avx512/gcc-9.3.0/gcc-14.1.0-lg64yhqjdx56qc37ds2rnvguco7tkyug/bin/gfortran
-- Fortran compiler flags : -I/home/bjohnson/spack/opt/spack/linux-centos7-skylake_avx512/gcc-9.3.0/netcdf-fortran-4.6.1-l5onh6o5qivl4qkq7thsiwyn3pge3k62/include -D_REAL8_ -ffree-line-length-none
-- Build type : RELEASE
-- Fortran compiler flags for release : -O3 -funroll-all-loops -fopenmp -finline-functions

gfortran, Debug:

-- Project version : 3.1.0
-- Fortran compiler : /home/bjohnson/spack/opt/spack/linux-centos7-skylake_avx512/gcc-9.3.0/gcc-14.1.0-lg64yhqjdx56qc37ds2rnvguco7tkyug/bin/gfortran
-- Fortran compiler flags : -I/home/bjohnson/spack/opt/spack/linux-centos7-skylake_avx512/gcc-9.3.0/netcdf-fortran-4.6.1-l5onh6o5qivl4qkq7thsiwyn3pge3k62/include -D_REAL8_ -ffree-line-length-none
-- Build type : DEBUG
-- Fortran compiler flags for debug : -O0 -g -fcheck=bounds -ffpe-trap=invalid,zero,overflow -fbacktrace


BenjaminTJohnson · 2024-07-26T19:43:12Z

gfortran debug vs ifort release (reference)

	 13 - test_forward_Simple_atms_n21 (NUMERICAL)
	 14 - test_forward_Simple_cris-fsr_n21 (NUMERICAL)
	 15 - test_forward_Simple_v.abi_g18 (NUMERICAL)
	 16 - test_forward_Simple_atms_npp (NUMERICAL)
	 17 - test_forward_Simple_cris399_npp (NUMERICAL)
	 18 - test_forward_Simple_v.abi_gr (NUMERICAL)
	 19 - test_forward_Simple_abi_g18 (NUMERICAL)
	 20 - test_forward_Simple_modis_aqua (NUMERICAL)
	 34 - test_forward_ClearSky_cris-fsr_n21 (Failed)
	 41 - test_forward_Aircraft_cris-fsr_n21 (Failed)
	 44 - test_forward_ScatteringSwitch_cris-fsr_n21 (Failed)
	 53 - test_forward_SOI_v.abi_g18 (Failed)
	 56 - test_forward_SOI_v.abi_gr (Failed)
	130 - test_adjoint_Simple_modis_aqua (Failed)
	140 - test_tangent_linear_Simple_cris-fsr_n21 (Failed)
	141 - test_tangent_linear_Simple_v.abi_g18 (Failed)
	143 - test_tangent_linear_Simple_cris399_npp (Failed)
	144 - test_tangent_linear_Simple_v.abi_gr (Failed)
	145 - test_tangent_linear_Simple_abi_g18 (Failed)
	146 - test_tangent_linear_Simple_modis_aqua (Failed)
	149 - test_tangent_linear_ClearSky_v.abi_g18 (Failed)
	152 - test_tangent_linear_ClearSky_v.abi_gr (Failed)

1/22 Test  #13: test_forward_Simple_atms_n21 .................***Exception: Numerical  0.14 sec
Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:
#0  0x7f3f1d4253ff in ???
#1  0x70f9ca in __compare_float_numbers_MOD_cwt_real_double
  at /data/users/bjohnson/CRTM/CRTMv3/src/Utility/Compare_Float_Numbers.f90:697
#2  0x5db77a in __crtm_rtsolution_define_MOD_crtm_rtsolution_compare
  at /data/users/bjohnson/CRTM/CRTMv3/src/RTSolution/CRTM_RTSolution_Define.f90:672
#3  0x40c037 in test_simple
  at /data/users/bjohnson/CRTM/CRTMv3/test/mains/regression/forward/test_Simple/test_Simple.f90:280
#4  0x413a31 in main
  at /data/users/bjohnson/CRTM/CRTMv3/test/mains/regression/forward/test_Simple/test_Simple.f90:14

 2/22 Test  #14: test_forward_Simple_cris-fsr_n21 .............***Exception: Numerical  5.41 sec
Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:
#0  0x7ffbb32d03ff in ???
#1  0x70f9ca in __compare_float_numbers_MOD_cwt_real_double
  at /data/users/bjohnson/CRTM/CRTMv3/src/Utility/Compare_Float_Numbers.f90:697
#2  0x5db77a in __crtm_rtsolution_define_MOD_crtm_rtsolution_compare
  at /data/users/bjohnson/CRTM/CRTMv3/src/RTSolution/CRTM_RTSolution_Define.f90:672
#3  0x40c037 in test_simple
  at /data/users/bjohnson/CRTM/CRTMv3/test/mains/regression/forward/test_Simple/test_Simple.f90:280
#4  0x413a31 in main
  at /data/users/bjohnson/CRTM/CRTMv3/test/mains/regression/forward/test_Simple/test_Simple.f90:14

3/22 Test  #15: test_forward_Simple_v.abi_g18 ................***Exception: Numerical  0.16 sec
Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:
#0  0x7fd60cd223ff in ???
#1  0x70f9ca in __compare_float_numbers_MOD_cwt_real_double
  at /data/users/bjohnson/CRTM/CRTMv3/src/Utility/Compare_Float_Numbers.f90:697
#2  0x5db77a in __crtm_rtsolution_define_MOD_crtm_rtsolution_compare
  at /data/users/bjohnson/CRTM/CRTMv3/src/RTSolution/CRTM_RTSolution_Define.f90:672
#3  0x40c037 in test_simple
  at /data/users/bjohnson/CRTM/CRTMv3/test/mains/regression/forward/test_Simple/test_Simple.f90:280
#4  0x413a31 in main
  at /data/users/bjohnson/CRTM/CRTMv3/test/mains/regression/forward/test_Simple/test_Simple.f90:14

4/22 Test  #16: test_forward_Simple_atms_npp .................***Exception: Numerical  0.15 sec
Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:
#0  0x7f7439a223ff in ???
#1  0x70f9ca in __compare_float_numbers_MOD_cwt_real_double
  at /data/users/bjohnson/CRTM/CRTMv3/src/Utility/Compare_Float_Numbers.f90:697
#2  0x5db77a in __crtm_rtsolution_define_MOD_crtm_rtsolution_compare
  at /data/users/bjohnson/CRTM/CRTMv3/src/RTSolution/CRTM_RTSolution_Define.f90:672
#3  0x40c037 in test_simple
  at /data/users/bjohnson/CRTM/CRTMv3/test/mains/regression/forward/test_Simple/test_Simple.f90:280
#4  0x413a31 in main
  at /data/users/bjohnson/CRTM/CRTMv3/test/mains/regression/forward/test_Simple/test_Simple.f90:14

5/22 Test  #17: test_forward_Simple_cris399_npp ..............***Exception: Numerical  1.10 sec
Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:
#0  0x7fa2828683ff in ???
#1  0x70f9ca in __compare_float_numbers_MOD_cwt_real_double
  at /data/users/bjohnson/CRTM/CRTMv3/src/Utility/Compare_Float_Numbers.f90:697
#2  0x5db77a in __crtm_rtsolution_define_MOD_crtm_rtsolution_compare
  at /data/users/bjohnson/CRTM/CRTMv3/src/RTSolution/CRTM_RTSolution_Define.f90:672
#3  0x40c037 in test_simple
  at /data/users/bjohnson/CRTM/CRTMv3/test/mains/regression/forward/test_Simple/test_Simple.f90:280
#4  0x413a31 in main
  at /data/users/bjohnson/CRTM/CRTMv3/test/mains/regression/forward/test_Simple/test_Simple.f90:14

 6/22 Test  #18: test_forward_Simple_v.abi_gr .................***Exception: Numerical  0.16 sec
Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:
#0  0x7f9e21c933ff in ???
#1  0x70f9ca in __compare_float_numbers_MOD_cwt_real_double
  at /data/users/bjohnson/CRTM/CRTMv3/src/Utility/Compare_Float_Numbers.f90:697
#2  0x5db77a in __crtm_rtsolution_define_MOD_crtm_rtsolution_compare
  at /data/users/bjohnson/CRTM/CRTMv3/src/RTSolution/CRTM_RTSolution_Define.f90:672
#3  0x40c037 in test_simple
  at /data/users/bjohnson/CRTM/CRTMv3/test/mains/regression/forward/test_Simple/test_Simple.f90:280
#4  0x413a31 in main
  at /data/users/bjohnson/CRTM/CRTMv3/test/mains/regression/forward/test_Simple/test_Simple.f90:14

 7/22 Test  #19: test_forward_Simple_abi_g18 ..................***Exception: Numerical  0.19 sec
Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:
#0  0x7fae6852c3ff in ???
#1  0x70f9ca in __compare_float_numbers_MOD_cwt_real_double
  at /data/users/bjohnson/CRTM/CRTMv3/src/Utility/Compare_Float_Numbers.f90:697
#2  0x5db77a in __crtm_rtsolution_define_MOD_crtm_rtsolution_compare
  at /data/users/bjohnson/CRTM/CRTMv3/src/RTSolution/CRTM_RTSolution_Define.f90:672
#3  0x40c037 in test_simple
  at /data/users/bjohnson/CRTM/CRTMv3/test/mains/regression/forward/test_Simple/test_Simple.f90:280
#4  0x413a31 in main
  at /data/users/bjohnson/CRTM/CRTMv3/test/mains/regression/forward/test_Simple/test_Simple.f90:14

 8/22 Test  #20: test_forward_Simple_modis_aqua ...............***Exception: Numerical  0.20 sec
Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:
#0  0x7f54a91c23ff in ???
#1  0x70f9ca in __compare_float_numbers_MOD_cwt_real_double
  at /data/users/bjohnson/CRTM/CRTMv3/src/Utility/Compare_Float_Numbers.f90:697
#2  0x5db77a in __crtm_rtsolution_define_MOD_crtm_rtsolution_compare
  at /data/users/bjohnson/CRTM/CRTMv3/src/RTSolution/CRTM_RTSolution_Define.f90:672
#3  0x40c037 in test_simple
  at /data/users/bjohnson/CRTM/CRTMv3/test/mains/regression/forward/test_Simple/test_Simple.f90:280
#4  0x413a31 in main
  at /data/users/bjohnson/CRTM/CRTMv3/test/mains/regression/forward/test_Simple/test_Simple.f90:14

End of exception errors

9/22 Test  #34: test_forward_ClearSky_cris-fsr_n21 ...........***Failed    0.96 sec
> diff -y test_forward_ClearSky_cris-fsr_n21_gfortran_debug.txt test_forward_ClearSky_cris-fsr_n21_gfortran_release.txt | grep "|"
1/1 Test #34: test_forward_ClearSky_cris-fsr_n21 ...***Failed    2.07 sec		      |	1/1 Test #34: test_forward_ClearSky_cris-fsr_n21 ...***Failed    1.76 sec
CRTM_Tests    =   2.07 sec*proc (1 test)						      |	CRTM_Tests    =   1.76 sec*proc (1 test)
Total Test time (real) =   2.12 sec							      |	Total Test time (real) =   1.85 sec
----

So there is no difference between debug and release using gfortran: only the timing lines differ.

Here's a "summary" of the differences observed for this specific test:

668K -rw-r--r--  1 bjohnson domain users 3.5M Jul 26 19:01 diff_gd_ir.txt
 512 -rw-r--r--  1 bjohnson domain users 269K Jul 26 19:01 diff_gd_id.txt
 512 -rw-r--r--  1 bjohnson domain users 3.5M Jul 26 19:01 diff_id_ir.txt
 512 -rw-r--r--  1 bjohnson domain users 269K Jul 26 19:02 diff_id_gr.txt
 512 -rw-r--r--  1 bjohnson domain users 3.5M Jul 26 19:02 diff_ir_gr.txt
 512 -rw-r--r--  1 bjohnson domain users  338 Jul 26 19:03 diff_gd_gr.txt

where gd = gfortran_debug, gr = gfortran_release, id = ifort_debug, and ir = ifort_release.

The most differences occur whenever anything is compared against ifort release. Fewer occur when comparing gfortran (debug or release) to ifort debug. The only comparison with almost no differences is gfortran debug vs. gfortran release.
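For reference, the six diff files correspond to every unordered pair of the four builds. A minimal sketch of that enumeration in Python follows; the `BUILDS` mapping and the `pairwise_comparisons` helper are illustrative names, not part of CRTM, and the order of labels within each generated file name may differ from the listing above (e.g. `diff_gr_id.txt` rather than `diff_id_gr.txt`).

```python
from itertools import combinations

# Labels follow the abbreviations used in the file listing.
BUILDS = {
    "gd": "gfortran_debug",
    "gr": "gfortran_release",
    "id": "ifort_debug",
    "ir": "ifort_release",
}

def pairwise_comparisons(builds):
    """Return every unordered pair of build labels."""
    return list(combinations(sorted(builds), 2))

pairs = pairwise_comparisons(BUILDS)
for a, b in pairs:
    print(f"diff_{a}_{b}.txt: {BUILDS[a]} vs {BUILDS[b]}")
```

Four builds give six comparisons, which matches the number of diff files listed.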

Here's an example of the differences between gfortran release and ifort release:

<...>
Radiance: num1 = 1.74107149954568E+00, num2 = 1.74107149954568E+00, percent_difference = 1.14779975713045E-13%
Brightness Temperature: num1 = 3.15166486936910E+02, num2 = 3.15166486936910E+02, percent_difference = 1.80359972322143E-14%
Stokes: num1 = 1.74107149954568E+00, num2 = 1.74107149954568E+00, percent_difference = 1.14779975713045E-13%
Up Radiance: num1 = 1.67948500636161E-01, num2 = 1.67948500636161E-01, percent_difference = 4.95787259377050E-14%
Down Radiance: num1 = 1.95273381650301E-01, num2 = 1.95273381650301E-01, percent_difference = 4.26411045597613E-14%
Down Solar Radiance: num1 = 3.68903722986171E+00, num2 = 3.68903722986171E+00, percent_difference = 2.40761576627791E-14%
Radiance: num1 = 1.73341571892138E+00, num2 = 1.73341571892138E+00, percent_difference = 5.12386272955224E-14%
Brightness Temperature: num1 = 3.15104527619051E+02, num2 = 3.15104527619050E+02, percent_difference = 3.60790873367136E-14%
Stokes: num1 = 1.73341571892138E+00, num2 = 1.73341571892138E+00, percent_difference = 5.12386272955224E-14%

The values that produced the largest percent difference:
Down Solar Radiance: num1 = 3.27014581763374E-71, num2 = 3.27014568271136E-71, percent_difference = 4.12588283764277E-06%
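To put that number in context, the percent difference can be reproduced by hand. The sketch below assumes the comparison normalizes by `|num1|`; the routine that wrote the log may use a different denominator, which would not change the result at this precision. `percent_difference` is a hypothetical helper, not a CRTM function.

```python
def percent_difference(num1, num2):
    """Relative difference of num2 from num1, expressed in percent.

    Assumption: the denominator is |num1|; the routine that produced the
    log may normalize differently.
    """
    return 100.0 * abs(num1 - num2) / abs(num1)

# Values copied from the "largest percent difference" line.
num1 = 3.27014581763374e-71
num2 = 3.27014568271136e-71

abs_diff = abs(num1 - num2)
pd = percent_difference(num1, num2)
print(f"absolute difference = {abs_diff:.3e}")
print(f"percent difference  = {pd:.5e} %")
```

The absolute difference is on the order of 1e-78, so the comparatively large percentage comes entirely from dividing by a value that is itself effectively zero.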

An example of the differences between gfortran debug and ifort debug:

<...>
Radiance: num1 = 1.93928626122432E+00, num2 = 1.93928626122432E+00, percent_difference = 4.57992426110106E-14%
Stokes: num1 = 1.93928626122432E+00, num2 = 1.93928626122432E+00, percent_difference = 4.57992426110106E-14%
Up Radiance: num1 = 1.34696913862698E-01, num2 = 1.34696913862698E-01, percent_difference = 8.24237907749579E-14%
Down Radiance: num1 = 1.49189426105494E-01, num2 = 1.49189426105494E-01, percent_difference = 5.58127536384566E-14%
Radiance: num1 = 1.90834969337093E+00, num2 = 1.90834969337093E+00, percent_difference = 5.81771269952125E-14%
Stokes: num1 = 1.90834969337093E+00, num2 = 1.90834969337093E+00, percent_difference = 5.81771269952125E-14%
Up Radiance: num1 = 1.78588101289685E-01, num2 = 1.78588101289685E-01, percent_difference = 6.21666850483101E-14%
Up Radiance: num1 = 1.93723696734729E-01, num2 = 1.93723696734729E-01, percent_difference = 5.73096138127808E-14%
Up Radiance: num1 = 1.95642859357333E-01, num2 = 1.95642859357333E-01, percent_difference = 5.67474339861994E-14%
Down Radiance: num1 = 2.29191736945382E-01, num2 = 2.29191736945382E-01, percent_difference = 4.84407963141242E-14%
Up Radiance: num1 = 1.67019298349406E-01, num2 = 1.67019298349406E-01, percent_difference = 6.64727391144080E-14%

The values that produced the largest percent difference:
Down Solar Radiance: num1 = 1.98443571457145E-11, num2 = 1.98443571457185E-11, percent_difference = 1.98973185487457E-11%
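Finding the largest percent difference in a multi-megabyte diff file is easier with a small script. The following is a rough sketch, assuming every line of interest has the `name: num1 = ..., num2 = ..., percent_difference = ...%` shape quoted above; `LINE_RE` and `largest_percent_difference` are hypothetical names, and lines that do not match are skipped.

```python
import re

# Matches lines such as:
#   Radiance: num1 = 1.9E+00, num2 = 1.9E+00, percent_difference = 4.5E-14%
LINE_RE = re.compile(
    r"^(?P<name>[^:]+):\s*num1\s*=\s*(?P<num1>\S+),\s*"
    r"num2\s*=\s*(?P<num2>\S+),\s*"
    r"percent_difference\s*=\s*(?P<pct>[^%\s]+)%"
)

def largest_percent_difference(lines):
    """Return (name, num1, num2, pct) for the largest pct, or None if no match."""
    best = None
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a comparison line
        rec = (m["name"], float(m["num1"]), float(m["num2"]), float(m["pct"]))
        if best is None or rec[3] > best[3]:
            best = rec
    return best

# Lines copied from the gfortran debug vs. ifort debug excerpt.
sample = [
    "Radiance: num1 = 1.93928626122432E+00, num2 = 1.93928626122432E+00, percent_difference = 4.57992426110106E-14%",
    "Up Radiance: num1 = 1.34696913862698E-01, num2 = 1.34696913862698E-01, percent_difference = 8.24237907749579E-14%",
    "Down Solar Radiance: num1 = 1.98443571457145E-11, num2 = 1.98443571457185E-11, percent_difference = 1.98973185487457E-11%",
]
worst = largest_percent_difference(sample)
print(worst)
```

To scan a real file, the same function can be given an open file handle instead of the `sample` list, since it only iterates over lines.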

Overall these values are tiny, but I wanted to document them. The numerical error "Floating-point exception - erroneous arithmetic operation." appears to be a "bug" in the float comparison routine, and is likely related to underflow.
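One possible mitigation, sketched here in Python rather than Fortran and without having inspected what `Compare_Float_Numbers.f90` does at line 697, is to treat both values as equal when they fall below an absolute floor, before any scaling or division happens. The names `within_tolerance`, `rel_tol`, and `tiny_floor` and their default values are illustrative assumptions, not CRTM settings.

```python
def within_tolerance(x, y, rel_tol=1.0e-9, tiny_floor=1.0e-30):
    """Compare two floats with a relative tolerance, guarding tiny magnitudes.

    rel_tol and tiny_floor are illustrative values, not CRTM settings.
    """
    scale = max(abs(x), abs(y))
    # Both values are effectively zero: report equality without scaling.
    if scale < tiny_floor:
        return True
    # Multiply instead of divide so no quotient with a tiny denominator is formed.
    return abs(x - y) <= rel_tol * scale

# Values taken from the log excerpts in this issue.
print(within_tolerance(3.27014581763374e-71, 3.27014568271136e-71))  # below the floor
print(within_tolerance(3.15104527619051e+02, 3.15104527619050e+02))  # last-digit change
print(within_tolerance(1.0, 1.001))                                  # genuine mismatch
```

With a guard like this, values such as the 1e-71 Down Solar Radiance would never reach the relative comparison. Whether that is the right behavior for CRTM, and what floor would be appropriate, is a design decision this sketch does not settle.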
