
Gsi fed #590

Merged
merged 10 commits into NOAA-EMC:develop on Sep 8, 2023

Conversation

daviddowellNOAA
Collaborator

@daviddowellNOAA daviddowellNOAA commented Jul 13, 2023

Description

Initialization of the operational RRFSv1 will include assimilation of flash-extent density (FED) observations from the GOES Geostationary Lightning Mapper (GLM). The current PR is the first of at least three that will be needed to introduce FED assimilation into the code and the regional workflow. The new capabilities added to GSI are:

  • reading NetCDF FED observations
  • applying an observation operator that maps the model state to FED.

Much of the code was originally developed by Rong Kong at OU-CAPS (Kong et al. 2020, Wang et al. 2021, Kong et al. 2022; https://doi.org/10.1175/MWR-D-19-0192.1, https://doi.org/10.1175/MWR-D-20-0406.1, https://doi.org/10.1175/MWR-D-21-0326.1). Recently, the observation operator has been modified by Amanda Back and Ashley Sebok based on tests with regional, convection-allowing FV3 forecasts. The new observation operator includes a cap of 8 flashes / minute for both the observed and simulated FED.

The observation operator is specific to the 3-km regional FV3 application in RRFS. Development of a more general observation operator is left to future work.
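As a concrete illustration of the 8 flashes / minute cap described above, here is a minimal, self-contained Fortran sketch. The variable names fed_obs and fed_model and the sample values are hypothetical; only the cap value itself comes from the PR description.

program fed_cap_sketch
  implicit none
  ! Cap on flash-extent density (flashes / minute) applied to both the
  ! observed and the simulated FED before innovations are formed.
  ! Illustrative only; not the PR's actual code.
  real, parameter :: fed_cap = 8.0
  real :: fed_obs, fed_model
  fed_obs   = 12.3        ! example observed FED value
  fed_model = 5.7         ! example simulated FED value
  fed_obs   = min(fed_obs,   fed_cap)
  fed_model = min(fed_model, fed_cap)
  print *, 'capped observed FED  =', fed_obs
  print *, 'capped simulated FED =', fed_model
end program fed_cap_sketch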

Fixes #588

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Initial tests were with NOAA-EMC GSI-EnKF code obtained in April 2023 and modified to include the assimilation of FED observations. A prototype of RRFSv1 was cycled hourly for 2.5 days, with FED observations assimilated in the EnKF step.

For the current PR, only the GSI observer with FED (and radar reflectivity) observations was tested. It produces identical results to those obtained in April 2023.

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • New and existing tests pass with my changes
  • Any dependent changes have been merged and published

DUE DATE for this PR is 8/24/2023. If this PR is not merged into develop by this date, the PR will be closed and returned to the developer.

@ShunLiu-NOAA
Contributor

@xyzemc and @guoqing-noaa , could you please review this PR?

! prgmmr: j guo <[email protected]>
! org: NASA/GSFC, Global Modeling and Assimilation Office, 610.3
! date: 2018-08-10
!
Contributor

Line 3-8: Is Guo involved in the development of this gsi_fedOper.f90 file? If not, suggest removing these lines.

Collaborator Author

@daviddowellNOAA daviddowellNOAA Aug 21, 2023

Not involved. I will update the documentation accordingly.


! headNode => obsLList_headNode(self%obsLL(ibin))
! call intjo(headNode, rval,sval)
! headNode => null()
Contributor

Do we need line 151-153? If not, suggest removing them

Collaborator Author

We don't currently need this code. I agree that it is good to remove these lines.


! headNode => obsLList_headNode(self%obsLL(ibin))
! call stpjo(headNode,dval,xval,pbcjo(:),sges,nstep)
! headNode => null()
Contributor

Lines 180-182: similar to the above; suggest removing them if not needed.

Collaborator Author

Agreed.

public:: fedNode

type,extends(obsNode):: fedNode
!type(td_ob_type),pointer :: llpoint => NULL()
Contributor

Line 39: remove if not needed

Collaborator Author

Agreed. I'm removing this commented out variable, plus several other similar commented out lines that were lingering from an earlier version of the code.

! procedure, nopass:: headerRead => obsHeader_read_
! procedure, nopass:: headerWrite => obsHeader_write_
! procedure:: init => obsNode_init_
! procedure:: clean => obsNode_clean_
Contributor

Line 56, 58, 59, 68-71: remove if not needed

Collaborator Author

Agreed.

character(len=*),parameter:: MYNAME="m_fedNode"

!#define CHECKSUM_VERBOSE
!#define DEBUG_TRACE
Contributor

Line 84-85: remove if not needed

Collaborator Author

Agreed.

type is(fedNode)
ptr_ => aNode
! class default
! call die(myname_,'unexpected type, aNode%mytype() =',aNode%mytype())
Contributor

Line 101-102: remove if not needed

Collaborator Author

Agreed.

if(.not.associated(aNode)) return
select type(aNode)
type is(fedNode)
ptr_ => aNode
Contributor

Not aligned

Collaborator Author

As suggested, I improved the indentation.

!
!
DO i=1,numfed
DO k=1,maxlvl
Contributor

Should change 'DO' to lower case to be consistent with 'end do' at the end of loop?

Collaborator Author

I agree.


write(6,*) ' ------- check max and min value of OBS: bufr fed -------'
write(6,*) ' level maxval(fed) minval(fed)'
DO k=1,maxlvl
Contributor

Same here, change the 'DO' to lower case 'do' ?

Collaborator Author

Agreed.

fed_max=-999.99
ndata2=0
DO i=1,numfed
DO k=1,maxlvl
Contributor

Same here, change the 'DO' to lower case 'do' for Line 395 & 396?

Collaborator Author

Agreed.

real(r_kind),dimension(nobs) :: FEDMdiag,FEDMdiagTL
real(r_kind),dimension(nobs) :: FEDMdiag2D
integer(i_kind) :: npt
integer(i_kind) :: nobsfed
Contributor

delete the space before :: to align the line above

Collaborator Author

Done.

do ifld=2,nfldsig
call gsi_bundlegetpointer(gsi_metguess_bundle(ifld),trim(varname),rank3,istatus)
ges_q(:,:,:,ifld)=rank3
enddo
Contributor

'enddo' -> 'end do'

Collaborator Author

Changed "enddo" to "end do" in 13 locations.

else
write(6,*) trim(myname),': ', trim(varname), ' not found in met bundle, ier= ',istatus
call stop2(999)
endif
Contributor

'endif' -> 'end if'

Collaborator Author

Change "endif" to "end if" in 31 locations.

enddo
endif

end subroutine contents_binary_diag_
Contributor

There are too many 'endif'/'enddo' statements that I think are not consistent with GSI's format; please correct all of them in this subroutine.

Collaborator Author

Done.

endif

end subroutine contents_binary_diag_
subroutine contents_netcdf_diag_(odiag)
Contributor

There are too many 'endif'/'enddo' statements that are not consistent with GSI's format; please correct all of them in this subroutine too.

Collaborator Author

Done.

if(nread > 0)then
if(first)then
open(iout_fed)
else
Contributor

correct the indent before 'else'

Collaborator Author

Done.

@ShunLiu-NOAA
Contributor

@daviddowellNOAA Could you please update this PR? The due date of this PR is 8/24/2023.

Contributor

@guoqing-noaa guoqing-noaa left a comment

LGTM. Thanks!

@ShunLiu-NOAA
Contributor

@xyzemc could you please review David's change?
@daviddowellNOAA could you start a regression test and post regression test result here?

@daviddowellNOAA
Collaborator Author

@xyzemc and @guoqing-noaa Thanks for the reviews. I made all the changes you suggested.

Following the code changes, I compiled GSI on Jet and ran the Observer with conventional, radar-reflectivity, and FED observations. The GSI Observer produces identical results to what it did previously. That is the expected result.

@ShunLiu-NOAA The PR is now updated.

@daviddowellNOAA
Collaborator Author

@ShunLiu-NOAA I don't know how to do a regression test, so I'll need some help.

@ShunLiu-NOAA
Contributor

@daviddowellNOAA
Collaborator Author

Thanks @ShunLiu-NOAA! I'll attempt a regression test today and report results.

@daviddowellNOAA
Collaborator Author

Here are the results of the regression tests. The number of failures is discouraging. What's the next step in the process?

Start 1: [=[global_3dvar]=]
Start 2: [=[global_4dvar]=]

1/9 Test #1: [=[global_3dvar]=] ...............***Failed 120.45 sec
Start 3: [=[global_4denvar]=]
2/9 Test #3: [=[global_4denvar]=] .............***Failed 180.59 sec
Start 4: [=[hwrf_nmm_d2]=]
3/9 Test #4: [=[hwrf_nmm_d2]=] ................ Passed 788.70 sec
Start 5: [=[hwrf_nmm_d3]=]
4/9 Test #2: [=[global_4dvar]=] ............... Passed 1806.23 sec
Start 6: [=[rtma]=]
5/9 Test #5: [=[hwrf_nmm_d3]=] ................ Passed 737.78 sec
Start 7: [=[rrfs_3denvar_glbens]=]
6/9 Test #7: [=[rrfs_3denvar_glbens]=] ........ Passed 862.16 sec
Start 8: [=[netcdf_fv3_regional]=]
7/9 Test #8: [=[netcdf_fv3_regional]=] ........***Failed 366.28 sec
Start 9: [=[global_enkf]=]
8/9 Test #9: [=[global_enkf]=] ................***Failed 481.61 sec
9/9 Test #6: [=[rtma]=] ....................... Passed 2114.74 sec

56% tests passed, 4 tests failed out of 9

Total Test time (real) = 3921.11 sec

The following tests FAILED:
1 - [=[global_3dvar]=] (Failed)
3 - [=[global_4denvar]=] (Failed)
8 - [=[netcdf_fv3_regional]=] (Failed)
9 - [=[global_enkf]=] (Failed)
Errors while running CTest

@hu5970
Collaborator

hu5970 commented Aug 22, 2023

@daviddowellNOAA Please point me to the location of your regression test cases. There are many reasons that could cause a failure; some of them may not be from the new code.

@daviddowellNOAA
Collaborator Author

Thanks @hu5970. I ran the regression tests here on Hera: /scratch2/BMC/wrfruc/dowell/GSI

@hu5970
Collaborator

hu5970 commented Aug 23, 2023

@daviddowellNOAA

For the first 2 cases
1 - [=[global_3dvar]=] (Failed)
3 - [=[global_4denvar]=] (Failed)

The reason for the crash is that you do not have permission to read the observation files:

permission to access file denied, unit 15, file /scratch2/BMC/wrfruc/dowell/GSI/tmpreg_global_3dvar/global_3dvar_loproc_updat/prepbufr_profl

So I (or someone else) will have to run those two regression test suites for you.

The "netcdf_fv3_regional" failed because of memory check. This is not a critical failure:
/scratch2/BMC/wrfruc/dowell/GSI/tmpreg_netcdf_fv3_regional/tmpreg_netcdf_fv3_regional/netcdf_fv3_regional_loproc_updat_vs_netcdf_fv3_regional_loproc_contrl/netcdf_fv3_regional_regression_results.txt

The global_enkf test needs more investigation. The control run failed because of the namelist:
/scratch2/BMC/wrfruc/dowell/GSI/tmpreg_global_enkf/global_enkf_loproc_contrl/stderr

Did you change any namelist option for this test?

Thanks,
Ming

@daviddowellNOAA
Collaborator Author

@hu5970 Thanks for the investigation! For the global_enkf test, for some reason the regression_namelists.sh in my branch differed in two ways from the regression_namelists.sh in develop. When I make the namelists consistent, my branch passes the global_enkf test.

Start 9: [=[global_enkf]=]

1/1 Test #9: [=[global_enkf]=] ................ Passed 2484.58 sec

100% tests passed, 0 tests failed out of 1

Total Test time (real) = 2484.60 sec

@guoqing-noaa
Contributor

@hu5970 @daviddowellNOAA
Can we piggyback the bug fix on gsi/constants.f90?

--- a/gsi/constants.f90
+++ b/gsi/constants.f90
@@ -90,7 +90,7 @@ module constants
 
 ! Declare derived constants
   integer(i_kind):: huge_i_kind
-  integer(i_kind), parameter :: max_varname_length=20
+  integer(i_kind), parameter :: max_varname_length=60
   real(r_single):: tiny_single, huge_single

@daviddowellNOAA
Collaborator Author

@guoqing-noaa Works for me.

We still have the issue of being unable to run the global_3dvar and global_4denvar regression tests owing to file permissions. What's our next step @hu5970 @ShunLiu-NOAA ?

@guoqing-noaa
Contributor

@daviddowellNOAA You cannot access those files because you are not in the "rstprod" group (restricted data).
However, since these data are dated Nov 11, 2022, they should no longer be restricted now.

@guoqing-noaa
Contributor

@daviddowellNOAA I can run the regression test for you. Let me know if there are any special things I need to pay attention to.

@daviddowellNOAA
Collaborator Author

Thanks @guoqing-noaa . The only trick with my branch seems to be the regression/regression_namelists.sh file. Two changes are needed:

(1) Remove use_qsatensmean namelist parameter, i.e., change line 2172 from
nhr_anal=${IAUFHRS_ENKF},nhr_state=${IAUFHRS_ENKF},use_qsatensmean=.true.,
to
nhr_anal=${IAUFHRS_ENKF},nhr_state=${IAUFHRS_ENKF},

(2) Change pseudo_rh from .true. to .false. on line 2158.
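Putting the two edits together, the affected settings in regression/regression_namelists.sh would read roughly as follows (only the two changed settings are shown; the exact layout of line 2158 around pseudo_rh is an assumption):

line 2158:   pseudo_rh=.false.,
line 2172:   nhr_anal=${IAUFHRS_ENKF},nhr_state=${IAUFHRS_ENKF},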

@daviddowellNOAA
Collaborator Author

Awesome. Thanks @guoqing-noaa . I'm taking notes so I'll know how to do this next time.

@guoqing-noaa
Contributor

The regression test results are at: /scratch2/BMC/wrfruc/gge/GSI.ctests

    Start 1: [=[global_3dvar]=]          
1/9 Test #1: [=[global_3dvar]=] ...............   Passed  2714.60 sec                                                                                   
    Start 2: [=[global_4dvar]=]            
2/9 Test #2: [=[global_4dvar]=] ...............   Passed  2416.12 sec              
    Start 3: [=[global_4denvar]=]        
3/9 Test #3: [=[global_4denvar]=] .............   Passed  1746.57 sec                 
    Start 4: [=[hwrf_nmm_d2]=]             
4/9 Test #4: [=[hwrf_nmm_d2]=] ................   Passed  610.55 sec                  
    Start 5: [=[hwrf_nmm_d3]=]             
5/9 Test #5: [=[hwrf_nmm_d3]=] ................   Passed  556.27 sec                  
    Start 6: [=[rtma]=]                    
6/9 Test #6: [=[rtma]=] .......................   Passed  2134.00 sec                 
    Start 7: [=[rrfs_3denvar_glbens]=]     
7/9 Test #7: [=[rrfs_3denvar_glbens]=] ........   Passed  671.08 sec                  
    Start 8: [=[netcdf_fv3_regional]=]     
8/9 Test #8: [=[netcdf_fv3_regional]=] ........***Failed  486.82 sec                  
    Start 9: [=[global_enkf]=]             
9/9 Test #9: [=[global_enkf]=] ................   Passed  743.24 sec                  

89% tests passed, 1 tests failed out of 9  

Total Test time (real) = 12079.36 sec 

The netcdf_fv3_regional case failed only because it failed the scalability test.

The runtime for netcdf_fv3_regional_loproc_updat is 78.436049 seconds and is within the maximum allowable operational time of 1200 seconds,                                 
continuing with regression test.           

The runtime for netcdf_fv3_regional_loproc_updat is 78.436049 seconds and is within the allowable threshold time of 100.704785 seconds,                                     
continuing with regression test.           

The runtime for netcdf_fv3_regional_hiproc_updat is 70.789409 seconds and is within the allowable threshold time of 84.600691 seconds,                                      
continuing with regression test.           

The memory for netcdf_fv3_regional_loproc_updat is 173836 KBs and is within the maximum allowable hardware memory limit of 2516582 KBs,                                     
continuing with regression test.           

The memory for netcdf_fv3_regional_loproc_updat is 173836 KBs and is within the maximum allowable memory of 191589 KBs,                                                     
continuing with regression test.           

The results (penalty) between the two runs (netcdf_fv3_regional_loproc_updat and netcdf_fv3_regional_loproc_contrl) are reproducible.                                       

The results (penalty) between the two runs (netcdf_fv3_regional_loproc_updat and netcdf_fv3_regional_hiproc_updat) are reproducible                                         

The case has Failed the scalability test.  
The slope for the update (9.558300 seconds per node) is less than that for the control (12.883275 seconds per node).   

@daviddowellNOAA
Collaborator Author

Thanks @guoqing-noaa!

Here's what I got for the netcdf_fv3_regional regression test with the latest code:

The runtime for netcdf_fv3_regional_loproc_updat is 31.956025 seconds and is within the maximum allowable operational time of 1200 seconds,
continuing with regression test.

The runtime for netcdf_fv3_regional_loproc_updat is 31.956025 seconds and is within the allowable threshold time of 38.683795 seconds,
continuing with regression test.

The runtime for netcdf_fv3_regional_hiproc_updat is 30.413437 seconds and is within the allowable threshold time of 36.823723 seconds,
continuing with regression test.

The memory for netcdf_fv3_regional_loproc_updat is 35247860 KBs. This has exceeded maximum allowable hardware memory limit of 2516582 KBs,
resulting in Failure maxmem of the regression test.

The memory for netcdf_fv3_regional_loproc_updat is 35247860 KBs and is within the maximum allowable memory of 38772839 KBs,
continuing with regression test.

The results (penalty) between the two runs (netcdf_fv3_regional_loproc_updat and netcdf_fv3_regional_loproc_contrl) are reproducible.

The results (penalty) between the two runs (netcdf_fv3_regional_loproc_updat and netcdf_fv3_regional_hiproc_updat) are reproducible

The case has passed the scalability regression test.
The slope for the update (1.928235 seconds per node) is greater than or equal to that for the control (1.488057 seconds per node).

It seems results for this regression test are not reproducible.

@daviddowellNOAA
Collaborator Author

I also tested the code, after the sync with the latest develop, for the GSI observer test case (with conventional, radar-reflectivity, and FED observations) on Jet. It continues to produce identical results to previous tests.

@guoqing-noaa
Contributor

guoqing-noaa commented Aug 24, 2023

@daviddowellNOAA You are right. The netcdf_fv3_regional test may depend on the HPC load balance. I reran this case and this time it passed! (I did not make any changes)

ctest -R netcdf_fv3 

    Start 8: [=[netcdf_fv3_regional]=]     
1/1 Test #8: [=[netcdf_fv3_regional]=] ........   Passed  2948.80 sec              

100% tests passed, 0 tests failed out of 1     

Total Test time (real) = 2948.82 sec     

@ShunLiu-NOAA I would think this PR is ready to be merged. Thanks!

ShunLiu-NOAA
ShunLiu-NOAA previously approved these changes Aug 25, 2023
@hu5970
Collaborator

hu5970 commented Aug 27, 2023

I did regression test for this PR on WCOSS2:

[ming.hu@clogin01 build] ctest -j9
Test project /lfs/h2/emc/ptmp/Ming.Hu/test/GSI/build
    Start 1: global_3dvar
    Start 2: global_4dvar
    Start 3: global_4denvar
    Start 4: hwrf_nmm_d2
    Start 5: hwrf_nmm_d3
    Start 6: rtma
    Start 7: rrfs_3denvar_glbens
    Start 8: netcdf_fv3_regional
    Start 9: global_enkf
1/9 Test #8: netcdf_fv3_regional ..............   Passed  484.98 sec
2/9 Test #5: hwrf_nmm_d3 ......................   Passed  495.88 sec
3/9 Test #9: global_enkf ......................   Passed  614.50 sec
4/9 Test #4: hwrf_nmm_d2 ......................   Passed  666.96 sec
5/9 Test #7: rrfs_3denvar_glbens ..............   Passed  724.93 sec
6/9 Test #6: rtma .............................***Failed  1226.90 sec
7/9 Test #3: global_4denvar ...................   Passed  1443.18 sec
8/9 Test #1: global_3dvar .....................   Passed  1564.93 sec
9/9 Test #2: global_4dvar .....................   Passed  1689.94 sec

89% tests passed, 1 tests failed out of 9

Total Test time (real) = 1689.96 sec

The following tests FAILED:
	  6 - rtma (Failed)
Errors while running CTest

The reason for RTMA failure is:
The case has Failed the scalability test.
This is not a critical failure.

@RussTreadon-NOAA
Contributor

Reverting to the careful work @MichaelLueken did as GSI code manager, I built GSI_FED and develop with BUILD_TYPE=Debug in ush/build.sh.

Debug compilation flagged numerous warnings and remarks in the *_fed source code files.

warning

/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(142): warning #8889: Explicit declaration of the EXTERNAL attribute is required.   [STOP2]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(177): warning #8889: Explicit declaration of the EXTERNAL attribute is required.   [GETCOUNT_BUFR]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(185): warning #8889: Explicit declaration of the EXTERNAL attribute is required.   [OPENBF]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(187): warning #8889: Explicit declaration of the EXTERNAL attribute is required.   [DXDUMP]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(188): warning #8889: Explicit declaration of the EXTERNAL attribute is required.   [DATELEN]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(199): warning #8889: Explicit declaration of the EXTERNAL attribute is required.   [STOP2]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(205): warning #8889: Explicit declaration of the EXTERNAL attribute is required.   [STOP2]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(209): warning #8889: Explicit declaration of the EXTERNAL attribute is required.   [UFBINT]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(234): warning #8889: Explicit declaration of the EXTERNAL attribute is required.   [UFBINT]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(283): warning #8889: Explicit declaration of the EXTERNAL attribute is required.   [CLOSBF]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(343): warning #8889: Explicit declaration of the EXTERNAL attribute is required.   [W3FS21]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(503): warning #8889: Explicit declaration of the EXTERNAL attribute is required.   [COUNT_OBS]

remark

/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(32): remark #7712: This variable has not been used.   [RAD2DEG]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(33): remark #7712: This variable has not been used.   [CGROSS]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(33): remark #7712: This variable has not been used.   [CERMAX]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(33): remark #7712: This variable has not been used.   [CERMIN]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(33): remark #7712: This variable has not been used.   [CVAR_B]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(33): remark #7712: This variable has not been used.   [CVAR_PG]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(34): remark #7712: This variable has not been used.   [NCMITER]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(34): remark #7712: This variable has not been used.   [NCGROUP]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(34): remark #7712: This variable has not been used.   [NCNUMGRP]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(34): remark #7712: This variable has not been used.   [ICTYPE]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(34): remark #7712: This variable has not been used.   [ICSUBTYPE]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(39): remark #7712: This variable has not been used.   [TIME_WINDOW]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(70): remark #7712: This variable has not been used.   [MAXFED]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(100): remark #7712: This variable has not been used.   [NUMOBSA]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(122): remark #7712: This variable has not been used.   [IDATE5]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(122): remark #7712: This variable has not been used.   [MINS_OB]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/read_fed.f90(124): remark #7712: This variable has not been used.   [IDATE5S]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/gsi_fedOper.F90(150): remark #7712: This variable has not been used.   [SELF]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/gsi_fedOper.F90(130): remark #7712: This variable has not been used.   [SELF]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/gsi_fedOper.F90(66): remark #7712: This variable has not been used.   [LAST_PASS]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/gsi_fedOper.F90(150): remark #7712: This variable has not been used.   [IBIN]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/gsi_fedOper.F90(130): remark #7712: This variable has not been used.   [IBIN]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/gsi_fedOper.F90(130): remark #7712: This variable has not been used.   [RVAL]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/gsi_fedOper.F90(130): remark #7712: This variable has not been used.   [SVAL]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/gsi_fedOper.F90(170): remark #7712: This variable has not been used.   [HEADNODE]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/gsi_fedOper.F90(146): remark #7712: This variable has not been used.   [HEADNODE]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/gsi_fedOper.F90(150): remark #7712: This variable has not been used.   [DVAL]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/gsi_fedOper.F90(150): remark #7712: This variable has not been used.   [XVAL]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/gsi_fedOper.F90(150): remark #7712: This variable has not been used.   [PBCJO]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/gsi_fedOper.F90(150): remark #7712: This variable has not been used.   [SGES]
/lfs/h1/emc/da/noscrub/russ.treadon/git/gsi/pr590/src/gsi/gsi_fedOper.F90(150): remark #7712: This variable has not been used.   [NSTEP]

GSI Coding Standards require removal of unused variables in newly added source code

Unused variables must be removed from the code. Unused variables trigger compiler warnings. NCO requires codes to compile without compiler warnings.
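For illustration, here is a hedged sketch of the two kinds of cleanup involved; it is not the PR's actual diff, and the routine and variable names are taken only from the warning and remark messages above.

! Warning #8889: give the called library routines an explicit EXTERNAL
! attribute in the declaration section, for example
      external :: stop2, openbf, closbf, datelen, ufbint

! Remark #7712: delete the unused declaration entirely; for example, an
! unused rad2deg would simply be removed rather than left in the code.
!     real(r_kind) :: rad2deg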

The debug builds of develop and GSI_FED generate identical netcdf_fv3_regional results when using my WCOSS2 account.

Test project /lfs/h2/emc/da/noscrub/russ.treadon/git/gsi/pr590/build
    Start 8: netcdf_fv3_regional
1/1 Test #8: netcdf_fv3_regional ..............   Passed  962.03 sec

100% tests passed, 0 tests failed out of 1

Total Test time (real) = 962.04 sec

@RussTreadon-NOAA
Contributor

Looks like many of the "variable has not been used" remarks for gsi_fedOper.F90 are due to the fact that the intjo and stpjo routines in this file are not complete. This is consistent with remarks in issue #588. I wonder if this is why I see differences in the minimization with radiances? Of course, this does not explain why things work for @hu5970.

@hu5970
Collaborator

hu5970 commented Aug 30, 2023

I will work with David to clean those unused variables and make a clean branch from current head of the develop branch.

@RussTreadon-NOAA
Contributor

I will work with David to clean those unused variables and make a clean branch from current head of the develop branch.

Thank you @hu5970 . Not all the unused variables can be removed at present. Some are unused because they are placeholders for future development.

@RussTreadon-NOAA
Contributor

Debug tests
Compile develop and GSI_FED with BUILD_TYPE=Debug. Run ctests with the following results:

russ.treadon@alogin01:/lfs/h2/emc/da/noscrub/russ.treadon/git/gsi/pr590/build> ctest -j 9
Test project /lfs/h2/emc/da/noscrub/russ.treadon/git/gsi/pr590/build
    Start 1: global_3dvar
    Start 2: global_4dvar
    Start 3: global_4denvar
    Start 4: hwrf_nmm_d2
    Start 5: hwrf_nmm_d3
    Start 6: rtma
    Start 7: rrfs_3denvar_glbens
    Start 8: netcdf_fv3_regional
    Start 9: global_enkf
1/9 Test #9: global_enkf ......................***Failed   60.17 sec
2/9 Test #3: global_4denvar ...................***Failed  120.15 sec
3/9 Test #1: global_3dvar .....................***Failed  120.16 sec
4/9 Test #5: hwrf_nmm_d3 ......................***Failed  120.15 sec
5/9 Test #8: netcdf_fv3_regional ..............   Passed  962.08 sec
6/9 Test #7: rrfs_3denvar_glbens ..............   Passed  1083.85 sec
7/9 Test #4: hwrf_nmm_d2 ......................   Passed  2585.09 sec
8/9 Test #2: global_4dvar .....................   Passed  4561.93 sec

The debug gsi.x and enkf.x generated the following errors for the indicated ctests
global_enkf

forrtl: severe (408): fort: (2): Subscript #1 of the array LONSGRD has value 18433 which is greater than the upper bound of 18432

Image              PC                Routine            Line        Source
enkf.x             00000000015ACE2F  Unknown               Unknown  Unknown
enkf.x             0000000000972B6A  observer_enkf_mp_          98  observer_gfs.f90
enkf.x             0000000000610286  readconvobs_mp_ge         761  readconvobs.f90

global_4denvar

forrtl: severe (408): fort: (3): Subscript #1 of the array TEMPERATURE has value 0 which is less than the lower bound of 1

Image              PC                Routine            Line        Source
gsi.x              0000000007A5B8EF  Unknown               Unknown  Unknown
gsi.x              0000000004F2B45F  read_iasi_                830  read_iasi.f90
gsi.x              00000000019E3AD6  read_obsmod_mp_re        1742  read_obs.F90

global_3dvar

forrtl: severe (408): fort: (3): Subscript #1 of the array TEMPERATURE has value 0 which is less than the lower bound of 1

Image              PC                Routine            Line        Source
gsi.x              0000000007A5B8EF  Unknown               Unknown  Unknown
gsi.x              0000000004F2B45F  read_iasi_                830  read_iasi.f90
gsi.x              00000000019E3AD6  read_obsmod_mp_re        1742  read_obs.F90

hwrf_nmm_d3

forrtl: severe (408): fort: (7): Attempt to use pointer RV_OZ when it is not associated with a target

Image              PC                Routine            Line        Source
gsi.x              0000000007A5B8EF  Unknown               Unknown  Unknown
gsi.x              00000000038C1440  ensctl2state_ad_          211  ensctl2state_ad.f90
libiomp5.so        000014C395C563F3  __kmp_invoke_micr     Unknown  Unknown
libiomp5.so        000014C395BDA937  __kmp_fork_call       Unknown  Unknown
libiomp5.so        000014C395B9E533  __kmpc_fork_call      Unknown  Unknown
gsi.x              00000000038BBB23  ensctl2state_ad_          191  ensctl2state_ad.f90
gsi.x              00000000047DF95E  pcgsoimodpcgsoi_m        1004  pcgsoi.f90

rtma
The rtma_loproc_contrl test aborted with an NFS (file system) error. The loproc and hiproc updat jobs aborted with the following error:

forrtl: error (63): output conversion error, unit 218, file /lfs/h1/emc/ptmp/russ.treadon/pr590/tmpreg_rtma/rtma_h\
iproc_updat/fort.218
Image              PC                Routine            Line        Source
gsi.x              0000000007A59AFB  Unknown               Unknown  Unknown
gsi.x              0000000007AB6217  Unknown               Unknown  Unknown
gsi.x              000000000255878B  statsconv_                548  statsconv.f90
gsi.x              00000000023E93F4  setuprhsall_              626  setuprhsall.f90

It is likely that additional errors exist in the code. One would need to modify the code to address the above errors, recompile, and rerun. This cycle would likely need to be repeated several times before all tests ran to completion in debug mode.

The errors above are not unique to GSI_FED. They exist in develop. As such correction of these errors is outside the scope of this PR.

Absent a GSI code manager, it is unlikely that the above and other errors caught by debug compilations will be fixed any time soon.

@RussTreadon-NOAA
Contributor

WCOSS2 (Dogwood) ctests

Manually merge authoritative develop at be4a3d9 into daviddowellNOAA:GSI_FED. Build develop and GSI_FED. Run ctests with the following results

russ.treadon@dlogin04:/lfs/h2/emc/da/noscrub/russ.treadon/git/gsi/pr590/build> ctest -j 9
Test project /lfs/h2/emc/da/noscrub/russ.treadon/git/gsi/pr590/build
    Start 8: netcdf_fv3_regional
    Start 1: global_3dvar
    Start 2: global_4dvar
    Start 3: global_4denvar
    Start 4: hwrf_nmm_d2
    Start 5: hwrf_nmm_d3
    Start 6: rtma
    Start 7: rrfs_3denvar_glbens
    Start 9: global_enkf
1/9 Test #8: netcdf_fv3_regional ..............***Failed  482.51 sec
2/9 Test #5: hwrf_nmm_d3 ......................   Passed  492.44 sec
3/9 Test #7: rrfs_3denvar_glbens ..............   Passed  545.98 sec
4/9 Test #4: hwrf_nmm_d2 ......................   Passed  606.20 sec
5/9 Test #9: global_enkf ......................   Passed  608.51 sec
6/9 Test #6: rtma .............................   Passed  1088.34 sec
7/9 Test #3: global_4denvar ...................   Passed  1322.28 sec
8/9 Test #1: global_3dvar .....................   Passed  1442.34 sec
9/9 Test #2: global_4dvar .....................   Passed  1502.14 sec

89% tests passed, 1 tests failed out of 9

Total Test time (real) = 1502.22 sec

The following tests FAILED:
          8 - netcdf_fv3_regional (Failed)
Errors while running CTest

The netcdf_fv3_regional failure is due to

The case has Failed the scalability test.
The slope for the update (1.026056 seconds per node) is less than that for the control (1.846044 seconds per node).

A check of the wall times does not reveal any anomalous behavior.

netcdf_fv3_regional_hiproc_contrl/stdout:The total amount of wall time                        = 64.182251
netcdf_fv3_regional_hiproc_updat/stdout:The total amount of wall time                        = 64.312029
netcdf_fv3_regional_loproc_contrl/stdout:The total amount of wall time                        = 66.028295
netcdf_fv3_regional_loproc_updat/stdout:The total amount of wall time                        = 65.132874

This is not a fatal failure.

The Dogwood ctests results are consistent with @hu5970 Cactus results. That is, the changes in GSI_FED do not alter ctest results. This is expected since none of the ctests use FED data. Furthermore, this PR is only the first step of adding FED to the GSI. Even if FED data were present, it would not be assimilated.

@hu5970
Collaborator

hu5970 commented Sep 1, 2023

@RussTreadon-NOAA Thanks for testing the PR on Dogwood. I will work with David to clean up the code based on the compiler information from the "Debug" build. After the cleanup, I will ask Guoqing and Xiaoyan to review the code again. If they do not have problems and the regression tests pass again, I will ask for the code to be merged.

write(6,*) 'fed_highbnd=',fed_highbnd
write(6,*) 'maxval(ges_qg)=',maxval(ges_qg),'pe=',mype
! write(6,*) 'maxval(geop_hgtl)=',maxval(geop_hgtl(:,:,:,it))
write(6,*) 'maxval(ges_tsen)=',maxval(ges_tsen(:,:,:,it))
Collaborator

@daviddowellNOAA @hu5970 Please delete or comment out the two lines below:
write(6,*) 'maxval(ges_tsen)=',maxval(ges_tsen(:,:,:,it))
write(6,*) 'ges_prsi',ges_prsi(100,100,1,1),ges_prsi(100,100,nsig,1)
The ges_tsen write will crash GSI in debug mode. Thanks.
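A minimal sketch of the requested change, with the two diagnostic prints commented out in place (deleting them outright works just as well):

!   write(6,*) 'maxval(ges_tsen)=',maxval(ges_tsen(:,:,:,it))
!   write(6,*) 'ges_prsi',ges_prsi(100,100,1,1),ges_prsi(100,100,nsig,1)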

@daviddowellNOAA
Collaborator Author

daviddowellNOAA commented Sep 7, 2023

Here are my comments copied from pull request #1 from @hu5970, which is now merged into the current PR.

Thanks @hu5970. The proposed changes look good. The modifications to read_fed.f90 remove unused variables and also add a helpful check for valid lat-lon values. The changes to setupfed.f90 remove unused variables and also remove lines that were used for testing purposes but aren't actually needed for the new functionality.

I compiled GSI with these changes and ran a test case of the GSI observer that includes FED observations. The results are identical to those from runs of the test case with GSI before the current changes in this PR.

@hu5970 hu5970 self-assigned this Sep 7, 2023
@RussTreadon-NOAA RussTreadon-NOAA mentioned this pull request Sep 7, 2023
@hu5970
Collaborator

hu5970 commented Sep 7, 2023

The regression test on WCOSS2:

[ming.hu@dlogin08 build] ctest -j9

Test project /lfs/h2/emc/ptmp/Ming.Hu/pr590/GSI/build
    Start 1: global_3dvar
    Start 2: global_4dvar
    Start 3: global_4denvar
    Start 4: hwrf_nmm_d2
    Start 5: hwrf_nmm_d3
    Start 6: rtma
    Start 7: rrfs_3denvar_glbens
    Start 8: netcdf_fv3_regional
    Start 9: global_enkf

1/9 Test #8: netcdf_fv3_regional ..............   Passed  723.04 sec
2/9 Test #9: global_enkf ......................   Passed  792.56 sec
3/9 Test #7: rrfs_3denvar_glbens ..............   Passed  845.08 sec
4/9 Test #5: hwrf_nmm_d3 ......................***Failed  1332.43 sec
5/9 Test #4: hwrf_nmm_d2 ......................   Passed  1386.61 sec
6/9 Test #6: rtma .............................   Passed  1628.75 sec
7/9 Test #2: global_4dvar .....................   Passed  1682.29 sec
8/9 Test #3: global_4denvar ...................   Passed  1742.37 sec
9/9 Test #1: global_3dvar .....................   Passed  2103.07 sec

89% tests passed, 1 tests failed out of 9

Total Test time (real) = 2103.14 sec

The following tests FAILED:
          5 - hwrf_nmm_d3 (Failed)

The reason of "hwrf_nmm_d3" failure is "The case has Failed the scalability test". It is not a fatal failure.

@hu5970
Collaborator

hu5970 commented Sep 7, 2023

The regression test results on Hera:

Test project /scratch1/BMC/wrfruc/mhu/gsi/pr590/GSI/build
    Start 1: global_3dvar
    Start 2: global_4dvar
    Start 3: global_4denvar
    Start 4: hwrf_nmm_d2
    Start 5: hwrf_nmm_d3
    Start 6: rtma
    Start 7: rrfs_3denvar_glbens
    Start 8: netcdf_fv3_regional
    Start 9: global_enkf
1/9 Test #8: netcdf_fv3_regional ..............   Passed  548.61 sec
2/9 Test #5: hwrf_nmm_d3 ......................***Failed  562.91 sec
3/9 Test #4: hwrf_nmm_d2 ......................***Failed  610.21 sec
4/9 Test #7: rrfs_3denvar_glbens ..............***Failed  614.57 sec
5/9 Test #9: global_enkf ......................   Passed  1033.00 sec
6/9 Test #6: rtma .............................   Passed  1211.55 sec
7/9 Test #3: global_4denvar ...................   Passed  1673.48 sec
8/9 Test #2: global_4dvar .....................   Passed  1683.89 sec
9/9 Test #1: global_3dvar .....................   Passed  1850.22 sec

67% tests passed, 3 tests failed out of 9

Total Test time (real) = 1850.22 sec

The following tests FAILED:
	  4 - hwrf_nmm_d2 (Failed)
	  5 - hwrf_nmm_d3 (Failed)
	  7 - rrfs_3denvar_glbens (Failed)
Errors while running CTest

The test results are on: /scratch1/BMC/wrfruc/mhu/gsi/pr590
Checking the results files from the "hwrf_nmm_d3", "hwrf_nmm_d2", and "rrfs_3denvar_glbens" cases shows that all three failed only with "The case has Failed the scalability test". This is not a fatal failure.

@hu5970 hu5970 merged commit d7ac706 into NOAA-EMC:develop Sep 8, 2023
4 checks passed
Successfully merging this pull request may close these issues.

Adding GLM Flash Extent Density to GSI Observer