D. GEOS ADAS FSOI

rtodling edited this page Mar 10, 2021 · 21 revisions

Generating FSOI from GEOS ADAS

The GMAO strategy for generating observation impacts combines the work of Langland and Baker (2004) and that of Tremolet (2008) through the so-called Forecast-based Sensitivity Observation Impact (FSOI) methodology; see also Gelaro et al. (2010) and Diniz and Todling (2019). Without getting into the technical aspects of the procedures, a rough understanding is necessary to know what needs to be available for GEOS ADAS to produce this information. Beyond the regular model (AGCM) and analysis (GSI) pieces, FSOI has two main components, namely, the adjoints of both of these. The measure of impact brought about by assimilating observations is associated with how sensitive the analysis component (GSI) is to minor changes in the observations; changes in the observations induce changes in the analysis (IAU tendencies) and therefore also involve sensitivities of the model (forecast) integration. For a variety of reasons (beyond the discussion here), it is the impact of observations on the 24-hour forecasts that is derived in GEOS ADAS. Changing the lead time of the forecasts, and consequently the lead time of the impacts, is trivial in GEOS ADAS and has been studied to some extent (e.g., Prive et al. (2020)), though caution must be exercised regarding the validity of results beyond 24 hours.

Whenever we talk about the sensitivity of the analysis or of the forecasting model, we are basically talking about the adjoints of the linearized operators corresponding to these entities. Sensitivities to changes in the initial conditions are automatically derived in the forecast step of some of the experiment templates introduced earlier. This is the case for the templates of both the x-experiments and prePP. We saw that in some of these cases, 33-hour and 123-hour forecasts are automatically issued. When the machinery of FSOI is being exercised, the 33-hour forecast is used to generate sensitivities to a 30-hour forecast and the 123-hour forecast is used to generate sensitivities to a 24-hour forecast, both associated with the same verification time. The generated sensitivities are automatically placed in the directory FVHOME/asens and have names like the following:

x0044.fsens_twe.eta.17760703_15z+17760705_00z-17760704_00z.nc4
x0044.fsens_twe.eta.17760703_21z+17760705_00z-17760704_00z.nc4

The three date/time tags in the filenames follow the pattern initial_date/time+final_fcst_date/time-analysis_date/time. The filenames above relate to forecasts started six hours apart, one at 1500 UTC and another at 2100 UTC on 3 July 1776; both forecasts end at 0000 UTC on 5 July 1776. The first is therefore a 33-hour forecast and the second a 27-hour forecast; both run the adjoint model (backwards in time) to generate 24-hour sensitivities valid at 0000 UTC on 4 July 1776. The valid time of these sensitivities is thus the last date/time in the sequence of time tags. These two sensitivities are required by the Langland and Baker (2004) formulation to derive a quantitative assessment of the impact on the 24-hour forecast of assimilating observations at 0000 UTC on 4 July 1776. Notice that in development-type experiments the production of these sensitivities is controlled within the g5fcst.j jobs. The generation of these forecast sensitivities is easily turned off by renaming the file FVHOME/fcst/initadj.rc (more on this below).
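The naming convention just described can be checked with a small parser. The following is a minimal Python sketch, not part of the GEOS ADAS scripts; the function name, the regular expression, and the dictionary keys are all hypothetical and only assume the file name layout shown in the two examples above.

```python
import re
from datetime import datetime

# Pattern for names such as
#   x0044.fsens_twe.eta.17760703_15z+17760705_00z-17760704_00z.nc4
# The group names are illustrative and not taken from the GEOS ADAS scripts.
FSENS_RE = re.compile(
    r"^(?P<expid>[^.]+)\.fsens_(?P<norm>[^.]+)\.eta\."
    r"(?P<init>\d{8}_\d{2})z\+(?P<final>\d{8}_\d{2})z-(?P<ana>\d{8}_\d{2})z\.nc4$"
)


def _to_datetime(tag):
    """Convert a tag like '17760703_15' to a datetime."""
    return datetime.strptime(tag, "%Y%m%d_%H")


def _hours_between(start, end):
    """Whole hours from start to end."""
    return int((end - start).total_seconds() // 3600)


def parse_fsens_name(name):
    """Split a forecast sensitivity file name into its time tags."""
    m = FSENS_RE.match(name)
    if m is None:
        raise ValueError(f"not a forecast sensitivity file name: {name}")
    init = _to_datetime(m.group("init"))
    final = _to_datetime(m.group("final"))
    ana = _to_datetime(m.group("ana"))
    return {
        "expid": m.group("expid"),
        "norm": m.group("norm"),
        # Length of the forward forecast (initial time to final time).
        "forecast_hours": _hours_between(init, final),
        # Span of the backward (adjoint) integration, final time to valid time.
        "adjoint_hours": _hours_between(ana, final),
        # Valid date/time of the sensitivity (last tag in the name).
        "valid_tag": m.group("ana"),
    }
```

Applied to the two example names, this gives forecast lengths of 33 and 27 hours, a 24-hour adjoint span in both cases, and the common valid tag 17760704_00.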

The final step in deriving observation impacts involves processing the forecast sensitivities through the adjoint of the analysis solver (GSI). In recent versions of the ADAS this step is also launched automatically; however, there are instances in which it is good to know how to handle the analysis sensitivity runs by hand. The job script submitted for this step is FVHOME/asens/g5asens.j. It is important to notice that this script can be submitted to the batch queue only when a pair of forecast sensitivities is available. A simple "sbatch g5asens.j" is enough to do the job; here too, however, caution must be exercised. By default, when submitted this way, the script will continue to resubmit itself for as long as there are forecast sensitivity files available in the FVHOME/asens directory. A safer way to submit these jobs is to specify explicitly the valid date/time of the forecast sensitivities to process. For example:

sbatch --export=this_nymdhh="17760704_00" g5asens.j

Just as seen in previous cases for offline forecasts and standalone analyses, these analysis sensitivity jobs can also be submitted in parallel. A simple script placed in the FVHOME/asens directory

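#!/bin/csh
# Submit one analysis sensitivity job per day, all valid at the same hour.
set yyyymm = 177607
set hh = 00
# Days 03 through 05, zero-padded to two digits.
foreach dd (`seq -f %02g 3 5`)
  sbatch --export=this_nymdhh="${yyyymm}${dd}_${hh}" g5asens.j
end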

and executed at the command line will do the job; the case above submits three such jobs simultaneously. Again, caution must be exercised not to overwhelm the batch system, especially by users with access to high priority.
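When the range of dates crosses a month boundary, the day loop in the csh script no longer suffices. The sketch below, in Python, only builds the sbatch command lines for a sequence of valid times so that they can be inspected before anything is submitted. The function name and its arguments are hypothetical; the only things taken from this page are the script name g5asens.j and the this_nymdhh variable.

```python
from datetime import datetime, timedelta


def asens_commands(first_tag, count, step_hours=24, script="g5asens.j"):
    """Build (but do not run) sbatch commands for `count` valid times.

    first_tag is a valid date/time such as '17760703_00'.
    """
    start = datetime.strptime(first_tag, "%Y%m%d_%H")
    commands = []
    for i in range(count):
        when = start + timedelta(hours=i * step_hours)
        # Format by hand so month and year rollovers are handled by datetime.
        tag = f"{when.year:04d}{when.month:02d}{when.day:02d}_{when.hour:02d}"
        commands.append(["sbatch", f"--export=this_nymdhh={tag}", script])
    return commands


if __name__ == "__main__":
    # Dry run: print the commands instead of submitting them.
    for cmd in asens_commands("17760703_00", 3):
        print(" ".join(cmd))
```

Each returned list could be passed to subprocess.run from within FVHOME/asens once the commands look right; keeping the submission step separate makes it easier to limit how many jobs go to the queue at once.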

Additional Features and Capabilities

Norm Specification

A blurb on the mathematical formulation for the FSOI procedure implemented in GEOS ADAS is given here. The forecast verification measure evaluated in FSOI is defined through a quadratic quantity of the form:

J = < e’ C e >,

with e = xf - xv, where xf is the vector of forecast fields at a given time and xv is the vector of verification fields valid at that same time. The procedure involves differentiating this quantity, which in GEOS ADAS is performed by a program called initadj.x. This program requires the resource file initadj.rc mentioned previously. Its default settings define the matrix C above as a linearized version of the moist total energy; alternative options are also available and are indicated in the file itself. The program takes the two vector fields xf and xv and generates the gradient of J, which is written out to a file whose template name is

%s.Jgradf_NORM.eta.%y4%m2%d2_%h2z+%y4%m2%d2_%h2z.nc4

where NORM is a specific three-character identifier of the selection made in initadj.rc, the default being twe, the moist total energy form cited above.
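For reference, the differentiation of J can be written out explicitly. The expression below is a sketch under two assumptions not stated on this page: that the prime denotes the transpose, and that C is symmetric (as is the case for an energy norm). Any additional scaling applied inside initadj.x is not covered here.

```latex
J = e^{T} C \, e, \qquad e = x_f - x_v,
\qquad
\frac{\partial J}{\partial x_f} = \left( C + C^{T} \right) e = 2 \, C \, e
\quad \text{when } C = C^{T}.
```

In other words, under these assumptions the gradient written to the Jgradf file is the forecast error e weighted by the norm matrix C.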

Alternative Verification for FSOI Calculation

By default, the verification (xv) used for constructing the J measure of FSOI is a self-verification. In the usual ADAS settings, the file asm.acq under FVHOME/run controls this verification and points to the assimilation state found under

/archive/u/USER/MYEXP/ana/Y%y4/M%m2/MYEXP.asm.eta.%y4%m2%d2_%h2%n2z.nc4

assuming user USER runs experiment MYEXP, and output gets stored in the archive.
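To see which file a given template resolves to, the date tokens can be expanded by hand. The following Python sketch assumes the usual GrADS-style meaning of the tokens (%y4 year, %m2 month, %d2 day, %h2 hour, %n2 minute) and treats %s as the experiment name; the function itself is hypothetical and is not the mechanism used by the ADAS scripts.

```python
from datetime import datetime


def expand_template(template, when, expid=None):
    """Expand GrADS-style date tokens (assumed meanings) in a file template."""
    tokens = {
        "%y4": f"{when.year:04d}",    # four-digit year
        "%m2": f"{when.month:02d}",   # two-digit month
        "%d2": f"{when.day:02d}",     # two-digit day
        "%h2": f"{when.hour:02d}",    # two-digit hour
        "%n2": f"{when.minute:02d}",  # two-digit minute
    }
    out = template
    for token, value in tokens.items():
        out = out.replace(token, value)
    if expid is not None:
        # %s is assumed to stand for the experiment name.
        out = out.replace("%s", expid)
    return out


if __name__ == "__main__":
    name = expand_template("MYEXP.asm.eta.%y4%m2%d2_%h2%n2z.nc4",
                           datetime(1776, 7, 4, 0, 0))
    print(name)
```

For the verification time used in the earlier example, 0000 UTC on 4 July 1776, the file name part of the template expands to MYEXP.asm.eta.17760704_0000z.nc4.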

It is possible to replace the verification with that provided by another experiment (not self-verifying), for example, GEOS FP (say, f5271_fp). All that needs to be done in this case is for the user to:

  1. Edit asm.acq so that it looks like:

/home/dao_ops/f5271_fp/run/.../archive/ana/Y%y4/M%m2/f5271_fp.asm.eta.%y4%m2%d2_%h2%n2z.nc4

and

  2. Edit FVHOME/fcst/g5fcst.j and add the following env variable:

setenv VEXPID f5271_fp

Another possibility is that a user might want to calculate FSOI with respect to the analysis rather than the assimilation fields. Recall from the presentation earlier that under IAU there are two states valid at, say, a synoptic hour: the analysis and the assimilation. The first is the state produced by simply adding the GSI increment to the original background field; the second corresponds to the output of the model integration within the IAU 6-hour period. By default, we verify against the assimilation, but changing to verify against the analysis simply requires the definition of the following env variable

setenv FCSTVERIFY ana

in the file FVHOME/fcst/g5fcst.j. This causes the scripts to look at the file ana.acq in FVHOME/run instead of asm.acq. Just as before, it is possible to verify against an alternative analysis (from an unrelated experiment), in this case by editing ana.acq instead of asm.acq.

Users should know that a lot of experimentation has been done comparing FSOI verified against the assimilation with FSOI verified against the analysis. Although the level of the forecast errors (say, the 24- and 30-hour errors) changes when the verification changes, the actual total impact, basically the difference between the 24- and 30-hour errors, does not change. The split of the impact into the various observation classes also does not change in any significant way. This is not to say that verifying against an independent experiment leaves results unchanged; that is a different matter, and results certainly change.

A few more specifics on the GEOS ADAS approach to FSOI

There are multiple ways to implement traditional FSOI. By traditional we mean implementations applied to variational data assimilation strategies such as the hybrid 4DEnVar employed in the GSI of GEOS ADAS. A brief summary of the main approaches appears here. In short, GMAO follows a re-computation strategy, which requires the analysis solver (GSI) to be re-used to minimize a cost function similar to that of the forward case but with a term replacement that amounts to redefining the right-hand side of the solver.

A faster solution to FSOI, which relies specifically on the Lanczos vectors corresponding to the approximate Hessian inversion performed in the forward GSI, is also available as an alternative method in GSI. However, this choice requires storing the Lanczos vectors of the forward minimization for as many outer loops as are employed in the minimization. As it stands, this amounts to considerably large output from the forward GSI, since files would be written for each vector and each processor. For example, saving 75 vectors (the total over two outer loops) in the present 672 PE MPI configuration of the analysis would require writing 50400 files and holding them until a verification is available to calculate the gradient of J and complete the FSOI computation. Managing this many files at the present 25 km, 72-level analysis resolution is found to be more cumbersome than simply re-running the solver; the expected increase in the number of analysis levels will contribute to making these files twice as large and increase the burden on the system.
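The file count quoted above follows directly from one file per Lanczos vector per processor. A minimal Python sketch of that arithmetic, with a hypothetical function name, makes it easy to redo the estimate for other configurations:

```python
def lanczos_file_count(n_vectors, n_pes):
    """Number of files when one file is written per vector per processor."""
    return n_vectors * n_pes


if __name__ == "__main__":
    # 75 vectors in total over two outer loops, 672 PEs.
    print(lanczos_file_count(75, 672))
```

Note that this counts files only; the size of each file depends on the analysis resolution and number of levels.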

Analysis Increment Sensitivity to Observations

Suppose the sensitivity of the analysis increment at 20130115_00z to observations is to be calculated:

a) touch a file named 
      standalone.20130115_00z+20130115_00z-20130115_00z 
   under the FVHOME/asens directory 
b) as with regular analysis sensitivity runs, this calculation uses 
       gsi_sens.rc.tmpl as driver of adjoint-GSI 
    note: output ODS files will show up under Y2013/M01/D15/H00 with name 
          type imp1_inc 
c) if user wants to apply an initadj-like norm to the increment: 
       copy an existing initadj.rc to $FVHOME/asens/initadj4inc.rc and edit at will 
d) look in fvpsas.rc to properly set reference_eta_filename and 
       verifcation_eta_filename 
   entries. 
e) make sure ana.acq brings in your reference and verification states
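Step (a) above relies on a file name in which the same date/time tag appears three times. The Python sketch below builds that name and optionally creates the empty file; both function names are hypothetical, and the name layout is taken only from the single example in step (a).

```python
from pathlib import Path


def standalone_trigger_name(nymd, hh):
    """Name of the empty trigger file for an increment-sensitivity run.

    nymd is a date string such as '20130115'; hh is the hour as an integer.
    """
    tag = f"{nymd}_{hh:02d}z"
    # The same tag is repeated for the initial, final, and analysis times.
    return f"standalone.{tag}+{tag}-{tag}"


def touch_trigger(asens_dir, nymd, hh):
    """Create the empty trigger file under asens_dir (e.g. FVHOME/asens)."""
    path = Path(asens_dir) / standalone_trigger_name(nymd, hh)
    path.touch()
    return path


if __name__ == "__main__":
    print(standalone_trigger_name("20130115", 0))
```

This is equivalent to the touch command in step (a) and is useful mainly when several such runs need to be set up from a script.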

Ensemble-based (model adjoint-free) FSOI

As seen above, the first requirement for implementing traditional FSOI is the availability of an adjoint of the tangent linear model of the nonlinear forward atmospheric general circulation model. GMAO has had numerous versions of such an adjoint, from the original implementation of FSOI relying on the adjoint of Giering et al. (2006), written for the regular latitude-longitude finite-volume hydrodynamics of Lin (2004), to the present cubed-sphere finite-volume hydrodynamics version (e.g., Holdaway et al. (2014)).

Hybrid ensemble-variational assimilation methodologies such as that of GEOS ADAS rely on the availability of an ensemble to provide a flow-dependent error covariance representation and improved background error characteristics to the minimization procedure. In such an environment, Buehner et al. (2018) developed a modification of the variational approach that allows the adjoint of the forecasting model to be replaced with an ensemble of forecasts that implicitly represents the adjoint propagation.

Preliminary results from modifying the adjoint of the hybrid GSI to incorporate the ensemble of forecasts, and thus bypass the need for an explicit model adjoint, appear in Todling et al. (2018).