
A. Setting up GEOS ADAS experiments

rtodling edited this page Mar 5, 2021 · 11 revisions

Setting up full GEOS ADAS experiments

The easiest way to set up an ADAS experiment is to use an existing experiment template. Such templates are available here. In the initial release of ADAS v5.27.1_p3, the following experiment templates are available:

  • C48f.input - C48 3DVAR
  • C90C.input - C90 3DVAR
  • C90C_replay.input - C90 hybrid 4DEnVar, with the ensemble replayed to FP
  • C90C_ens.input - C90 hybrid 4DEnVar
  • x0043.input - C360 hybrid 4DEnVar; template for actual official x0043 x-test experiment
  • x0044.input - C360 hybrid 4DEnVar; template for actual official x0044 x-test experiment
  • prePP.input - full resolution, FP-like, Hybrid 4DEnVar

The user should copy a template of interest, make minor edits to adjust the userID and the location of the experiment home (referred to as FVHOME), and then use the program runjob to process the template, e.g.,

FULLPATH/GEOSadas/install/bin/runjob -l path_of_user_adjusted_template_.input

where FULLPATH is the remainder of the path to the GEOSadas bin directory.

The existing experiment templates can be adjusted to set up experiments for a time period other than the one set in the template, and the observing system can also easily be edited to accommodate users' needs.
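The copy-and-edit step can be sketched as follows in plain sh. The template body below is a purely illustrative stand-in (the EXPID/FVHOME lines are not the actual template format); inspect a real template for the actual entries that carry the userID and experiment-home location:

```shell
#!/bin/sh
# Sketch of the copy-and-edit step. The template content here is a
# hypothetical stand-in; real templates differ, so check yours for the
# actual entries to change.
cat > C90C.input <<'EOF'
EXPID: C90C
FVHOME: /discover/nobackup/someuser/C90C
EOF
# make an edited copy pointing the experiment home at our own nobackup area
sed "s|/discover/nobackup/someuser|/discover/nobackup/${USER:-myuser}|" \
    C90C.input > myexp.input
grep '^FVHOME' myexp.input
```

The edited copy (here myexp.input) is then what gets handed to runjob.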

More experienced users can run the base GEOSadas setup script, namely fvsetup, by hand.

Typically, runjob will prompt the user asking whether to submit the main job to the batch system. If the user chooses to postpone batch submission, this can be done at a later time by

cd FVHOME/run
sbatch g5das.j

Note that in the particular case of the x-experiment templates, the main job scripts are named according to the experiment number; e.g., the main job for x0043 is x43.j (i.e., replace g5das.j above with x43.j).

Brief Overview of a GEOS ADAS Experiment Home Directory

The experiment home directory, referred to as FVHOME, is where the key controlling scripts, resource files, and restarts live for a given ADAS experiment. This directory should be placed on one of discover's nobackup disks during the setup step above, since it typically holds large files as the run gets going.

The main subdirectories under FVHOME are:

  • run - controls ADAS cycling (corrector-predictor steps)
  • fcst - controls forecast issued from cycled restart (initial conditions)
  • anasa - controls running of standalone atmospheric analysis (GSI)
  • asens - controls running of FSOI (Forecast-based Sensitivity Observation Impact)

Running Model Forecasts within the ADAS Context

As mentioned in the introductory section here, in development mode, forecasts from cycled analyses and restarts are performed after the fact; they are not performed as automatic extensions of the predictor step of IAU (as done in FP-like settings). However, in many of the experiment templates provided previously, forecasts are launched automatically by the main ADAS driving scripts at 1500 and 2100 UTC, corresponding to the 1800 and 0000 UTC analyses. The actual forecast script, g5fcst.j, lives under FVHOME/fcst.

Users can also launch forecasts on their own by following steps similar to

cd FVHOME/fcst
touch forecast.19970704_09z
sbatch g5fcst.j

The length of the forecast is controlled by the CAP.rc.tmpl file under the FVHOME/fcst directory. By default, the example above will retrieve initial conditions from the underlying (user) experiment related to FVHOME. In these cases, the IAU tendency files are also retrieved as part of the initial conditions; this is because the forecast framework in the ADAS integrates from the beginning of the IAU period rather than from another time (see below). Specifically, in the example case in the introduction, when the ADAS cycle covers the period from 0900 to 2100 UTC, forecasts from a 1200 UTC analysis can be issued in three possible ways:

  1. Starting at the end of the background period, i.e., 2100 UTC
  2. Starting at the end of the IAU corrector step, i.e., 1500 UTC
  3. Starting at the initial time of the cycle, i.e., 0900 UTC

The quickest and most economic way is (1) above, since it avoids redundancy in redoing parts of the cycle; it simply picks up from where the ADAS cycling left off. However, the defaults in the ADAS are such that initial conditions for this case (i.e., the final state of the model integration at the end of the IAU predictor step) are never available. During the corrector-predictor cycle, a complete set of restarts is only written at the end of the corrector step, i.e., at 1500 UTC in the example above. Launching forecasts from this time, option (2) above, would simply involve the basic model initial conditions and a setup of the model control file (AGCM.rc.tmpl) that lets the model run free from the influence of the analysis; all such influence is retained in the restarts. This introduces some redundancy, since it repeats the predictor part of the integration (already performed during the ADAS cycle). Odd as it might seem, for practical (and historical) reasons related to model output management, option (3) is the one embedded in the default settings. This option reruns the first 12 hours of the ADAS cycle -- and indeed a consistency test in the regression tests makes sure that outputs from the first 12 hours of the forecast agree with the corresponding ones produced during the cycle. A schematic representation of the equivalence between the ADAS cycle and parts of a corresponding forecast is found here.

In the x-like experimental settings, the length of the forecasts is controlled by the files CAP_15.rc.tmpl and CAP_21.rc.tmpl: the former is set to run forecasts from 1500 UTC initial conditions for 33 hours; the latter controls forecasts from 2100 UTC initial conditions and is set to 123 hours (5 days and 3 hours). The initial times of these forecasts and their corresponding lengths are associated with the machinery for deriving observation impacts through the Forecast-based Sensitivity Observation Impact (FSOI; e.g., Diniz and Todling 2019, and references therein); more on this in subsequent sections.
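For reference, forecast segment lengths in GEOS CAP resource files are usually expressed as a days/hours pair. A hypothetical fragment for the 123-hour case (the key names follow the common CAP.rc convention, but verify them against the actual CAP_21.rc.tmpl) might read:

```
JOB_SGMT: 00000005 030000
NUM_SGMT: 1
```

Here 00000005 030000 encodes 5 days plus 03:00:00, i.e., the 123 hours mentioned above.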

It is simple to turn off the automatic generation of forecasts while the ADAS cycles. This is controlled by the environment variable FCSTIMES in the main ADAS job scripts (e.g., g5das.j). Commenting out the settings for this environment variable shuts off any automation in the launching of forecasts. This is sometimes useful when the batch queues are too full or the user is mainly interested in what happens in the cycle itself. As seen above, forecasts can always be issued after the fact, offline.
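In g5das.j this typically amounts to a single setenv line. The time values shown below are an assumption based on the 1500/2100 UTC launches described above, so check the actual script for the real contents:

```csh
# in FVHOME/run/g5das.j: leave set to launch forecasts automatically, or
# comment out, as here, to disable automatic forecast launches
#setenv FCSTIMES "15 21"
```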

Assuming that initial conditions are available, that is, that the underlying ADAS has cycled through the period of interest, it is also possible to run multiple forecasts simultaneously. For example, the little script below can be placed in the FVHOME/fcst directory

#!/bin/csh
set hh = 21
set yyyy = 1776
set mm = 07
foreach dd ( `seq -f %02g 1 6` )
    set this_nymdhh = ${yyyy}${mm}${dd}_${hh}
    touch forecast.${this_nymdhh}z
    sbatch --export=this_nymdhh="${this_nymdhh}" g5fcst.j
end

and executed at the command line. In this example, six forecast jobs are submitted to the batch queue for the 0000 UTC analyses of 2-7 July 1776 (initial conditions at 2100 UTC on 1-6 July). The user should know that in this mode (i.e., when a date and time are passed on the command line to the g5fcst.j script), no automatic archiving of the forecast output takes place; all output is left under FVHOME/prog.
CAUTION: Depending on the queue priority level, these parallel forecasts can overwhelm the batch system; please be considerate of others and run only a handful of these at a time.
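To stay within the caution above, the submission loop can be throttled so that only a few jobs go in at a time. A plain-sh sketch follows; `echo sbatch` stands in for the real submission, and the batch size of 3 is an arbitrary choice (in practice one might poll `squeue -u $USER` between batches rather than just pausing):

```shell
#!/bin/sh
# Sketch: submit offline forecasts a few at a time instead of all at once.
# "echo sbatch" is a stand-in for the real sbatch call made in FVHOME/fcst.
limit=3
n=0
for dd in 01 02 03 04 05 06; do
  tag=177607${dd}_21
  touch "forecast.${tag}z"
  echo sbatch --export=this_nymdhh="$tag" g5fcst.j
  n=$((n + 1))
  if [ "$n" -ge "$limit" ]; then
    n=0
    sleep 1   # placeholder pause; replace with a squeue-based wait
  fi
done
```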

Running Standalone Atmospheric Analysis

In many cases it might be desirable to run the GEOS ADAS analysis in standalone mode. This is useful for developers of the analysis system who need to run multiple tests to adjust a given implementation; many times these tests are done by running the same case over and over again. If the user has set up an ADAS experiment following the instructions above, the script required to run a standalone analysis can be found under:

FVHOME/anasa/g5anasa.j

If the user has cycled the ADAS for a few cycles, re-running any of the existing analyses can be accomplished easily by following these steps:

  • cd FVHOME/anasa
  • touch standalone.17760704_00z
  • sbatch g5anasa.j

If the user wants to point to a case, or cases, other than the user's own experiment, the following files need to be edited and adjusted to point to the proper experiment:

  • FVHOME/run/bkg.acq
  • FVHOME/run/satbias.acq
  • FVHOME/anasa/atmens_replay.acq (in case of hybrid analysis)

For example, if the user experiment is named myexp and it is to use backgrounds from the official x-experiment x0043, the user would need to edit the files above and make changes such as the following.

original in bkg.acq:

/archive/u/user/myexp/ana/Y%y4/M%m2/myexp.bkg.eta.%y4%m2%d2_%h2%n2z.nc4

modified:

/discover/nobackup/projects/gmao/dadev/dao_it/archive/x0043/ana/Y%y4/M%m2/x0043.bkg.eta.%y4%m2%d2_%h2%n2z.nc4 => myexp.bkg.eta.%y4%m2%d2_%h2%n2z.nc4

and similarly for all the remaining active (non-commented) lines in these acq files.
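The edit across all active lines can be scripted. The plain-sh sketch below uses sed with the paths from the example above; the regex assumes the /archive/u/user/myexp/ana prefix shown there, so adjust it for your own archive layout:

```shell
#!/bin/sh
# Sketch: rewrite every active (non-commented) bkg.acq entry to fetch from
# the x0043 archive while keeping the local "=> myexp...." rename.
cat > bkg.acq <<'EOF'
/archive/u/user/myexp/ana/Y%y4/M%m2/myexp.bkg.eta.%y4%m2%d2_%h2%n2z.nc4
EOF
sed -e '/^#/!s|^/archive/u/user/myexp/ana\(.*\)/myexp\.\(.*\)$|/discover/nobackup/projects/gmao/dadev/dao_it/archive/x0043/ana\1/x0043.\2 => myexp.\2|' \
    bkg.acq > bkg.acq.new
cat bkg.acq.new
```

The `/^#/!` address leaves commented lines untouched, so only the active entries are redirected.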

Once g5anasa.j has been submitted, it will fetch the required backgrounds and observations and, in the case of hybrid experiments, the corresponding ensemble of background fields forming the ensemble component of the background error covariance; it will then archive the output, with all files given "sa" identifiers to distinguish them from files generated during the regular ADAS cycle.

The standalone analysis script can also be submitted with the date and time of the case study given explicitly on the command line. This is done by:

  • touch standalone.yyyymmdd_hhz
  • sbatch --export=this_nymdhh=yyyymmdd_hh g5anasa.j

where yyyymmdd_hh follows the convention of a 4-digit year, 2-digit month, 2-digit day, and 2-digit hour, e.g., 17760704_12 (with a trailing z added for the touch file); the hour must be 00, 06, 12, or 18. With this capability, it is also possible to run multiple standalone analyses, for different analysis times, simultaneously. Writing a little script might be useful to handle such cases, e.g.,

#!/bin/csh 
set hh = 00 
set yyyy = 1776 
set mm = 07 
foreach dd ( `seq -f %02g  4 5` ) 
  set this_nymdhh = ${yyyy}${mm}${dd}_${hh} 
  touch standalone.${this_nymdhh}z 
  sbatch --export=this_nymdhh="${this_nymdhh}" g5anasa.j 
end 

In the example above, the script submits two instances of the analysis: one for 17760704_00 and another for 17760705_00.
CAUTION: These parallel runs can take over the batch queue, especially if the user has high-priority access, so be kind to others and do not submit more than a few of these at a time (five or six).

Note that when the submission command line with the option this_nymdhh is used, the output of the standalone run will not be archived; the output is transferred from tmpdir to subdirectories of FVHOME.
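Related to the yyyymmdd_hh convention above, it can be worth checking that the requested hour is indeed synoptic before creating the trigger file. A plain-sh sketch:

```shell
#!/bin/sh
# Sketch: build the trigger-file name for a standalone analysis and refuse
# hours that are not synoptic (00, 06, 12, or 18).
tag=17760704_12
hh=${tag#*_}
case $hh in
  00|06|12|18)
    touch "standalone.${tag}z"
    echo "created standalone.${tag}z"
    ;;
  *)
    echo "error: $hh is not a synoptic hour" >&2
    exit 1
    ;;
esac
```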

Generating FSOI from GEOS ADAS

The GMAO implementation of a strategy to generate observation impacts combines the work of Langland and Baker (2004) and that of Tremolet (2008) through the so-called Forecast-based Sensitivity Observation Impact (FSOI) methodology; see also Gelaro et al. (2010) and Diniz and Todling (2019). Without getting into the technical aspects of the procedures, a rough understanding is necessary to know what needs to be available to get GEOS ADAS to produce this information. There are basically two main components to FSOI beyond the regular model (AGCM) and analysis (GSI) pieces, namely, the adjoints of both these components. The measure of impact brought about by assimilating observations is associated with how sensitive the analysis component (GSI) is to minor changes in the observations; changes in the observations induce changes in the analysis (IAU tendencies) and therefore involve sensitivities of the model (forecast) integration. For a variety of reasons (beyond the discussion here), it is the impact of observations on the 24-hour forecasts that is derived in GEOS ADAS. Changing the lead time of the forecasts, and consequently the lead time of the impacts, is trivial in GEOS DAS and has been studied to some extent (e.g., Prive et al. (2020)), though caution must be exercised about the validity of results beyond 24 hours.

Whenever we talk about the sensitivity of the analysis or the sensitivity of the forecasting model, we are basically talking about the adjoint of the linearized operator corresponding to each of these entities. Sensitivities to changes in the initial conditions are automatically derived in the forecast step of some of the experiment templates introduced earlier. This is the case for the templates of both the x-experiments and prePP. We saw that in some of these cases, 33-hour and 123-hour forecasts are automatically issued. When the machinery of FSOI is being exercised, the 33-hour forecast is used to generate sensitivities to a 30-hour forecast and the 123-hour forecast is used to generate sensitivities to a 24-hour forecast, both associated with the same verification time. The generated sensitivities are automatically placed in the directory named FVHOME/asens and have names like the following:

x0044.fsens_twe.eta.17760703_15z+17760705_00z-17760704_00z.nc4
x0044.fsens_twe.eta.17760703_21z+17760705_00z-17760704_00z.nc4

The three date/time tags in the filenames indicate initial_date/time+final_fcst_date/time-analysis_date/time. In reference to the filenames above, they relate to forecasts started six hours apart from each other, one at 1500 UTC and another at 2100 UTC on 3 July 1776; both forecasts end at 0000 UTC on 5 July 1776. Therefore, the first is a 33-hour forecast and the second is a 27-hour forecast; both run the adjoint model (backwards in time) to generate 24-hour sensitivities valid at 0000 UTC on 4 July 1776. These two sensitivities are required by the Langland and Baker (2004) formulation to derive a quantitative assessment of the impact on the 24-hour forecast of assimilating observations at 0000 UTC on 4 July 1776. Notice that the control of how these sensitivities are produced in development-type experiments is all within the g5fcst.j jobs. The valid time of these sensitivities is the last date/time in the sequence of time tags; 0000 UTC on 4 July 1776 in the example above. Turning off the generation of these forecast sensitivities is easily accomplished by renaming the file FVHOME/fcst/initadj.rc.
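The three time tags can be split out of such a filename with shell parameter expansion; a plain-sh sketch, using the example filename above:

```shell
#!/bin/sh
# Sketch: parse an fsens filename into its initial, final-forecast, and
# analysis (verification) time tags, per the initial+final-analysis convention.
f=x0044.fsens_twe.eta.17760703_15z+17760705_00z-17760704_00z.nc4
tags=${f#*.eta.}      # strip everything up to and including ".eta."
tags=${tags%.nc4}     # strip the ".nc4" suffix
ini=${tags%%+*}       # text before "+": initial time
rest=${tags#*+}
fin=${rest%%-*}       # text before "-": final forecast time
anl=${rest#*-}        # text after "-": analysis (verification) time
echo "initial=$ini final=$fin analysis=$anl"
```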

The final step in deriving observation impacts involves processing the forecast sensitivities through the adjoint of the analysis solver (GSI). This step is not automatically launched by the forecast or ADAS scripts but is rather left for the user to do by hand (this might change in upcoming releases). The job script that must be submitted to complete this step is FVHOME/asens/g5asens.j. It is important to note that this script can only be submitted to the batch queue once a pair of forecast sensitivities is available. A simple "sbatch g5asens.j" is enough to do the job; however, here too caution must be exercised. By default, when submitted this way, this script will continue to resubmit itself for as long as there are forecast sensitivity files available in the FVHOME/asens directory. A safer way to submit these jobs is by explicitly specifying the valid date/time of the forecast sensitivities to process. For example:

sbatch --export=this_nymdhh="17760704_00" g5asens.j

Just as seen in previous cases for offline forecasts and standalone analyses, here too these analysis sensitivity jobs can be submitted in parallel. A simple script placed in the FVHOME/asens directory

#!/bin/csh
set yyyymm = 177607
set hh = 00
foreach dd (`seq -f %02g 3 5`)
  sbatch --export=this_nymdhh="${yyyymm}${dd}_${hh}" g5asens.j
end

and executed at the command line will do the job; the case above submits three such jobs simultaneously. CAUTION: again, care must be exercised not to overwhelm the batch system, especially by users with access to high-priority queues.

Alternative Verification for FSOI Calculation

# WARNING: working on this section - 

By default, the verification used for FSOI is given by results from the experiment running FSOI, i.e., self-verification. Recall that FSOI involves defining an aspect of the forecast to be evaluated. This is typically represented as a quadratic quantity:

J = < e' C e >,

with e = xf - xv, where xf is the vector of forecast fields at a given time and xv the vector of verification fields valid at that same time. By default, the file asm.acq under FVHOME/run controls the verification and points to self; that is, assuming user USER runs experiment MYEXP and output gets stored in the archive, asm.acq looks like:

/archive/u/USER/MYEXP/ana/Y%y4/M%m2/MYEXP.asm.eta.%y4%m2%d2_%h2%n2z.nc4

Suppose the user wants, instead, to verify against FP (e.g., f5271_fp). All that needs to be done is for the user to:

  1. Edit asm.acq so that it looks like:

/home/dao_ops/f5271_fp/run/.../archive/ana/Y%y4/M%m2/f5271_fp.asm.eta.%y4%m2%d2_%h2%n2z.nc4

and

  2. Edit FVHOME/fcst/g5fcst.j and add the following env variable:

setenv VEXPID f5271_fp

Another possibility is that a user might want to calculate FSOI with respect to the analysis rather than the assimilation fields. Recall from the earlier presentation that under IAU there are two states valid at, say, a synoptic hour: the analysis and the assimilation. The first is the state produced by simply adding the GSI increment to the original background field; the second corresponds to the output of the model integration within the IAU 6-hour period. By default, we verify against the assimilation, but changing to verify against the analysis simply requires defining the following env variable

setenv FCSTVERIFY ana

in the file FVHOME/fcst/g5fcst.j. This causes the scripts to look at the file ana.acq in FVHOME/run instead of asm.acq. Just as before, it is possible to verify against an alternative analysis (from an unrelated experiment), in this case by editing ana.acq instead of asm.acq.

Users should know that a lot of experimentation has been done comparing FSOI verified against the assimilation versus that verified against the analysis. Although the level of forecast errors (say, the 24- and 30-hour errors) changes when the verification changes, the actual total impact (basically the difference between the 24- and 30-hour errors) does not change. The split of the impact into the various observation classes also does not change in any significant way. This is not to say that verifying against an independent experiment does not change results; that is a different matter, and results certainly change.

Additional Features and Capabilities

Analysis Increment Sensitivity to Observations

Say the sensitivity of the analysis increment at 20130115_00z is to be calculated:

a) touch a file named
      standalone.20130115_00z+20130115_00z-20130115_00z
   under the FVHOME/asens directory
b) as with regular analysis sensitivity runs, this calculation uses
      gsi_sens.rc.tmpl as the driver of the adjoint GSI;
   note: output ODS files will show up under Y2013/M01/D15/H00 with name
      type imp1_inc
c) if the user wants to apply an initadj-like norm to the increment,
      copy an existing initadj.rc to $FVHOME/asens/initadj4inc.rc and edit at will
d) look in fvpsas.rc to properly set the reference_eta_filename and
      verification_eta_filename entries
e) make sure ana.acq brings in your reference and verification states
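Step (a) amounts to creating a trigger file whose three time tags are all the same analysis time; a plain-sh sketch:

```shell
#!/bin/sh
# Sketch of step (a): for increment sensitivity, all three time tags in the
# trigger-file name are the same analysis date/time.
t=20130115_00
touch "standalone.${t}z+${t}z-${t}z"
ls -1 standalone.*
```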