Using EFSOI with GDAS
last edited July 26, 2022
The EFSOI code has been merged into the GSI-utils develop branch, and the scripts and j-jobs into the global-workflow develop branch. Most of the remaining necessary files have preliminary approval to be merged into the global-workflow develop branch. As of this writing, EFSOI can be run on Orion using the development fork, which has been merged with global-workflow develop from July 22, 2022 (ffcd5b) and GSI-utils develop from July 19 (322cc7b).
To use EFSOI, clone the forked global-workflow repository, check out the EFSOI branch, and then run the usual sequence of scripts. This branch of global-workflow clones a specific hash of a development fork of GSI-utils that contains the necessary fix file and some scripts for analyzing the EFSOI output. At the time of this writing, EFSOI works on Orion and previously worked on WCOSS.
To set up the current global-workflow repository with the latest EFSOI development:
git clone --recursive https://github.com/AndrewEichmann-NOAA/global-workflow.git
cd global-workflow/
git checkout 92a4489
and then checkout, build, and link global-workflow as usual.
Run workflow/setup_expt.py as usual for a cycling experiment. For testing, 20-member ensembles suffice; 80-member ensembles are used for experiments. GFS need not be run separately.
In config.base in your expdir, set
export DO_EFSOI="YES"
Also, per the global-workflow instructions for cycling experiments, set the following:
imp_physics from 8 (Thompson) to 11 (GFDL)
CCPP_SUITE to FV3_GFS_v16 (or another suite that uses GFDL)
Then run workflow/setup_xml.py from the EFSOI build of global-workflow. This will set up the workflow with the extra EFSOI tasks. The experiment can be started as usual.
During the first complete cycle, the gdaseupdfsoi task (the ensemble update with settings specific to EFSOI) runs with the same priority as gdaseupd, leading to a parallel set of EFSOI-specific gdas tasks that ends with post-processing. In this first complete cycle, the first 30-hour ensemble forecast (metatask gdasefmnfsoi) is generated, and the gdasefsoi task never runs. During the second complete cycle, the forecast is again made for 30 hours and post-processed to generate 24-hour and 30-hour ensemble means. The gdasefsoi task in this cycle sits idle until the cycle 24 hours later is active and has created the verifying analysis. The gdasefsoi task from the second complete cycle then runs, using the 24-hour forecast from that cycle and the 30-hour forecast from the previous cycle, and creates the final observation sensitivity (osense) file, which is placed in the osense directory in COMROT. The process repeats for the following cycles for the length of the experiment.
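The timing described above can be sketched as a small helper that lists which cycles an efsoi task for cycle t0 draws on. This is only an illustration of the dependencies; the function and key names are hypothetical and do not come from global-workflow.

```python
from datetime import datetime, timedelta

CYCLE_INTERVAL = timedelta(hours=6)   # spacing between GDAS cycles
VERIFY_LEAD = timedelta(hours=24)     # lead time of the verifying analysis


def efsoi_inputs(t0: datetime) -> dict:
    """Return the cycles whose output the efsoi task for cycle t0 depends on."""
    return {
        "forecast_24h_from": t0,                   # 24-hour forecast from this cycle
        "forecast_30h_from": t0 - CYCLE_INTERVAL,  # 30-hour forecast from the previous cycle
        "verifying_analysis": t0 + VERIFY_LEAD,    # analysis from the cycle 24 hours later
    }


deps = efsoi_inputs(datetime(2022, 7, 22, 6))
for name, cycle in deps.items():
    print(f"{name}: {cycle:%Y%m%d%H}")
```

Both forecasts are valid at the same time (t0 + 24 hours), which is why the task has to wait for the analysis of that later cycle.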
One result of this process is that the EFSOI-specific data, stored in efsoigdas directories with a structure similar to that of the enkfgdas directories, has to be kept on disk longer than the other data files and can consume considerable space. Likewise, the osense files are several hundred MB for each cycle. These osense files can be analyzed with the scripts in sorc/gsi_utils.fd/src/EFSOI_Utilities/scripts.
Ensemble Forecast Sensitivity to Observation Impacts is based on a method developed in Langland and Baker (2004) that uses a model adjoint and the Kalman gain to determine the positive or negative impact of individual assimilated observations on the error of a forecast relative to a verifying analysis. The state vector plot below illustrates the concept.
Plot by Rahul Mahajan
The forecast background Xb and analysis Xa of a given cycle are both used to initialize forecasts, Xbf and Xaf respectively. These forecasts are then compared to a verifying analysis Xt to obtain the respective errors of the two forecasts. The difference in the errors at each observation point is traced back to the respective assimilated observations using the following equation:
Kalnay et al. (2012) developed a method to use ensemble forecasts and observation error covariance in lieu of an adjoint and Kalman gain:
Kalnay, E., Ota, Y., Miyoshi, T. and Liu, J., 2012. A simpler formulation of forecast sensitivity to observations: application to ensemble Kalman filters. Tellus, 64A, 18462.
Ota, Y., Derber, J., Kalnay, E. and Miyoshi, T., 2013. Ensemble-based observation impact estimates using the NCEP GFS. Tellus, 65A, 20038.
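For reference, the ensemble formulation is commonly written in the following form. This is a sketch from the cited papers rather than a copy of the equation image originally shown on this page, and it omits the localization used in practice (see Ota et al., 2013):

```latex
\Delta e^2
  = \mathbf{e}_{t|0}^{T}\,\mathbf{C}\,\mathbf{e}_{t|0}
  - \mathbf{e}_{t|-6}^{T}\,\mathbf{C}\,\mathbf{e}_{t|-6}
  \approx \frac{1}{K-1}\,
    \delta\mathbf{y}_0^{T}\,\mathbf{R}^{-1}\,\mathbf{Y}_0^{a}\,
    \mathbf{X}_{t|0}^{fT}\,\mathbf{C}\,
    \left(\mathbf{e}_{t|0} + \mathbf{e}_{t|-6}\right)
```

Here K is the ensemble size, δy_0 the observation innovations at the analysis time, R the observation error covariance, Y_0^a the analysis ensemble perturbations in observation space, X_{t|0}^f the forecast ensemble perturbations at the verification time, C the energy norm, and e_{t|0} and e_{t|-6} the errors of the forecasts initialized from the analysis and from the previous cycle, both measured against the verifying analysis.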
In more concrete terms within GDAS and global-workflow, the variables in the EFSOI equation are represented as follows:
where the green terms are stored in the initial "osense file" generated during the EFSOI-specific ensemble update task (gdaseupdfsoi) for a given cycle t0, and the values used for the forecast perturbation (the red term) are in the 24-hour ensemble member forecasts initialized with the analysis at t0, also generated by gdaseupdfsoi. The forecast errors (the blue terms) are calculated using the 24-hour forecast ensemble mean, the 30-hour forecast ensemble mean from cycle t0-6hr (which is functionally the same as a 24-hour forecast initialized with the background at t0), and the verifying analysis from t0+24hr. Both the 24-hour and 30-hour ensemble forecasts are run at the same time for t0 by the metatask gdasefmnfsoi: the 24-hour forecast is for the EFSOI calculation for cycle t0, and the 30-hour forecast is for cycle t0+6hr. The ensemble means are generated by gdasepmnfsoi. Note that the 24-hour forecasts used are specific to the global model; regional models may use shorter forecasts for the same purpose.
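To make the roles of these terms concrete, here is a toy per-observation calculation in plain Python. It is a minimal sketch, not the EFSOI Fortran code: it assumes diagonal observation error and energy norm matrices, applies no localization, and all function and argument names are hypothetical.

```python
def efsoi_impacts(dy, r_var, ya, xf, c_diag, e_a, e_b):
    """Per-observation impact estimates (toy version, no localization).

    dy     : innovations, length n_obs
    r_var  : observation error variances (diagonal R), length n_obs
    ya     : analysis perturbations in observation space, n_obs x K
    xf     : forecast perturbations, n_state x K
    c_diag : diagonal of the energy norm, length n_state
    e_a    : error of the 24-hour forecast mean, length n_state
    e_b    : error of the 30-hour forecast mean, length n_state
    """
    n_members = len(xf[0])
    n_state = len(xf)
    # Project the summed forecast errors onto each ensemble member.
    w = [
        sum(xf[i][k] * c_diag[i] * (e_a[i] + e_b[i]) for i in range(n_state))
        for k in range(n_members)
    ]
    impacts = []
    for j in range(len(dy)):
        # Map the projection back to observation j via the analysis perturbations.
        proj = sum(ya[j][k] * w[k] for k in range(n_members))
        impacts.append(dy[j] / r_var[j] * proj / (n_members - 1))
    return impacts


# Made-up numbers: one observation, one state variable, two members.
impacts = efsoi_impacts(
    dy=[2.0], r_var=[4.0],
    ya=[[1.0, -1.0]], xf=[[0.5, -0.5]],
    c_diag=[1.0], e_a=[1.0], e_b=[2.0],
)
print(impacts)
```

The innovation, error variance, and analysis perturbation inputs correspond to what the update task stores, the forecast perturbations to the 24-hour member forecasts, and the two error vectors to the ensemble mean forecasts compared against the verifying analysis.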
The code to run EFSOI is spread out over three repositories: GSI-utils for the EFSOI-exclusive Fortran code and Python scripts for analysis, GSI for libraries in enkf and gsi, and global-workflow for scripts to run within a cycling experiment.
Everything in the GSI-utils repository is under src/EFSOI_Utilities/, with the Fortran in src/EFSOI_Utilities/src and the Python scripts in src/EFSOI_Utilities/scripts. Under src/EFSOI_Utilities/fix is a version of the file global_anavinfo.l127.txt from the GSI fix directory that is identical except for an entry for the EFSOI executable. At the time of this writing, the location of this file is assigned to ANAVINFO in the config.efso in the experiment directory, though it should be merged with the regular fix file.
The files under src are as follows:
- efsoi.f90
- efsoi_main.f90
- gridio_efsoi.f90
- loadbal_efsoi.f90
- loc_advection.f90
- scatter_chunks_efsoi.f90
- statevec_efsoi.f90
The files whose names end in _efsoi.f90 were adapted from similar files in the EnKF code in GSI, because for various reasons the originals could not be used as-is as libraries. Otherwise, an effort has been made to reduce code duplication, and certain modules are linked from the EnKF and GSI code. As such, the GSI-utils build needs to be told their locations as gsi_ROOT and enkf_ROOT, as described in the GSI-utils INSTALL.md. This is done automatically in the global-workflow build, which is probably the easiest context in which to do development.
The EFSOI code links a number of modules as libraries from the GSI code, generally for parameter setting, file I/O, MPI handling, and the like. Of particular interest is the source code in enkf_obs_sensitivity.f90, which contains the code for reading and writing the osense file. This can be helpful for understanding the contents of the file and for modifying it if necessary. Apparently it is not otherwise used by the enkf executable, so changes to it should not affect anything else. Bear in mind that the format has to match between the writing and reading subroutines.
The global-workflow repository contains the configuration files and scripts necessary to run the tasks needed to complete the EFSOI algorithm. Each of the EFSOI-specific tasks (eupdfsoi, ecenfsoi, esfcfsoi, efcsfsoi, eposfsoi, and efsoi) has its own rocoto script:
jobs/rocoto/eupdfsoi.sh
jobs/rocoto/esfcfsoi.sh
jobs/rocoto/ecenfsoi.sh
jobs/rocoto/efcsfsoi.sh
jobs/rocoto/eposfsoi.sh
jobs/rocoto/efsoi.sh
...which in turn calls its respective j-job:
jobs/JGDAS_EFSOI_UPDATE
jobs/JGDAS_EFSOI_ECEN
jobs/JGDAS_EFSOI_SFC
jobs/JGDAS_EFSOI_FCST
jobs/JGDAS_EFSOI_POST
jobs/JGDAS_EFSOI
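Because the two lists above are not in the same order, the pairing can be written out explicitly. The mapping below is inferred from the file names alone, not read from the repository, and the `describe` helper is hypothetical.

```python
# Task name -> (rocoto script, j-job); pairing inferred from the file names.
EFSOI_TASKS = {
    "eupdfsoi": ("jobs/rocoto/eupdfsoi.sh", "jobs/JGDAS_EFSOI_UPDATE"),
    "ecenfsoi": ("jobs/rocoto/ecenfsoi.sh", "jobs/JGDAS_EFSOI_ECEN"),
    "esfcfsoi": ("jobs/rocoto/esfcfsoi.sh", "jobs/JGDAS_EFSOI_SFC"),
    "efcsfsoi": ("jobs/rocoto/efcsfsoi.sh", "jobs/JGDAS_EFSOI_FCST"),
    "eposfsoi": ("jobs/rocoto/eposfsoi.sh", "jobs/JGDAS_EFSOI_POST"),
    "efsoi": ("jobs/rocoto/efsoi.sh", "jobs/JGDAS_EFSOI"),
}


def describe(task: str) -> str:
    """Return a one-line summary of the call chain for an EFSOI task."""
    rocoto_script, jjob = EFSOI_TASKS[task]
    return f"{task}: {rocoto_script} -> {jjob}"


for task in EFSOI_TASKS:
    print(describe(task))
```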
For the tasks ecenfsoi, esfcfsoi, efcsfsoi, and eposfsoi, the j-jobs call the scripts used for the corresponding regular tasks after setting variables specific to the EFSOI tasks. The EFSOI tasks eupdfsoi and efsoi have their own run scripts:
scripts/exgdas_efsoi_update.sh
scripts/exgdas_efsoi.sh
The eupdfsoi script scripts/exgdas_efsoi_update.sh is fairly similar to scripts/exgdas_enkf_update.sh, and scripts/exgdas_efsoi.sh is a stripped-down version of the same. There is also an EFSOI-specific block in ush/forecast_predet.sh, downstream of the ensemble forecast script.
Each EFSOI-specific task also has its own config file in parm/config/ that gets copied into the experiment directory, as well as entries in parm/config/config.resources and the machine-specific settings in the env directory.
There are also blocks in jobs/rocoto/earc.sh, config/config.earc, and ush/hpssarch_gen.sh to handle EFSOI-specific archiving.
Finally, there are sections in the workflow setup scripts workflow/applications.py, workflow/rocoto/workflow_tasks.py, and workflow/rocoto/workflow_xml.py that set up the EFSOI workflow if DO_EFSOI="YES" is detected in config.base when setting up an experiment.
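As a quick sanity check before running the setup scripts, the switch can be looked for in the text of config.base. The helper below is a hypothetical sketch that only scans the text for the assignment; it is not how the global-workflow setup scripts themselves read the configuration.

```python
import re

# Matches lines such as: export DO_EFSOI="YES"  (commented-out lines are skipped)
DO_EFSOI_PATTERN = re.compile(
    r'^\s*(?:export\s+)?DO_EFSOI=["\']?(\w+)["\']?', re.MULTILINE
)


def efsoi_enabled(config_text: str) -> bool:
    """Return True if the last DO_EFSOI assignment in the text is YES."""
    values = DO_EFSOI_PATTERN.findall(config_text)
    return bool(values) and values[-1].upper() == "YES"


sample = 'export DO_EFSOI="YES"\n'
print(efsoi_enabled(sample))
```

In practice the text would come from reading config.base in the experiment directory.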
The osense file is output both by the EnKF executable during the ensemble update task and by the EFSOI executable during the efsoi task. The update outputs the statistical information for each assimilated observation that is required to perform the EFSOI calculation. The EFSOI executable reads this file in and then overwrites it with the same information plus the observation sensitivities.
The following tables document the contents of the osense file. The first is for a single header with variables common to all observations, and the second describes the record for each observation. The conventional and ozone observations are generally handled separately from the satellite observations, though the record format is the same. The variable names are those used in the EnKF/EFSOI code, and the same names are used by convention in the accompanying Python scripts.
Type | Variable Name | Description |
---|---|---|
real(r_single) | obfit_prior | Observation fit to the first guess |
real(r_single) | obsprd_prior | Spread of observation prior |
real(r_single) | ensmean_obnobc | Ensemble mean first guess (no bias correction) |
real(r_single) | ensmean_ob | Ensemble mean first guess (bias corrected) |
real(r_single) | ob | Observation value |
real(r_single) | oberrvar | Observation error variance |
real(r_single) | lon | Longitude |
real(r_single) | lat | Latitude |
real(r_single) | pres | Pressure |
real(r_single) | time | Observation time |
real(r_single) | oberrvar_orig | Original error variance |
integer(i_kind) | stattype | Observation type |
character(len=20) | obtype | Observation element / Satellite name |
integer(i_kind) | indxsat | Satellite index (channel) set to zero |
real(r_single) | osense_kin | Observation sensitivity (kinetic energy) [J/kg] |
real(r_single) | osense_dry | Observation sensitivity (Dry total energy) [J/kg] |
real(r_single) | osense_moist | Observation sensitivity (Moist total energy) [J/kg] |
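Once the records have been read (for example with the scripts in EFSOI_Utilities/scripts), a typical first step is to total a sensitivity field by observation type. The sketch below assumes each record is already available as a dictionary keyed by the variable names in the table; the function name and the sample values are made up for illustration, and reading the binary file itself is not shown.

```python
from collections import defaultdict


def total_impact_by_obtype(records, field="osense_moist"):
    """Sum one sensitivity field over all records, grouped by obtype."""
    totals = defaultdict(float)
    for rec in records:
        # obtype is a fixed-length character field, so strip the padding.
        totals[rec["obtype"].strip()] += rec[field]
    return dict(totals)


# Illustrative records only; real files hold many more fields and observations.
records = [
    {"obtype": "ps                  ", "osense_moist": -0.5},
    {"obtype": "amsua_n19           ", "osense_moist": -0.25},
    {"obtype": "ps                  ", "osense_moist": 0.25},
]

for obtype, total in total_impact_by_obtype(records).items():
    print(f"{obtype}: {total} J/kg")
```

The same helper works for osense_kin or osense_dry by passing a different `field`.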