Skip to content
Ricardo M Campos edited this page Oct 25, 2022 · 1 revision

Introduction

  The third-generation spectral wave model which is actively used for operational wave forecasts, WAVEWATCH III (WW3), is a five-dimensional numerical model covering: space (latitude, longitude), time, and spectrum (direction and frequency). Therefore, the quality of simulations can be evaluated through these dimensions as well as the error signal that can be tracked and sourced based on the model architecture. Additionally, when the model is integrated covering a sufficiently long period of time, and reliable observations are available, a proper validation is possible using error metrics, which can be expanded to analyze the performance in time-domain, probability-domain, and spectral-domain. By considering the dimensional structure of WW3 as well as the characteristics of the outputs through signal processing covering the domains aforementioned, the quality of wave simulations can be assessed. Note that spectral wave models’ extra dimensions require a different type of validation and visualization tools compared to other modeling systems. The existing tools employ conventional methods for statistical analysis, some lack proper skill to quantify spectral wave models. This is an opportunity to deeply investigate the skill and limitations of WW3 simulations using simple visualization, validation, and statistical tools.
  In this context, the repository presented here provide a variety of tools to download observations, visualize the wave data (ww3 results and buoys), perform a model validation, and plot statistical results. The descriptions in the header of each program and function as well as the examples provide practical guidance through the aforementioned steps.

Package structure

  This repository contains the WW3-tools python package, designed to facilitate the visualization and validation of WW3 results with easy-to-handle scripts and functions. The first part is dedicated to the visualization, where codes to plot wave fields and point-outputs (time-series and spectrum) are included. Examples of “wave panels” were also developed, and users can directly visualize in one figure the five dimensions of WW3 results and compare it with buoys – ideal for case studies. Among the wave panels, there are examples to compare two WW3 simulations, side-by-side, accompanied by buoy data – for a straightforward visual comparison.
  The second and most important part of the package is the validation, which is ready to read buoy data (NDBC and Copernicus) and satellite altimeter data for WW3 comparisons. The four steps for this task are: (1) pre-processing a grid mask generated with the grid characteristics of WW3 as well as water depth, distance to coast, and information of ocean names and NWS marine zones; (2) cyclone map, generated using IBTracks cyclone information, inserted into the lat/lon grid mask, which allows the option of evaluating the model for cyclonic and non-cyclonic conditions; (3) collocation of WW3 with observations, creating matchups model/buoy or model/satellite data, while also inserting the grid-mask and cyclone information. Step (3) creates a netcdf file with model and observations for the exact time and location, with a suitable data structure that makes easy to sub-sample the later assessments based on water depth (and/or distance to coast), ocean basin, forecast area of interest, and inside/outside the cyclones. The final step (4) contains a function to calculate summary statistics and percentiles of model and observations, another function to calculate error metrics, and several functions for validation plots, such as: QQ-plots, scatter plots, time-series of model and buoy, Taylor diagrams etc.
  Scripts to download publicly available observations (altimeters and buoys) and wave forecasts (NOAA GFS and GEFS) are also available so users can, for example, plot and validate the NOAA’s deterministic and ensemble forecasts for any location of interest, using reliable quality-controlled observations.
  In the first page of the WW3-tools repository users will see dependencies.py, containing all the python dependencies required to run the python codes in the directories. If you manage to successfully run it, then no dependency problems are expected whilst you work with the python scripts.
  It is important to mention that all WW3 variables as well as NDBC and Copernicus data hold the metric standard, so wave heights are in meters, periods in seconds etc.

$\textcolor{darkblue}{postproc}$

  This directory is for post-processing and visualization of WW3 simulations and buoy data.
wread.py is a key python script containing several functions to read WW3 data (field outputs, table of point outputs, and spectrum), NDBC and Copernicus buoy data (table of integrated parameters and spectrum). These functions are called by many codes in this repository so it is recommended that you make sure this is saved (or symbolic link, ln -s) in the directory you are running the visualization and validation scripts, or its path is exported. The header of wread.py shows all the details, inputs and outputs, and examples. More specific information can be found in the header of each function as well.
  The WW3 model provides three output types that are plotted by three python scripts, associated with: field outputs, time-series of point-output, and spectral point-output. These three python scripts are respectively ww3fields.py, ww3pointimeseries.py, ww3pointspec.py.
  The program ww3fields.py is designed to read both netcdf and grib2 formats, and there are options to customize the area and time steps to be plotted. The code is executed for a specific variable defined by the user as an input argument, so entering the file name and variable is mandatory. By using a structure with arguments, common to all the visualization codes, it is simple to embed it in another script, for example an operational shell script. The next codes, ww3pointimeseries.py and ww3pointspec.py are site-specific, associated with a point-output (single lat/lon). ww3pointimeseries.py plots time-series of integrated parameters for a specific point (input argument) while ww3pointspec.py reads the spectral output for a specific point (input argument) and plots the directional spectra, using a polar plot, and the power spectrum.
  The program ndbcpointspec.py is similar to ww3pointspec.py but applied to NDBC spectrum. It calls the function wread.spec_ndbc that reads and mounts the buoy’s directional wave spectrum, and then plot it. All polar spectra use a square-root based scale for the levels, colors, and colorbar.
  The results of the aforementioned programs are png figures, saved in the directory where the user is running the code. Since matplotlib.use('Agg') has been added to the scripts, no figure is expected to pop-up on the screen.

$\textcolor{darkblue}{download\ observations}$

  This is a directory with codes to help users to download observations from different sources. As initially described, WW3-tools joins altimeter and buoy data. While the python codes are able to work with any kind of measurements, taking into account the format and quality control it is much easier to manage the following databases.

NOAA’s National Data Buoy Center:
https://www.ndbc.noaa.gov/rsa.shtml
https://www.ndbc.noaa.gov/measdes.shtml
https://www.ndbc.noaa.gov/stndesc.shtml
Copernicus buoy data:
https://marine.copernicus.eu/
Integrated Marine Observing System (IMOS) Australian Ocean Data Network (AODN):
https://portal.aodn.org.au/
http://thredds.aodn.org.au/thredds/catalog/IMOS/SRS/Surface-Waves/Wave-Wind-Altimetry-DM00/catalog.html
https://doi.org/10.1038/s41597-019-0083-9

  The NDBC database has multiple options and formats of data, which can be easily downloaded with python scripts. The program get_ndbc_stdmet.py fetches the stdmet format and save the results in ascii. The program retrieve_ndbc_nc.py obtains the data and save it in netcdf format, which is more conveniently used during the validation steps. Both codes require a list of buoy stations to download; a suggestion list is included in the repository with name allbstations.dat. The NDBC spectral data can only be obtained using retrieve_ndbc_nc.py.
  For the Copernicus database, a small shell script was developed, get_buoydata_copernicus.sh, which downloads all the time-series of integrated wave parameters, GL_TS_ , as well as the spectral data, GL_WS_ . Users must register and obtain a user/password that needs to be added in the script.
  The AODN altimeter database is downloaded with get_AODN_AltData.sh, which fetches the data per satellite mission, being separated by hemisphere (N, S). These are input arguments to the code, the exact name of satellites can be found in the header of the code. Due to the large size of the whole data, this structure facilitates the download and data organization. Users will notice buoy data provides a lot of different parameters whereas altimeters only provide significant wave height (Hs) and wind speed (U10). Slightly out of scope, there is an additional program, get_AODN_ScatData.sh, for scatterometer satellite data if users want more wind information.

$\textcolor{darkblue}{validation}$

  Although the general validation principle looks intuitively simple, there are important scientific questions hidden in this process as well as critical complex steps. Above all, the validation must allow analysts to deeply investigate the model’s performance and its deficiencies, without being restricted to oversimplified results, such as for example associated with the RMSE of Hs alone. The challenge of having a complete validation package while preserving the practicability has been obtained through the structure shown in the next figures. The two figures present the validation steps using altimeter and buoy data, which are very similar. The validation scheme using altimeters has one more step in the process, related to the satellite collocation into a given regular grid, i.e, first the satellite data is collocated and then the matchups ww3/satellite are built. Whereas, for the buoy data, fixed in the same position (moored buoys), the matchups ww3/buoy is straightforward.
  The process starts with the download of observations, using the given codes described in the figure, or users can adapt to enter their own measurements. The second step is very important because it builds the foundation of where and how the evaluation will occur, through the lat/lon arrays, water depth, and distance to coast. Including marine zone and offshore zone names are optional. This information is carried through the whole validation process, being very useful at the last steps when users want to determine the errors at coastal zones, or, inversely, to ensure the validation is not affected by the continental shelf or islands, or even to specify the model performance at each ocean and offshore/marine areas. Step (3), associated with the cyclones, is also optional, but it is a great tool that allow users to separate the validation considering cyclonic and non-cyclonic conditions – which used to be a time demanding task that is now simple by using procyclmap.py and IBtracks data.
  The satellite collocation with gridSatGlobal_Altimeter.py involves some criteria for the weighted average using pyresample.kd_tree, associated with maximum distance and time to the centered grid point to select from the altimeter tracks. Default values, following the literature, are included; however, depending on the model domain and resolution, users should customize these criteria. The following step, with the model/observation collocation, generates an array with simple structure and dimension. For model/satellite, it is a two-dimensional array, where each column is one feature (model_hs, obs_hs, time, lat, lon, cycloneInfo etc) and the lines are the records of each matchup that has been reshaped. Therefore, if users want to subsample the validation for an specific time interval and location it can be easily done in python by selecting the indexes; for example ind=np.where(cyclone>2) to select model_hs[ind] and obs_hs[ind] for an analysis within the cyclones. The same is valid to select specific oceanic areas, marine/offshore zones, water depths intervals, latitude ranges etc. The main difference between buoy/ww3 matchups and altimeter/ww3 matchups, that must be considered at the final validation steps, is that the buoys/ww3 matchups have three dimensions instead of two, being this extra one associated with buoyID.
  Having the arrays of model and observations for the same time and location is the basis for the last part of the process, which includes two scripts with functions for the validation statistics and plots. These functions can be included in the main script, designed by the user, to finally obtain the summary statistics comparisons, error metrics, and plots. There is a WW3-tools directory with examples.

Screenshot

Figure 1 - Flowchart of the wave validation process using altimeter data.

Screenshot

Figure 2 - Flowchart of the wave validation process using buoy data.

  The validation package was designed using the methodology described in the following papers, which can be of great support for users to improve and to better interpret the assessments.
  Mentaschi L, Besio G, Cassola F, Mazzino, A., 2013. Problems in RMSE- based wave model validations. Ocean Model 72:53–58. https://doi.org/10.1016/j.ocemod.2013.08.003
  Mentaschi, L., Besio, G., Cassola, F., Mazzino, A., 2013. Developing and validating a forecast/hindcast system for the Mediterranean Sea. J. Coastal Res. SI 65, 1551– 1556. http:/dx.doi.org/102112/SI65-262.1.
  Jolliff, J.K., Kindle, J.C., Shulman, I., Penta, B., Friedrichs, M.A.M., Helber, R., Arnone, R.A., 2009. Summary diagrams for coupled hydrodynamic-ecosystem model skill assessment. Journal of Marine Systems, 76, 64–82. https://doi.org/10.1016/j.jmarsys.2008.05.014
  Campos, R.M., Alves, J.H.G.M., Penny, S.G., Krasnopolsky, V., 2020. Global assessments of the NCEP Ensemble Forecast System using altimeter data. Ocean Dynamics, 70, 405-419, https://doi.org/10.1007/s10236-019-01329-4
  Willmott C, Matsuura, K., 2005. Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Clim Res 30(79–82).
  Hanna, S., Heinold, D. 1985. Development and application of a simple method for evaluating air quality. In: API Pub. No. 4409, Washington, DC, Washington, USA.

$\textcolor{darkblue}{operational\ forecast}$

  Similar to download_observations directory, this place contains programs and functions to download publicly available data, namely operational wave forecast data. Two options are included, for NOAA’s deterministic (GFS) and ensemble (GEFS) wave forecasts:
https://www.ftp.ncep.noaa.gov/data/nccf/com/gfs/prod
https://ftpprd.ncep.noaa.gov/data/nccf/com/gens/prod
  The operational forecast download and post-processing is restricted to linux environment and require the programs: wget, perl, CDO, NCO, and wgrib2. Inside lib/, users can edit the shell script util.sh to define minimum and maximum latitude and longitude to save the output netcdf files. Inside config/, users can edit compress.conf and renames.conf to customize the compression level/resolution and the final variable names saved in the netcdf. This part is very useful to avoid confusion, for example, of some centers calling significant wave height as swh while others use Hs. It ensures all the output netcdf files keep the same variable names.

$\textcolor{darkblue}{examples}$

  This final directory contains practical examples of visualization codes developed to evaluate and compare ww3 simulations. The wave panes included gathers many tools described above and can be used as a suggestion of visualization and model evaluation. The directory is under constant development, including small practical examples of ww3 simulations and validations.