Skip to content

Latest commit

 

History

History
283 lines (212 loc) · 16.9 KB

ch03.adoc

File metadata and controls

283 lines (212 loc) · 16.9 KB

Description of the Data

The attributes described in this section are used to provide a description of the content and the units of measurement for each variable. We continue to support the use of the units and long_name attributes as defined in COARDS. We extend COARDS by adding the optional standard_name attribute which is used to provide unique identifiers for variables. This is important for data exchange since one cannot necessarily identify a particular variable based on the name assigned to it by the institution that provided the data.

The standard_name attribute can be used to identify variables that contain coordinate data. But since it is an optional attribute, applications that implement these standards must continue to be able to identify coordinate types based on the COARDS conventions.

Units

The units attribute is required for all variables that represent dimensional quantities (except for boundary variables defined in [cell-boundaries] and climatology variables defined in [climatological-statistics] ). The value of the units attribute is a string that can be recognized by UNIDATA’s Udunits package [UDUNITS], with a few exceptions that are given below. The Udunits package includes a file udunits.dat, which lists its supported unit names. Note that case is significant in the units strings.

The COARDS convention prohibits the unit degrees altogether, but this unit is not forbidden by the CF convention because it may in fact be appropriate for a variable containing, say, solar zenith angle. The unit degrees is also allowed on coordinate variables such as the latitude and longitude coordinates of a transformed grid. In this case the coordinate values are not true latitudes and longitudes which must always be identified using the more specific forms of degrees as described in [latitude-coordinate] and [longitude-coordinate].

Units are not required for dimensionless quantities. A variable with no units attribute is assumed to be dimensionless. However, a units attribute specifying a dimensionless unit may optionally be included. The Udunits package defines a few dimensionless units, such as percent, but is lacking commonly used units such as ppm (parts per million). This convention does not support the addition of new dimensionless units that are not udunits compatible. The conforming unit for quantities that represent fractions, or parts of a whole, is "1". The conforming unit for parts per million is "1e-6". Descriptive information about dimensionless quantities, such as sea-ice concentration, cloud fraction, probability, etc., should be given in the long_name or standard_name attributes (see below) rather than the units .

The units level, layer, and sigma_level are allowed for dimensionless vertical coordinates to maintain backwards compatibility with COARDS. These units are not compatible with Udunits and are deprecated by this standard because conventions for more precisely identifying dimensionless vertical coordinates are introduced (see [dimensionless-vertical-coordinate]).

The Udunits syntax that allows scale factors and offsets to be applied to a unit is not supported by this standard. The application of any scale factors or offsets to data should be indicated by the scale_factor and add_offset attributes. Use of these attributes for data packing, which is their most important application, is discussed in detail in [packed-data].

Udunits recognizes the following prefixes and their abbreviations.

Table 3.1. Supported Units
Factor Prefix Abbreviation Factor Prefix Abbreviation

1e1

deca,deka

da

1e-1

deci

d

1e2

hecto

h

1e-2

centi

c

1e3

kilo

k

1e-3

milli

m

1e6

mega

M

1e-6

micro

u

1e9

giga

G

1e-9

nano

n

1e12

tera

T

1e-12

pico

p

1e15

peta

P

1e-15

femto

f

1e18

exa

E

1e-18

atto

a

1e21

zetta

Z

1e-21

zepto

z

1e24

yotta

Y

1e-24

yocto

y

Long Name

The long_name attribute is defined by the NUG to contain a long descriptive name which may, for example, be used for labeling plots. For backwards compatibility with COARDS this attribute is optional. But it is highly recommended that either this or the standard_name attribute defined in the next section be provided to make the file self-describing. If a variable has no long_name attribute then an application may use, as a default, the standard_name if it exists, or the variable name itself.

Standard Name

A fundamental requirement for exchange of scientific data is the ability to describe precisely the physical quantities being represented. To some extent this is the role of the long_name attribute as defined in the NUG. However, usage of long_name is completely ad-hoc. For some applications it would be desirable to have a more definitive description of the quantity, which would allow users of data from different sources to determine whether quantities were in fact comparable. For this reason an optional mechanism for uniquely associating each variable with a standard name is provided.

A standard name is associated with a variable via the attribute standard_name which takes a string value comprised of a standard name optionally followed by one or more blanks and a standard name modifier (a string value from [standard-name-modifiers]).

The set of permissible standard names is contained in the standard name table. The table entry for each standard name contains the following:

standard name

The name used to identify the physical quantity. A standard name contains no whitespace and is case sensitive.

canonical units

Representative units of the physical quantity. Unless it is dimensionless, a variable with a standard_name attribute must have units which are physically equivalent (not necessarily identical) to the canonical units, possibly modified by an operation specified by either the standard name modifier (see below and [standard-name-modifiers]) or by the cell_methods attribute (see [cell-methods] and [appendix-cell-methods]).

description

The description is meant to clarify the qualifiers of the fundamental quantities such as which surface a quantity is defined on or what the flux sign conventions are. We don"t attempt to provide precise definitions of fundumental physical quantities (e.g., temperature) which may be found in the literature.

When appropriate, the table entry also contains the corresponding GRIB parameter code(s) (from ECMWF and NCEP) and AMIP identifiers.

The standard name table is located at http://cfconventions.org/Data/cf-standard-names/current/src/cf-standard-name-table.xml, written in compliance with the XML format, as described in [standard-name-table-format]. Knowledge of the XML format is only necessary for application writers who plan to directly access the table. A formatted text version of the table is provided at http://cfconventions.org/Data/cf-standard-names/current/build/cf-standard-name-table.html, and this table may be consulted in order to find the standard name that should be assigned to a variable. Some standard names (e.g. region and area_type) are used to indicate quantities which are permitted to take only certain standard values. This is indicated in the definition of the quantity in the standard name table, accompanied by a list or a link to a list of the permitted values.

Standard names by themselves are not always sufficient to describe a quantity. For example, a variable may contain data to which spatial or temporal operations have been applied. Or the data may represent an uncertainty in the measurement of a quantity. These quantity attributes are expressed as modifiers of the standard name. Modifications due to common statistical operations are expressed via the cell_methods attribute (see [cell-methods] and [appendix-cell-methods]). Other types of quantity modifiers are expressed using the optional modifier part of the standard_name attribute. The permissible values of these modifiers are given in [standard-name-modifiers].

Example 3.1. Use of standard_name
float psl(lat,lon) ;
  psl:long_name = "mean sea level pressure" ;
  psl:units = "hPa" ;
  psl:standard_name = "air_pressure_at_sea_level" ;

The description in the standard name table entry for air_pressure_at_sea_level clarifies that "sea level" refers to the mean sea level, which is close to the geoid in sea areas.

Here are lists of equivalences between the CF standard names and the standard names from the ECMWF GRIB tables, the NCEP GRIB tables, and the PCMDI tables.

Ancillary Data

When one data variable provides metadata about the individual values of another data variable it may be desirable to express this association by providing a link between the variables. For example, instrument data may have associated measures of uncertainty. The attribute ancillary_variables is used to express these types of relationships. It is a string attribute whose value is a blank separated list of variable names. The nature of the relationship between variables associated via ancillary_variables must be determined by other attributes. The variables listed by the ancillary_variables attribute will often have the standard name of the variable which points to them including a modifier ([standard-name-modifiers]) to indicate the relationship.

Example 3.2. Instrument data
  float q(time) ;
    q:standard_name = "specific_humidity" ;
    q:units = "g/g" ;
    q:ancillary_variables = "q_error_limit q_detection_limit" ;
  float q_error_limit(time)
    q_error_limit:standard_name = "specific_humidity standard_error" ;
    q_error_limit:units = "g/g" ;
  float q_detection_limit(time)
    q_detection_limit:standard_name = "specific_humidity detection_minimum" ;
    q_detection_limit:units = "g/g" ;

Flags

The attributes flag_values, flag_masks and flag_meanings are intended to make variables that contain flag values self describing. Status codes and Boolean (binary) condition flags may be expressed with different combinations of flag_values and flag_masks attribute definitions.

The flag_values and flag_meanings attributes describe a status flag consisting of mutually exclusive coded values. The flag_values attribute is the same type as the variable to which it is attached, and contains a list of the possible flag values. The flag_meanings attribute is a string whose value is a blank separated list of descriptive words or phrases, one for each flag value. Each word or phrase should consist of characters from the alphanumeric set and the following five: '_', '-', '.', '+', '@'. If multi-word phrases are used to describe the flag values, then the words within a phrase should be connected with underscores. The following example illustrates the use of flag values to express a speed quality with an enumerated status code.

Example 3.3. A flag variable, using flag_values
  byte current_speed_qc(time, depth, lat, lon) ;
    current_speed_qc:long_name = "Current Speed Quality" ;
    current_speed_qc:standard_name = "sea_water_speed status_flag" ;
    current_speed_qc:_FillValue = -128b ;
    current_speed_qc:valid_range = 0b, 2b ;
    current_speed_qc:flag_values = 0b, 1b, 2b ;
    current_speed_qc:flag_meanings = "quality_good sensor_nonfunctional
                                      outside_valid_range" ;

The flag_masks and flag_meanings attributes describe a number of independent Boolean conditions using bit field notation by setting unique bits in each flag_masks value. The flag_masks attribute is the same type as the variable to which it is attached, and contains a list of values matching unique bit fields. The flag_meanings attribute is defined as above, one for each flag_masks value. A flagged condition is identified by performing a bitwise AND of the variable value and each flag_masks value; a non-zero result indicates a true condition. Thus, any or all of the flagged conditions may be true, depending on the variable bit settings. The following example illustrates the use of flag_masks to express six sensor status conditions.

Example 3.4. A flag variable, using flag_masks
  byte sensor_status_qc(time, depth, lat, lon) ;
    sensor_status_qc:long_name = "Sensor Status" ;
    sensor_status_qc:_FillValue = 0b ;
    sensor_status_qc:valid_range = 1b, 63b ;
    sensor_status_qc:flag_masks = 1b, 2b, 4b, 8b, 16b, 32b ;
    sensor_status_qc:flag_meanings = "low_battery processor_fault
                                      memory_fault disk_fault
                                      software_fault
                                      maintenance_required" ;

The flag_masks, flag_values and flag_meanings attributes, used together, describe a blend of independent Boolean conditions and enumerated status codes. The flag_masks and flag_values attributes are both the same type as the variable to which they are attached. A flagged condition is identified by a bitwise AND of the variable value and each flag_masks value; a result that matches the flag_values value indicates a true condition. Repeated flag_masks define a bit field mask that identifies a number of status conditions with different flag_values. The flag_meanings attribute is defined as above, one for each flag_masks bit field and flag_values definition. Each flag_values and flag_masks value must coincide with a flag_meanings value. The following example illustrates the use of flag_masks and flag_values to express two sensor status conditions and one enumerated status code.

Example 3.5. A flag variable, using flag_masks and flag_values
  byte sensor_status_qc(time, depth, lat, lon) ;
    sensor_status_qc:long_name = "Sensor Status" ;
    sensor_status_qc:_FillValue = 0b ;
    sensor_status_qc:valid_range = 1b, 15b ;
    sensor_status_qc:flag_masks = 1b, 2b, 12b, 12b, 12b ;
    sensor_status_qc:flag_values = 1b, 2b, 4b, 8b, 12b ;
    sensor_status_qc:flag_meanings =
         "low_battery
          hardware_fault
          offline_mode calibration_mode maintenance_mode" ;

In this case, mutually exclusive values are blended with Boolean values to maximize use of the available bits in a flag value. The table below represents the four binary digits (bits) expressed by the sensor_status_qc variable in the previous example.

Bit 0 and Bit 1 are Boolean values indicating a low battery condition and a hardware fault, respectively. The next two bits (Bit 2 and Bit 3) express an enumeration indicating abnormal sensor operating modes. Thus, if Bit 0 is set, the battery is low and if Bit 1 is set, there is a hardware fault - independent of the current sensor operating mode.

Table 3.2. Flag Variable Bits (from Example)
Bit 3 (MSB) Bit 2 Bit 1 Bit 0 (LSB)

H/W Fault

Low Batt

The remaining bits (Bit 2 and Bit 3) are decoded as follows:

Table 3.3. Flag Variable Bit 2 and Bit 3 (from Example)
Bit 3 Bit 2 Mode

0

1

offline_mode

1

0

calibration_mode

1

1

maintenance_mode

The "12b" flag mask is repeated in the sensor_status_qc flag_masks definition to explicitly declare the recommended bit field masks to repeatedly AND with the variable value while searching for matching enumerated values. An application determines if any of the conditions declared in the flag_meanings list are true by simply iterating through each of the flag_masks and AND’ing them with the variable. When a result is equal to the corresponding flag_values element, that condition is true. The repeated flag_masks enable a simple mechanism for clients to detect all possible conditions.