How can we encode a DSG trajectory that has two identifiers? #282

Open
davidhassell opened this issue Mar 1, 2024 · 4 comments
Labels
question Further information is requested or discussion invited

Comments

@davidhassell
Collaborator

Hello,

I am creating a DSG trajectory dataset for meteorological research flight paths that each have two separate identifiers:

  1. The name of the route flown
  2. The unique ID of each individual flight

Multiple trajectory features can have the same route name (because the same route is flown on multiple occasions), but no two features have the same ID.

The ID seems to me like the best fit for the auxiliary coordinate variable with cf_role=trajectory_id, but what about the route name?

Can we store the route names in another auxiliary coordinate variable with the standard name region? The conventions say that "When data is representative of geographic regions which can be identified by names [...]. We recommend that the names be chosen from the list of standardized region names whenever possible", which seems OK. However, the description of the region standard name contradicts this: "These strings are standardised. Values must be taken from the CF standard region list.". One of these is clearly wrong!

Any thoughts on this would be appreciated, many thanks,
David

davidhassell added the question label Mar 1, 2024
@taylor13

taylor13 commented Mar 1, 2024

I think "region" should be reserved for a 2-dimensional geographical area, and when two items are in the same location they should belong to the same region. If a friend and I take walks in a park along different paths (which perhaps cross), I think we are both in the same "park region".

A standard name is not a requirement, so you could define an auxiliary coordinate without a standard name and give it the long_name "route" or "path", or some such. In CMIP6, we defined an ordinary coordinate called "site", which distinguished among about 200 CFMIP sampling locations scattered globally. This was a simple index coordinate which was not assigned a standard_name but with long_name="site index".

I agree that there is an inconsistency in the region description that needs to be cleaned up, but I don't think it should be used for "route" (unless no routes intersect and each is constrained to a single geographically-recognized region).

@larsbarring

I agree with Karl that using region in that way seems to stretch it a bit too far. As an alternative to using only a long name, how about using trajectory_id for 1. The name of the route flown, assuming that it is a limited set of routes (i.e. trajectories) that are flown many times, and then using the existing standard name realization for 2. The unique ID of each individual flight? The description of realization reads:

Realization is used to label a dimension that can be thought of as a statistical sample, e.g., labelling members of a model ensemble.

That is, for each route (trajectory) you have an ensemble of individual flights. Does this make sense?
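
To make the "ensemble of flights per route" idea concrete, here is a minimal sketch in plain Python. It does not use any netCDF library, and the route names and flight IDs are invented purely for illustration; `group_flights_by_route` is a hypothetical helper, not part of any CF tooling.

```python
from collections import defaultdict

# Hypothetical example data: one entry per flight (i.e. per trajectory feature).
routes = ["north_loop", "coastal", "north_loop", "coastal", "north_loop"]
flight_ids = ["F001", "F002", "F003", "F004", "F005"]


def group_flights_by_route(routes, flight_ids):
    """Map each route name to the list of flight IDs that flew it."""
    if len(routes) != len(flight_ids):
        raise ValueError("routes and flight_ids must have the same length")
    ensembles = defaultdict(list)
    for route, flight in zip(routes, flight_ids):
        ensembles[route].append(flight)
    return dict(ensembles)


# Each route then has its own "ensemble" of individual flights.
ensembles = group_flights_by_route(routes, flight_ids)
for route, members in ensembles.items():
    print(route, members)
```

Under this reading, the route name labels the group and the flight ID labels the member within it.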

@davidhassell
Collaborator Author

Hi @taylor13 and @larsbarring,

Many thanks for your advice.

I think it is clear that a standard name of region is not appropriate here.

I quite liked the idea of putting cf_role = trajectory_id on 1. (i.e. the route names auxiliary coordinate variable) since we could attach standardised attribute values to everything. However it may not be ideal, since the conventions say "The variable carrying the cf_role attribute ... must provide a unique identifier for each feature instance." (my emphasis), and the route names are not unique over the set of flights.
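
The uniqueness requirement quoted above can be checked with a few lines of plain Python. This is only a sketch: the values are invented for illustration, and `duplicated_values` and `can_carry_cf_role` are hypothetical helper names rather than functions from any CF library.

```python
from collections import Counter

# Hypothetical example data: one entry per flight (i.e. per trajectory feature).
routes = ["north_loop", "coastal", "north_loop", "coastal", "north_loop"]
flight_ids = ["F001", "F002", "F003", "F004", "F005"]


def duplicated_values(values):
    """Return the values that occur more than once, in first-seen order."""
    counts = Counter(values)
    return [value for value in counts if counts[value] > 1]


def can_carry_cf_role(values):
    """True if every feature gets a distinct identifier."""
    return not duplicated_values(values)


print("flight IDs unique:", can_carry_cf_role(flight_ids))
print("route names unique:", can_carry_cf_role(routes))
print("repeated routes:", duplicated_values(routes))
```

With data like this, only the flight IDs pass the check, which is why the route names are not a good fit for cf_role.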

So, given that, what I think I'll be going for is:

string route(n_flights) ;
    route:long_name = "Name of route for each flight (some routes are repeated)" ;
string flight(n_flights) ;
    flight:cf_role = "trajectory_id" ;
    flight:long_name = "Unique ID of each flight" ;

Does that look OK?
David

@taylor13

taylor13 commented Mar 4, 2024

Not being wholly familiar with trajectory_id or cf_role, I shouldn't have the last word, but it makes sense to me given your above summary.
Karl
