Coordinates: Origin/Offset, GeoTransform, OGC Domain set #17
Ouch, I hadn't seen that CF "coordinate subsampling" method yet (always amazed by the universe of possibilities offered by the CF spec...). This is super complex. This is a kind of subsampling geolocation array, but with several subsampling/interpolation sub-areas. GeoTIFF is much, much simpler than that (and far less capable). The basic model of GeoTIFF is to have a geotransformation matrix with 6 terms:
to express that [Xgeoref Ygeoref 1] = M * [column row 1 ] |
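The relation above can be sketched in Python. This is a minimal sketch that assumes the ordering of the six terms documented in the GDAL raster data model (origin, column step, row step for X, then the same for Y); the function name `pixel_to_georef` and the sample values are hypothetical.

```python
def pixel_to_georef(gt, column, row):
    """Apply a six-term affine geotransform (GDAL ordering assumed).

    gt = (x_origin, x_per_column, x_per_row,
          y_origin, y_per_column, y_per_row)
    """
    x = gt[0] + column * gt[1] + row * gt[2]
    y = gt[3] + column * gt[4] + row * gt[5]
    return x, y


# Hypothetical north-up raster: 10 m pixels, origin at (500000, 4600000).
gt = (500000.0, 10.0, 0.0, 4600000.0, 0.0, -10.0)

print(pixel_to_georef(gt, 0, 0))
print(pixel_to_georef(gt, 3, 2))
```

For a north-up raster the two rotation terms are zero and the Y step is negative, because rows count downward while northing increases upward.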
This is helpful. I've not read the CF1.10 8.3 closely yet -- I wonder if you could actually use some simple subset of what's possible there to support the "deduce the values of the M matrix" is something that I think we need to get away from. I know it's possible and done in various software packages, but I don't think it is "right". An "as simple as possible" convention for this could be: (this is just a strawman)
|
yes, this is indeed fragile, especially when file creators decide to use Float32 instead of Float64, in which case you need to add fuzziness in your comparison for identical spacing.
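The fragility can be illustrated with a small sketch. It uses only the standard library and hypothetical values: a coordinate axis with a 0.3 unit spacing is rounded to Float32, after which consecutive differences are no longer uniform and a regularity check only passes with a generous tolerance. The helper names `to_float32` and `spacing_is_regular` are made up for illustration.

```python
import struct


def to_float32(value):
    """Round a Python float (Float64) to the nearest Float32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def spacing_is_regular(coords, rel_tol):
    """True if every step is within rel_tol of the mean step."""
    steps = [b - a for a, b in zip(coords, coords[1:])]
    mean = sum(steps) / len(steps)
    return all(abs(s - mean) <= rel_tol * abs(mean) for s in steps)


# Hypothetical projected X axis: origin 500000, spacing 0.3.
x64 = [500000.0 + 0.3 * i for i in range(100)]
x32 = [to_float32(v) for v in x64]

print(spacing_is_regular(x64, 1e-6))  # Float64: tight tolerance is enough
print(spacing_is_regular(x32, 1e-6))  # Float32: tight tolerance fails
print(spacing_is_regular(x32, 0.1))   # Float32: needs a fuzzy comparison
```

Near 500000 a Float32 value can only land on multiples of 0.03125, so a 0.3 step is stored as either 0.28125 or 0.3125, which is why the tolerance has to be so loose.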
This is quite close to what GDAL does.
But I realize that putting the |
Oh interesting.... Great minds think alike I guess! :) In your example, the "projection_x_coordinate" and "projection_y_coordinate" are superfluous for clients that work with the "GeoTransform" attribute of the grid mapping, correct? I actually think this is a great solution. If, for GeoZARR, we relax the CF-NetCDF requirement to include the explicit coordinate variables, this structure would "just work" with gdal and not be totally incompatible with CF. So we could add a clause to GeoZarr Coordinates:
(some text more carefully crafted than that?) |
yes. They are there for CF-only capable readers. GDAL-aware code will use the geotransform attribute when present.
You probably want to include the equations I put above to remove any ambiguity. https://gdal.org/user/raster_data_model.html#affine-geotransform may also help. Something that also needs to be specified is what is the convention to interpret which "part" of a pixel a georeferenced coordinate matches: that is pixel-corner vs pixel-center convention. Or support both. |
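To make the pixel-corner vs pixel-center question concrete, here is a minimal sketch of converting between the two conventions. It assumes the GDAL convention, where the origin terms refer to the top-left corner of the top-left pixel, so the center of pixel (0, 0) sits at (column, row) = (0.5, 0.5). The function names are hypothetical.

```python
def corner_to_center(gt):
    """Shift the origin from the corner of pixel (0, 0) to its center."""
    x0, x_col, x_row, y0, y_col, y_row = gt
    return (x0 + 0.5 * x_col + 0.5 * x_row, x_col, x_row,
            y0 + 0.5 * y_col + 0.5 * y_row, y_col, y_row)


def center_to_corner(gt):
    """Inverse shift: origin from the center of pixel (0, 0) to its corner."""
    x0, x_col, x_row, y0, y_col, y_row = gt
    return (x0 - 0.5 * x_col - 0.5 * x_row, x_col, x_row,
            y0 - 0.5 * y_col - 0.5 * y_row, y_col, y_row)


# Hypothetical corner-based geotransform with 10 m pixels.
corner_gt = (500000.0, 10.0, 0.0, 4600000.0, 0.0, -10.0)
center_gt = corner_to_center(corner_gt)
print(center_gt)
```

Only the two origin terms change; the four step terms are identical in both conventions, which is why a convention that is left unstated produces a silent half-pixel shift.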
Center seems to be the more sensible default, no? Any reason to support both? |
sounds reasonable
Probably not. This is an endless source of confusion. Actually I believe this topic mixes 2 things, which are often interleaved when this is discussed:
|
I wouldn't mix coordinates with auxiliary variables, which are meant for another purpose. Instead, I would suggest:
|
I was not aware that the Can you explain why you want to introduce the RegularAxis and IrregularAxis and use lowerBound/upperBound/resolution rather than the geoTransform as I suggested in #19 ? |
It's just one suggestion. I don't know geoTransform and have nothing against it. |
Why would you put these six numbers in a character string, and not in a data variable? |
It's a fair question. I think because, to do that, you would have to declare a dimension and a whole data variable that would have to integrate into the data format. What we want is a compact attribute to contain the values. It's not very common, but you can make numeric vector attributes in NetCDF. I have to admit ignorance with ZARR's handling of attributes though. Will ZARR support attributes that are a vector of six doubles? Answering my own question here. Yes, that would work. I was just being a bit lazy with my "as simple as possible" suggestion. https://zarr.readthedocs.io/en/stable/tutorial.html#user-attributes
|
If there is a remote possibility that the conversion binary -> text -> binary doesn't return the exact binary number you started with, I'd opt for not doing that conversion, as a write & read step will result in a slightly differently positioned raster, and downstream software will say these rasters don't match. |
The issue with binary->text conversion will necessarily occur with Zarr, JSON being a text format:
|
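The round-trip concern can be checked with a short sketch. It relies on one fact about Python specifically: the standard `json` module writes floats with `repr()`, which produces the shortest string that parses back to the same double, so a Float64 vector survives binary -> text -> binary unchanged. Other JSON writers, or text written with a fixed number of digits, may not give the same guarantee. The attribute name `GeoTransform` and the values are illustrative only.

```python
import json

# Hypothetical six-term geotransform with values that are not exactly
# representable in decimal.
geotransform = [500000.0, 0.1 + 0.2, 0.0, 4600000.0, 0.0, -1.0 / 3.0]

# Round trip through JSON text, as a Zarr attribute would be stored.
text = json.dumps({"GeoTransform": geotransform})
restored = json.loads(text)["GeoTransform"]
exact = restored == geotransform

# Round trip through text with a fixed number of decimals.
truncated = [float(f"{v:.6f}") for v in geotransform]
lossy = truncated != geotransform

print(exact, lossy)
```

The risk described above is therefore a property of how the numbers are formatted, not of text encoding as such.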
@rouault -- So you are saying that we should adopt an attribute that is a length six double precision vector and that the space delimited netcdf attribute currently used by gdal would be supported but not in a convention? |
yes
That's a GDAL netCDF specific implementation detail. There's no reason to keep it in GeoZarr |
I just made a Pangeo forum post on some of the implications of this issue. I feel I understand it much better now after writing this up: https://discourse.pangeo.io/t/example-which-highlights-the-limitations-of-netcdf-style-coordinates-for-large-geospatial-rasters/4140 We have not settled on the right way to encode binary data in Zarr attrs yet, although there has been plenty of discussion. We should find a way to move forward on this. I gave some ideas on the Python side in that forum post. |
It would be great for Zarr to have support for capturing an arbitrary linear mapping from pixel to world coordinates. I can't comment on the exact format chosen, GDAL convention or just an Affine matrix with/without the last row (0,0,1), so long as a relationship with 6 degrees of freedom between pixel and world coordinates can be recorded in Zarr without having to render the coordinates of every pixel separately. One note: make sure to clearly define where |
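The two encodings mentioned here carry the same six numbers in a different order. A minimal sketch of the correspondence, assuming the GDAL ordering (origin first, then the two step terms, for X and then Y); the function names are hypothetical:

```python
def gdal_to_matrix(gt):
    """Six GDAL-ordered terms -> 3x3 affine matrix acting on (column, row, 1)."""
    return [[gt[1], gt[2], gt[0]],
            [gt[4], gt[5], gt[3]],
            [0.0, 0.0, 1.0]]  # constant last row, carries no information


def matrix_to_gdal(m):
    """3x3 (or 2x3) affine matrix -> six GDAL-ordered terms."""
    return (m[0][2], m[0][0], m[0][1], m[1][2], m[1][0], m[1][1])


def apply_matrix(m, column, row):
    """Multiply the matrix by the homogeneous pixel vector."""
    vec = (column, row, 1.0)
    x, y, _ = (sum(a * b for a, b in zip(matrix_row, vec)) for matrix_row in m)
    return x, y


gt = (500000.0, 10.0, 0.0, 4600000.0, 0.0, -10.0)
m = gdal_to_matrix(gt)
print(apply_matrix(m, 3, 2))
```

Whichever layout is chosen, the ordering has to be spelled out in the convention, since both are plain vectors of six doubles and cannot be told apart by inspection.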
Over in the Xarray discussion I have proposed a very concrete task that could be used to advance this issue:
Is anyone here game to try? |
GeoTransform support in GeoZarr aims to bridge the gap with the GDAL community by simplifying coordinate encoding, replacing explicit coordinates with a more efficient representation. GeoZarr also focuses on being data format-agnostic, aligning with the Open Geospatial Consortium (OGC) specifications that have evolved over 20 years to provide an abstract data representation. While the OGC Coverage Implementation Schema (CIS) and related specifications are complex and challenging to understand, they offer an open and flexible approach. I tried to summarize the basics of both ways of describing the link from pixel coordinates to geographic coordinates of gridded data.
GDAL GeoTransform
In GDAL,
The projection itself is encoded in a separate spatial reference system (SRS), often provided as a WKT (Well-Known Text) string or a PROJ.4 string in the raster metadata, typically stored with the raster file.
OGC Coverages (based on CIS)
The
Comparison
Both OGC Coverages and GDAL GeoTransform define the resolution of the data grid: GDAL uses Pixel Width and Pixel Height, and OGC uses the resolution property.
OGC CIS: ❓ does it support affine transformations?
|
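As a rough illustration of the comparison, the sketch below maps a pair of regular axes described by lowerBound/upperBound/resolution (the properties mentioned earlier in this thread) onto a six-term geotransform. It is not taken from the CIS specification: it assumes the bounds are the outer edges of the grid, that the raster is north-up, and that the axes are plain dictionaries, all of which are illustrative choices.

```python
def axes_to_geotransform(x_axis, y_axis):
    """Build a north-up geotransform and grid size from two regular axes.

    Each axis is a dict with 'lowerBound', 'upperBound' and 'resolution'
    (assumed here to describe the outer edges of the grid).
    """
    gt = (x_axis["lowerBound"], x_axis["resolution"], 0.0,
          y_axis["upperBound"], 0.0, -y_axis["resolution"])
    width = round((x_axis["upperBound"] - x_axis["lowerBound"])
                  / x_axis["resolution"])
    height = round((y_axis["upperBound"] - y_axis["lowerBound"])
                   / y_axis["resolution"])
    return gt, (width, height)


x_axis = {"lowerBound": 500000.0, "upperBound": 500100.0, "resolution": 10.0}
y_axis = {"lowerBound": 4599900.0, "upperBound": 4600000.0, "resolution": 10.0}

gt, size = axes_to_geotransform(x_axis, y_axis)
print(gt, size)
```

In this sketch the two rotation terms are always zero, so a per-axis bounds/resolution description covers only the axis-aligned subset of what the six-term transform can express.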
Wanted to highlight that @benbovy's new Xarray PR - pydata/xarray#9543 - introduces the general concept of Coordinate Transforms to Xarray in a way that will be very useful for the GeoZarr effort. |
We need to add a method for encoding origin / offset coordinate variables where the GeoZarr coordinates are not "...a one dimensional Zarr Array that indexes a dimension of a GeoZarr DataArray (e.g latitude, longitude, time, wavelength)."
It would seem that, in essence, we should encode GeoTIFF metadata in a GeoZarr Auxiliary variable
So instead of:
We would have
If this basic approach is agreeable, maybe @rouault would be willing to suggest an approach to encode origin/offset/transform metadata as attributes?
Is there any sense in tailoring / simplifying / extending the approach in CF 1.10 to suit these needs? https://cfconventions.org/Data/cf-conventions/cf-conventions-1.10/cf-conventions.html#example-Two-dimensional-tie-point-interpolation