Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GEOS policy for Z ordinate value population? #1087

Open
vadimcn opened this issue Apr 30, 2024 · 18 comments
Open

GEOS policy for Z ordinate value population? #1087

vadimcn opened this issue Apr 30, 2024 · 18 comments

Comments

@vadimcn
Copy link
Contributor

vadimcn commented Apr 30, 2024

I've observed that in many GEOS operations, such as clipping and interpolation, new polygon vertices are created with a Z coordinate set to NaN. This complicates handling elevation data, as it necessitates manual interpolation of missing values, leading to bugs and sometimes making correct interpolation impossible (e.g. for endpoints of LineStrings resulting from clipping).

It appears that functions like LineSegment::pointAlong could be modified to also interpolate the Z coordinate.

I would like to understand the project's perspective on this issue. Is improved Z handling something GEOS is interested in, or is the omission of Z values a deliberate choice?

A similar question could be raised regarding 'M' values, although the appropriate method for interpolating them is less clear.

@pramsey
Copy link
Member

pramsey commented Apr 30, 2024

I am surprised you are seeing missing Z's in clipping, as the constructive geometry overlay engine does try to in-fill the new points with interpolated Z. Which gives you a hint as to our policy, which is that we're "OK" with it, and for places that currently do not do it, would be fine with adding it.

@vadimcn
Copy link
Contributor Author

vadimcn commented May 2, 2024

I am surprised you are seeing missing Z's in clipping

I was using shapely's clip_by_rect on a LineString (which maps to GEOSClipByRect, as I understand). This does work when using the general intersection function.

Which gives you a hint as to our policy [...]

I think there should be explicit guidelines on how Z coordinates are handled, to avoid confusion among project users and contributors. Although the expected result is clear in the example mentioned, there are other scenarios where it's less obvious. For example:

  • What happens when intersecting two geometries that both have Z coordinates? Should the Z values of newly created points derive from the first geometry, the second, or a combination of both? (FWIW, for both intersection(linestring_z, polygon_z) and intersection(polygon_z, linestring_z), the Z coordinate of the resulting geometry is always that of the linestring_z)
  • What should occur when clipping a polygon with another smaller polygon? Should the Z values of newly created points conform to the 3D plane of the original polygon? And what if the original polygon wasn't planar?
    Empirically, intersection of POLYGON Z ((2 -2 1, 2 2 2, -2 2 1, -2 -2 4, 2 -2 5)) and POLYGON ((1 -1, 1 1, -1 1, -1 -1, 1 -1)) produces POLYGON Z ((-1 -1 4, -1 1 1, 1 1 2, 1 -1 3, -1 -1 4)). What rule was used for deriving Z?
  • How should normalization be handled for polygons and lineRings if the Z coordinates of the first and last points differ?

And so on...

@tfiner
Copy link

tfiner commented May 2, 2024

I just got bit by this define deep down in CoordinateSequence:
GEOS_COORDSEQ_PADZ which erroneously makes space for Z's, but then crucially, does not set hasZ!
Yet, when you ask a CoordinateSequence for its CoordinateType, it returns XYZ.
Sigh. If I bother to construct a CoordinateSequence, and I pass in hasZ=false, then I expect the CoordinateSequence to be: XY. Likewise when I pass in a 2 for dim, it ignores that and makes it an XYZ, yet because it doesn't set hasZ, it also lies and returns getDimension() == 3.

It is never a good design to lie in an API, be explicit, and if you want a default with dimension==3, then make that explicit, but give users the ability to make 2 dimensional CoordinateSequences...

@dr-jts
Copy link
Contributor

dr-jts commented May 3, 2024

I just got bit by this define deep down in CoordinateSequence: GEOS_COORDSEQ_PADZ which erroneously makes space for Z's, but then crucially, does not set hasZ!

@tfiner This should be filed as an issue.

@tfiner
Copy link

tfiner commented May 3, 2024

Done, see #1090

@dbaston
Copy link
Member

dbaston commented May 3, 2024

I just got bit by this define deep down in CoordinateSequence:
GEOS_COORDSEQ_PADZ which erroneously makes space for Z's, but then crucially, does not set hasZ!

This is by design - see the comment where GEOS_COORDSEQ_PADZ is defined.

hasZ() reports whether the sequence stores XYZ coordinates, which is different from whether it is capable of storing XYZ coordinates.

@tfiner
Copy link

tfiner commented May 3, 2024

I read the comment (buried in the cpp file mind you), I am saying that it was not a good design choice. Offering dimension parameters that are ignored is surprising to say the least, even if it helped someone trying to load 3D points. They should have to specify that, or, you could use the type system so that it would be impossible to screw up.

@dbaston
Copy link
Member

dbaston commented May 3, 2024

It doesn't have anything to do with helping someone trying to load 3D points. It has to do with incrementally fitting a variable-dimension capability into a 22-year old library that assumed for 20 years that all coordinates are XYZ. I think this project is pretty receptive to improvements, so if you have a proposal to just use the type system I'd encourage you to share it.

@tfiner
Copy link

tfiner commented May 3, 2024

I don't know enough about GEOS, but does it make sense to make templated collections like good old STL? Another option might be to extend the type system to the collections, you already have CoordinateXY and CoordinateXYZ, etc. maybe CoordinateSequence2d vs. CoordinateSequence3d, etc.

BTW, that comment for the define says: "This prevents incorrect results when an XYZ Coordinate is read from a sequence storing XY or XYM.".. You can see why I thought that this was to help someone load XYZ.

That comment also implied that the define was temporary, not something to cope with a 22 year old library. I am new here, and I mentioned this because I needed to project a bunch of points in some 2d polygons, and could not bring myself to project them, one at a time. So I started looking at the pointers to the Coordinates and was completely surprised that although I had given them 2d Coordinates, I had a whole bunch of Nans in between my XY's.

@hobu
Copy link
Member

hobu commented May 3, 2024

I don't know enough about GEOS, but does it make sense to make templated collections like good old STL? Another option might be to extend the type system to the collections, you already have CoordinateXY and CoordinateXYZ, etc. maybe CoordinateSequence2d vs. CoordinateSequence3d, etc.

The place to type yourself to death is Boost::Geometry, but I don't think you'll find life so much better than GEOS as far as dealing with coordinate attributes other than XY. You'll be frustrated with type casting costs, wonder why it makes you differentiate coordinate systems in the type system, and lament having to write your own implementations of basic utilities to i/o common serializations into the library.

If you want a library for high dimensionality points, GEOS isn't what you want. PDAL (GEOS-style memory model) or PCL (types! types! types!) might be what you want, but you haven't stated exactly what you're trying to do.

@tfiner
Copy link

tfiner commented May 3, 2024

Thank you Howard, I am a huge fan of PDAL. Yeah, I tried boost geom awhile back, whew, painful.

It sounds like GEOS has a lot of old code that assumes multidimensional points, up to 4. My point was that the coordinates have types, but that the containers don't. My point about types is that if you are going to play with raw pointers, you'd better tell the truth about them, always.

I was mostly griping that the interface for the CoordinateSequence isn't consistent. What I was trying to do was project points using proj. I wanted a way that, given a CS, I could efficiently transform a few million points in one call. The fastest way to do that is to give proj x,y pointers and their strides. You can't get the stride(), because it is private. You can't ask for hasZ(), because it lies (says 2, when it allocated for 3), so you have to ask for the type, which is XYZ and infer the stride from that. The documentation and the header aren't good enough to tell you that, so you must spelunk the code to find a define to find out why.

All this made me think, there has to be a better way of using types in the interface to prevent whatever invariant is desperately trying to be preserved here to the point that explicit dimension parameters are ignored. If you have to have 3 dimensions, don't pretend to support 2.

@vadimcn
Copy link
Contributor Author

vadimcn commented May 3, 2024

@tfiner @hobu @dbaston I feel that you are discussing a somewhat different issue than the one I've raised originally. Let's not conflate them by mixing in same thread.
May I ask you to please move your comments into the newly created issue #1090 and delete them here? Thanks!

@vadimcn
Copy link
Contributor Author

vadimcn commented May 3, 2024

@pramsey I'd love to contribute a fix to GEOS, but I'd like a hint on how to handle the ambiguous cases such as mentioned here. Even if it's just "It's unspecified behavior, do whichever is easier".

@dr-jts
Copy link
Contributor

dr-jts commented May 3, 2024

I'd like a hint on how to handle the ambiguous cases such as mentioned here. Even if it's just "It's unspecified behavior, do whichever is easier".

Agreed, there should be a wider policy for Z (and M) handling in GEOS. Best place for this is probably somewhere on https://libgeos.org/ (perhaps as an RFC initially, but ideally under a Specification heading. ). This could start as an RFC issue, which allows commenting (perhaps can just use this issue?)

@dr-jts
Copy link
Contributor

dr-jts commented May 3, 2024

There appears to be two parts to your question:

  1. why do some (most?!) operations not do something sensible with Z
  2. what are the policies around computing Z (and M) values for new coordinates?

For (1), the simple reason is that the effort to define and implement Z handling has not been made, due to lack of (prior) demand and resources. PRs will be welcome for this (which will be assisted by some work to address 2)

@dr-jts
Copy link
Contributor

dr-jts commented May 3, 2024

Here's some thoughts for general policies on Z handling:

  • Z values for constructed coordinates should be interpolated based on either (a) neighbouring coordinates or (b) an interpolated 3D planed based on the input Z ordinates. (a) is simpler to implement and computationally more performant, so can be the default approach unless otherwise required. (In particular (a) can be used for for algorithms on lines such as LineSegment methods)
  • for operations which are affected by inconsistent Z input (such as normalizing rings with different start and end Z values) the first Z value should be used, unless this is computationally more expensive than using the final Z value. (Yes, this is slightly under-specified - that reflects the inconsistent nature of the input. GIGO).

More suggestions welcome.

@dr-jts
Copy link
Contributor

dr-jts commented May 3, 2024

See locationtech/jts#626 for a specification of the Z handling added for OverlayNG in JTS (which AFAIK applies to GEOS as well).

This should be summarized in the doc for the OverlayNG class (in both projects). And possibly some of the material can be provided in a page on https://libgeos.org/

@dr-jts dr-jts changed the title GEOS stance on Z coordinate GEOS policy for Z ordinate value population May 5, 2024
@dr-jts dr-jts changed the title GEOS policy for Z ordinate value population GEOS policy for Z ordinate value population? May 5, 2024
@vadimcn
Copy link
Contributor Author

vadimcn commented Jul 27, 2024

Update: I've fixed two cases of not preserving Z in PRs #1094 and #1095.

Not sure if this issue should be closed now, as it has useful information. Maybe you could enable discussions on GEOS repo and move it there?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants