Standardization of common extrabytes #37
3.) For data_type, I think we can make "best" and "worst" practice recommendations that show which scales and offsets use the fewest bytes possible while still getting reasonable resolution.
Other useful stuff under the "common" category. Not fleshed out, just introduced as ideas:
- Group ID: a 32-bit (64-bit?) unsigned int that is used as a group index (or Object ID, OID). For example, all points that "belong" to a specific building roof identified as Object ID 246 will have this tag set to 246. Relationship maps can be built in a VLR or EVLR.
- Sigma X, Y, Z: the standard deviation of the point, expressed in the units of the projection. For geographic data, the units are meters. The values are doubles, or they could just be longs and follow the point scaling.
- Normal: a 3-tuple that defines the normal to the surface at the point of incidence. Direction is opposite the ray direction (toward the laser scanner).
Two issues with the Normal that @lgraham-geocue suggests. (1) Directions are always troublesome because they are difficult to re-project correctly when going from one CRS to another. In the PulseWaves format we've solved this by expressing direction vectors as two points. Re-projecting both is always going to be correct (even if we go to non-Euclidean space). How about a "trajectory" index instead that references "trajectory points" stored in the same LAS file (but marked synthetic) that lie on the trajectory? These "trajectory points" are then given the same index so they can be paired up with the actual returns. (2) Triplets have been deprecated.
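A minimal Python sketch of that two-point idea (helper names are mine; `transform` stands in for an arbitrary reprojection function):

```python
import math

# Store a direction as two points; applying any coordinate transform to the
# points and re-deriving the direction keeps it correct in the new CRS.
def direction_from_points(p0, p1):
    """Unit vector pointing from p0 to p1."""
    dx, dy, dz = (b - a for a, b in zip(p0, p1))
    norm = math.sqrt(dx * dx + dy * dy + dz * dz)
    return (dx / norm, dy / norm, dz / norm)

def reproject_direction(p0, p1, transform):
    """Reproject both endpoints, then recompute the direction."""
    return direction_from_points(transform(p0), transform(p1))
```

Reprojecting the endpoints and recomputing keeps the direction consistent with the points, which a stored unit vector cannot guarantee.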
@rapidlasso What is there to be gained by allowing the data_type to vary for a given ExtraByte definition? I've noticed that you allow it to vary for the "height from ground" extrabyte in your tools, but that's caused my implementations a little trouble when some files have it defined one way while others have it defined another way. I guess this begs the question of why we're standardizing. I believe it's to encourage implementation by more software vendors, which means simplification is key. In my opinion that means guaranteeing a 1-to-1 relationship of the key attributes with a certain EB code. At a minimum I think ... What do you think about releasing a series instead? E.g., "height from ground [cm]" with ...
@rapidlasso Good point about the difficulties with reprojection. I've had this struggle with the Origin vector of FWF data, and I've often wondered whether those vectors are getting modified correctly. Unfortunately, if the points get shifted (e.g., from calibration) I doubt whether any software would also update the point coordinates. That's the advantage of the vector. As you point out, though, the disadvantage is that they're only valid for a given projection.
Just always store the normals in ECEF.
@esilvia, not feeling strongly about the data type issue. Your suggestion is also good as it would prohibit folks from storing "height above ground" or "echo width" as floating point numbers. Now that is something that I really do feel strongly about. How do I allow different data types in LASlib? I have a "get attribute as float value" function to use "extra bytes" for processing so the actual storage format of the extra attribute does not matter in my implementation.
@esilvia and @lgraham-geocue my suggestion is to start this standardization with very few (two or three) additional attributes that are likely to be used or that are already used. "Height above ground" is an obvious candidate for derived (i.e. not new) information. "Echo width" is an obvious candidate for additional (i.e. new) information. I would recommend starting with just those two and seeing how it works out before adding a larger number ...
Yes, I agree. The more complex, the lower the adoption rate. I would like to see Group added in this initial change. It is just an unsigned long (4 byte) or unsigned long long (8 byte). In the initial version, there would be no restrictions on its use other than initializing to zero (meaning no group membership). We could write a short "best practices" on using Group but it would only be a guideline, not a requirement.
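As a toy illustration of that guideline (plain Python; the attribute names are mine, not from any spec):

```python
# Group as an unsigned 32-bit value per point, initialized to zero, where
# zero means "no group membership" and any nonzero ID ties the point to an
# object (e.g. building roof 246). A VLR/EVLR could then map IDs to
# object-level metadata.
NO_GROUP = 0

points = [
    {"x": 1.0, "y": 2.0, "group": NO_GROUP},  # ungrouped point
    {"x": 1.1, "y": 2.1, "group": 246},       # roof object 246
    {"x": 1.2, "y": 2.0, "group": 246},
]

# Select all points belonging to one object by filtering on the ID.
roof_246 = [p for p in points if p["group"] == 246]
assert len(roof_246) == 2
```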
@lgraham-geocue I like the idea of a GroupID/ObjectID attribute. Should the ... What if there are two different kinds of groups that a point could belong to? Should we include recommendations for supporting multiple attributes of the same kind? E.g., GroupID[1], GroupID[2], etc.?
Any preference on how to differentiate between the 32-bit and 64-bit ObjectID definitions? LongObjectID for 64-bit?
It may complicate things a bit. A simple 32- or 64-bit Group number would probably be a good start (if we have only one, I would prefer 64-bit).
Here's an update to the proposed standard extrabytes.
And here's another update including some of the feedback I got this summer at JALBTCX, adding the horizontal and vertical uncertainty fields.
The "Range", which is "defined as the three-dimensional distance from the sensor to the point; the range is useful for multiple computations such as intensity attenuation and measurement bias", is suggested to be of data type float. I vehemently oppose that. The data type should be an unsigned integer (or even just an unsigned short) with a scale similar to that of the LAS points (or less precise) and an offset of zero.
Are the tuples and triples finally deprecated? I'd like to completely remove them from LASlib. They were never properly supported and I've never seen them used anywhere.
I suggest we start with one, two, or three standardizations that are reasonably simple. My votes go to: ...
I have never encountered them being used. Maybe Howard (Butler) is using them for something? I think he was the one who advocated for these structures.
Completely agree. In addition, all distance units in the file should (we would say "must" in the spec) be in the vertical units of the Spatial Reference System of the file. I say vertical units because, in the USA, there are still some "official" SRS with horizontal in feet (INT, Survey?) and vertical in meters.
LAS abdicates responsibility for the coordinate system by handing it off to WKT. I disagree that the specification should get involved here, because the spec and the SRS are inevitably going to get into conflict. LAS should investigate requiring OGC WKT2 in a future revision. WKT2 handles more situations and is more complete. See https://gdalbarn.com/ for some discussion related to the GDAL project on the topic (thanks for the contribution @lgraham-geocue!)
Triplets are common in graphics scenarios, and I proposed them thinking they would be well aligned with LAS. They aren't, and they introduce as many problems as they might solve. Little software produces or consumes them. They should be dropped. No one will miss their removal.
@rapidlasso Tuples and triples will be officially dropped with the next revision (#1). I agree that range could be confusing because of potential desynchronization with the SRS units, but I believe that fixing it at meters and leaving it with the points prevents its loss when the trajectory files inevitably get lost. We hard-code units for the angles (in degrees), so I don't see why we can't do this with Range. Software can easily change the units displayed while leaving the units stored untouched. You've persuaded me that starting with a small handful is a good idea, and I like Martin's list. I'm tempted to add the topobathy-related ones, but perhaps that's better left in the LDP?
To be LAS 1.4, PDRF 6 compliant, a LAS file must have the vertical units encoded in the WKT. USGS has been rejecting data that do not have the units tag properly set.
We are inviting a space-probe-collision error by allowing mixed, unlabeled linear units, especially for states that do not use meters for anything.
@esilvia "Tuples and triples will be officially dropped with the next revision". Happy to hear that. I just kicked them out of LASlib last week ... (-:
@lgraham-geocue I disagree. A range is, similar to the scan angle, something measured by a scientific measurement instrument and should follow international standards. I could see how your argument could apply to "height above ground", but even here I'm leaning toward always making the measurement unit part of the standardized "extra bytes" because (1) the CRS often gets stripped, (2) reprojecting coordinates from a feet-based CRS to a meter-based CRS (or vice versa) without rescaling the "extra bytes" leads to wrong ranges / heights above ground, and (3) the best choices of scale and offset change when we go from meters to feet. A scale factor of 0.01 may be good when measuring the range or the height above ground for an airborne scan in meters, but it is overly precise for feet. We will open a whole can of worms of "extra bytes" that do not have the correct unit, or where we do not know the correct unit, if we let the vertical unit of the CRS decide this.
@rapidlasso You make a strong point regarding the scale/offset also being unit-dependent. I think that's also a strong argument in favor of fixing the units. @lgraham-geocue observed that the horizontal and vertical units can be different in LAS files, which is something I've also observed to my chagrin. Since Range is a 3D measurement, it could get very, very weird if the vertical units are meters and the horizontal units are feet. I think this is another argument in favor of fixing the units at meters. I can be persuaded that the height-from-ground will match the vertical units of the LAS file. Simple, and I think it's what people would expect when they receive data. So here's the plan: I'm going to publish the following "Standard" extrabytes as a GitHub wiki page (https://github.com/ASPRSorg/LAS/wiki/Standard-ExtraByte-Definitions):
All of these Standard ExtraBytes will be assigned an integer value (ID) that can be assigned to the first two bytes of the ExtraByte definition structure (currently Reserved). It's a little longer than Martin's list but I think it captures the ones I've seen in the wild. I didn't get any feedback on incorporating the ExtraByte definitions from the topobathy LDP, so I decided to include the ones that I've seen most often. Rather than include these definitions in the specification itself, I'll update the ExtraByte VLR description in the specification with a link to the wiki page and claim the two Reserved bytes for the ID field, which must be 0 unless it adheres to one of the definitions on the wiki page. All of these changes will be included with the R14 revision, which I plan to submit to ASPRS in the next week or two. Last chance to comment. @rapidlasso @lgraham-geocue @csevcik01 @hobu @anayegandhi @jdnimetz
Do the TPU values require error increments of femtometers close to zero that drop exponentially to increments of decimeters close to one million? Then a float32 representation is suitable. If the increments with which the error is to be expressed should be a constant centimeter or millimeter throughout the entire error value range, then an unsigned integer scaled by 0.01 or 0.001 is the correct approach.
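That trade-off is easy to demonstrate in a few lines of Python (helper names are mine; `f32` just round-trips a value through IEEE-754 binary32):

```python
import struct

def f32(x: float) -> float:
    """Round-trip through IEEE-754 binary32 to expose its quantization."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def to_fixed(value: float, scale: float = 0.001) -> int:
    """Fixed-point encoding: unsigned integer, scale 0.001, offset 0."""
    return round(value / scale)

def from_fixed(raw: int, scale: float = 0.001) -> float:
    return raw * scale

# float32 spacing grows with magnitude: roughly 0.24 mm near 3,000 but
# about 62 mm near 1,000,000, whereas the scaled integer keeps a constant
# 1 mm step across its whole range.
r = 3000.1234
assert abs(from_fixed(to_fixed(r)) - r) <= 0.0005  # always within half a mm
assert abs(f32(r) - r) < 0.001                     # float32 still fine here
```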
The range of plausible vertical uncertainty values, considering the range of possible data sources, is probably meters to millimeters. Adding one order of magnitude in either direction gives tens of meters to tenths of millimeters. If we use a scale factor, where is the scale factor stored? Is it the same as for the X Y Z coordinates?
The "scale factor" is a core part of the "extra bytes" definition. I recently published a little blog post on how to use txt2las (which is open source) to map the information stored in these kinds of ASCII lines of LiDAR information to the LAS format, and you can see examples there with different numbers of decimal digits being used: 1, 290.243, 28.663, -11.787, 0.060, -0.052, 0.997, 517.3170, -58.6934, 313.0817, 52
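As a rough sketch of that mapping (a hypothetical helper, not the actual txt2las implementation), the number of decimal digits in each ASCII column suggests a scale factor:

```python
# The sample line quoted above, split into columns.
line = "1, 290.243, 28.663, -11.787, 0.060, -0.052, 0.997, 517.3170, -58.6934, 313.0817, 52"
fields = [f.strip() for f in line.split(",")]

def scale_for(text: str) -> float:
    """Scale factor matching the decimal digits of an ASCII value."""
    decimals = len(text.split(".")[1]) if "." in text else 0
    return 10.0 ** -decimals

assert abs(scale_for(fields[1]) - 0.001) < 1e-12   # "290.243"  -> 0.001
assert abs(scale_for(fields[7]) - 0.0001) < 1e-12  # "517.3170" -> 0.0001
assert scale_for(fields[0]) == 1.0                 # "1"        -> 1
```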
The "beam ID" seems a rather easy first candidate for standardization. Clearly there is a need, and clearly users already store this information in "extra bytes", for example as "Velodyne Rings". In this blog post I describe how to copy the beam ID from the "point source ID" field or from the "user data" field into a new "extra bytes" attribute with two calls to las2las, namely ...
Hi there, checking in to see if there is now a ~standardized way to store surface normals within the extra bytes fields of a las/laz file. There have been discussions on this thread regarding the way to store normals (3 float xyz coords vs. a more efficient way to store a unit vector), but nothing regarding a commonly agreed way to name these attributes, e.g. ... PS: I read that the type 22 tuples and triples have been deprecated, so the surface normal coordinates should be stored as individual fields. PDAL defined the ...
After digging through a bit, here is what I found for normals naming conventions for las files: ...
PS: if a dedicated github issue should be opened specifically for the normal coords naming convention, please do tell me!
The normals naming convention that we've adopted internally is ...
Thanks for clarifying, I did not know ...
Sorry, just to clarify: when I wrote "we've adopted..." I wasn't commenting on behalf of ASPRS or the LAS committee. That's just the convention we've ended up adopting within my company. For us that convention isn't specific to LAS either; we needed a general convention to follow for data formats/applications that don't have their own definition.
Whoops, sorry about the confusion, and thanks for the contribution! Awaiting feedback from the LAS committee then.
As you're finding out in PDAL PDAL/PDAL#4144 and maybe other places, I don't think this exists. We chose the NormalX composition because PDAL uses strings to define dimension types and it seemed both explicit enough and clean enough without spaces.
I think most of the names came from various data files that we encountered in the wild.
A while ago I became aware of a single-field convention being used in TerraSolid for this purpose... if I remember correctly they packed the normal_x/y/z values into a 64-bit or 32-bit integer ExtraByte. It'll take me a while to dig that out of my notes if it's of interest. Beyond that, the waveform PDRFs do include a directional vector that I've seen leveraged as a surface normal, but I wouldn't recommend it.
Just found my notes. In TerraSolid they store the Normal Vector ExtraByte as a 32-bit integer: 2 bits for the "model type" and then 15 bits for each polar angle. I don't have more detailed notes than that, but maybe someone has a contact at TerraSolid who can provide more detail? Related: #1 (comment)
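Without TerraSolid's documentation at hand, here is one plausible reading of that layout in Python; the field order and angle conventions are guesses on my part, not their documented format:

```python
import math

def pack_normal(nx: float, ny: float, nz: float, model_type: int = 0) -> int:
    """Pack a unit normal into 32 bits: 2-bit model type + two 15-bit angles."""
    azimuth = math.atan2(ny, nx) % (2.0 * math.pi)        # [0, 2*pi)
    inclination = math.acos(max(-1.0, min(1.0, nz)))      # [0, pi]
    a = round(azimuth / (2.0 * math.pi) * 32767) & 0x7FFF
    i = round(inclination / math.pi * 32767) & 0x7FFF
    return (model_type & 0x3) << 30 | a << 15 | i

def unpack_normal(packed: int):
    """Recover the (approximate) unit normal from the packed angles."""
    a = (packed >> 15) & 0x7FFF
    i = packed & 0x7FFF
    azimuth = a / 32767 * 2.0 * math.pi
    inclination = i / 32767 * math.pi
    s = math.sin(inclination)
    return (s * math.cos(azimuth), s * math.sin(azimuth), math.cos(inclination))
```

Two angles at 15 bits each give roughly 1e-4 rad resolution, so round-tripping a normal through pack/unpack is accurate to about that level.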
One final comment... I found an offline discussion thread between me, Martin, and the folks at AgiSoft about a method of encoding that may resemble their final implementation of normal vectors. I'm uploading it here now that we have a more public way to have these discussions. Beginning a new discussion thread on normal vectors may be helpful for posterity.
Hello Evon, thanks for sharing our discussion. We decided to use lower case names with spaces for normal vector extrabytes because this convention was used in other places we could find. It seems a good idea to keep a consistent convention for extrabyte names. The LAS 1.4 R15 specification includes several examples of extrabyte names using the lower case naming convention with spaces. It also links to the official LAS wiki's Standard ExtraByte Definitions list, which has even more examples, all using the same lower case naming convention. That makes me believe it is an accepted naming convention for LAS extrabytes. If the lower case naming convention with spaces is indeed preferred for LAS extrabytes, then maybe adding support for this convention in PDAL is a better solution. Please correct me if I am wrong and the camel case convention should be preferred instead (does it need to be reflected in the LAS specification then?), or if there are any good reasons to ignore consistency in this case. Dmitry
PDAL supports names with spaces, but since it supports so many other formats, it discourages them and normalizes them out.
The specification doesn't say and I think it is up to implementers to do what they want. That said, commonly agreed upon dimension names that mean specific things (like nx) would be a welcome community resource. We have https://github.com/PDAL/PDAL/blob/master/pdal/Dimension.json for PDAL but it is just a list more than an attempt at getting anyone to agree on name + meaning. |
Thanks Dmitry @agisoft-metashape ! I've been following the lowercase-with-spaces-allowed convention for all of the standardized ExtraBytes in the wiki as a way of normalizing them, and I intend to continue that trend unless I hear a compelling reason to abandon it. Nevertheless, as @hobu notes the specification itself is silent on a preferred naming convention.
Thank you @esilvia and @hobu for your comments. We are going to add support for PDAL-style names ('NormalX', etc.) on import in the next Metashape version. For export we will probably keep using our current lower case names until there is a consensus on the better solution. @hobu PDAL 2.3.0 indeed normalized the 'normal x' name to 'NormalX' on input, but that is no longer the case with PDAL 2.5.5. If such normalization is indeed intended, maybe it can be added to PDAL again? That would probably solve the reported compatibility issue.
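For what it's worth, converting between the two conventions is mechanical; here is a hypothetical round-trip helper (mine, not PDAL's or Metashape's actual code):

```python
import re

def to_camel(name: str) -> str:
    """'normal x' -> 'NormalX' (PDAL-style, no spaces)."""
    return "".join(part.capitalize() for part in name.split())

def to_spaced(name: str) -> str:
    """'NormalX' -> 'normal x' (wiki-style, lower case with spaces)."""
    return re.sub(r"(?<!^)([A-Z])", r" \1", name).lower()

assert to_camel("normal x") == "NormalX"
assert to_spaced("NormalX") == "normal x"
```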
Spaces in names is yuck. Other than that, I have no opinion.
Thanks Evon for sharing this conversation.
I'm at JALBTCX this week discussing the LDP v2 that was developed in #117 and approaching publication. This includes updates to the ExtraBytes being used, and I'm trying to decide whether or not I should rename the ExtraByte fields in the v2 upgrade. Can @abellgithub or anyone elaborate on "spaces in names is yuck"? I get that the dev community has historically avoided spaces like the plague (windows path errors, anyone?), but any sane language or library should be able to handle spaces in a char array, especially one that interacts with LAS which explicitly specifies that its char arrays are null-terminated. Why would removing spaces be a need here? Are folks trying to use ExtraByte names as variable names or key values? Or is this not a need I should attempt to support?
My main reason to avoid spaces is that they mean you have to remember to quote or escape them any time you might want to use that name in a command line interface.
Kirk
@kjwaters OH that makes total sense. I'll propose that change in the Bathy Working Group meeting tonight, if I get a chance.
We've discussed listing some standardized extrabytes either in the specification itself or a supplementary document. This would encourage their adoption by the community and formalize common extrabytes as a guideline to future implementations.
We need to figure out the following:
- data_type
- ... be formalized?

Below is a link to what I think is a decent start to the standardized extrabytes. Once we get some agreement on a few of these I can start building a wiki page or contribute to Martin's pull request. Which one we do depends on the answer to the 4th question.
Standard ExtraBytes v1.docx