You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
One of the nice things about SigMF is that it stores the signal metadata in a human readable format. I feel like the dataset format is an exception. For someone who is used to the format, I'm sure that it is simple to look at a string like "ru16_be" and parse the meaning. But for a novice user, they really need a decoder ring. In my opinion, this adds unnecessary friction to adoption of the format.
I would propose that the dataset format be expressed in plain English. I don't have strong opinions on what the specifics of it are, but I personally like something like:
'dataset-format' = {
'space': 'complex',
'type': 'signed',
'bit-width': 16
'endian': 'little'
}
This does take up more space, in the file, but I think it would be much clearer for users. Using a format like this will also make it easier in the future if new types are added. For example, if we wanted to add 16-bit floating point support ( a format commonly used in GPU processing), there are multiple standards, IEEE-754 and bfloat16. This would be difficult to specify with a single character (as is done now), but very easy to express in plain English using the "type" field above or something similar.
The text was updated successfully, but these errors were encountered:
I think this is a fair point. This suggestion would also make it possible to quickly check individual aspects of the datatype without parsing the entire string (although I'm not sure if this is actually useful).
I'm on-board with doing this, but am not certain about timing. My gut, here, is that we mark this for post-v1.x. We cannot remove the existing approach without breaking backwards compat, which means this would have to get added in parallel, at which point we've made it possible to create two critical fields that disagree with each other unless we mark the existing method as deprecated, which is a huge shift for existing applications. Short of it is this is too messy, IMO.
I'm going to mark this for v2.x - open to disagreements / opinions otherwise.
The eventual plan would be to deprecate the current way of expressing type, which means that both formats will have to be supported for a time. I don't think there is a need to have this in place for v1.0, but I wouldn't want to wait until 2.0 to use it at all. I would hope that it could be added in sometime in the 1.0 series as an option and then in v2.0 we deprecate the current datatype format.
When it is added, I would requre a metadata file to use either the old way or the new way, but never allow both (which is easy to do in json schema). So there would never be a case of datatype disagreement. (If you meant some else by "two critical fields that disagree with each other" let me know)
One of the nice things about SigMF is that it stores the signal metadata in a human readable format. I feel like the dataset format is an exception. For someone who is used to the format, I'm sure that it is simple to look at a string like "ru16_be" and parse the meaning. But for a novice user, they really need a decoder ring. In my opinion, this adds unnecessary friction to adoption of the format.
I would propose that the dataset format be expressed in plain English. I don't have strong opinions on what the specifics of it are, but I personally like something like:
'dataset-format' = {
'space': 'complex',
'type': 'signed',
'bit-width': 16
'endian': 'little'
}
This does take up more space, in the file, but I think it would be much clearer for users. Using a format like this will also make it easier in the future if new types are added. For example, if we wanted to add 16-bit floating point support ( a format commonly used in GPU processing), there are multiple standards, IEEE-754 and bfloat16. This would be difficult to specify with a single character (as is done now), but very easy to express in plain English using the "type" field above or something similar.
The text was updated successfully, but these errors were encountered: