npTDMS drops datatypes on properties, which makes rewritten files (sometimes) different from originals #269

Open
drillabit opened this issue Apr 14, 2022 · 4 comments

Comments

@drillabit

Software that checks or relies on the datatype entries of properties cannot work with TDMS files that were copied (or patched, which is what I do) with npTDMS. While reading, npTDMS drops the type information stored with the properties in the file, so the dictionaries it creates no longer contain this information.
Instead, when writing, the type is deduced from the value in the dictionary. So a number 10, stored as a UInt64 in the input file, will come out as an Int32 in the output file.

For my purposes I fixed that in tdms_segment.py by changing line 620 in read_property() from
return prop_name, value
to
return prop_name, prop_data_type(value)
This works for me, as the writer already uses types if they are provided. I did not encounter any immediate problems within npTDMS, but, of course, code that works with the properties in the dictionaries would then have to unwrap the raw values from the type class instances.
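
To make the trade-off concrete, here is a minimal, self-contained sketch of the behaviour described above. It does not use npTDMS: the `Int32` and `UInt64` classes, `deduce_type`, and the `keep_type` flag are all hypothetical stand-ins, and the deduction rule is simplified to the single case mentioned in this issue.

```python
# Stand-in classes for illustration only; these are NOT the npTDMS implementations.
class Int32:
    def __init__(self, value):
        self.value = value


class UInt64:
    def __init__(self, value):
        self.value = value


def deduce_type(value):
    """Hypothetical writer-side rule: keep an explicit type, else guess one."""
    if isinstance(value, (Int32, UInt64)):
        return type(value)  # a typed value was provided: use its type as-is
    if isinstance(value, int):
        return Int32  # plain int: the original width and signedness are unknown
    raise TypeError(f"unsupported property value: {value!r}")


def read_property(name, raw_value, data_type, keep_type=False):
    """Hypothetical reader: return the bare value, or the typed wrapper."""
    return name, (data_type(raw_value) if keep_type else raw_value)


# Untyped read: only the bare value survives, so the writer guesses Int32.
_, plain = read_property("count", 10, UInt64)
plain_type = deduce_type(plain)

# Typed read: the wrapper survives, so the writer sees UInt64.
_, typed = read_property("count", 10, UInt64, keep_type=True)
typed_type = deduce_type(typed)

# The cost: callers must unwrap with .value instead of using the value directly.
raw = typed.value
```

With the untyped read, `plain_type` is `Int32`; with the typed read, `typed_type` is `UInt64`, but every consumer of the dictionary has to go through `.value`.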

@adamreeve
Owner

Hi @drillabit, thanks for opening this issue.

Yeah, I don't think we'd want to make that exact change. As you say, it would affect anyone wanting to access the property values and would be a fairly big breaking change.

But it should be possible to add the type information to the properties dictionary in a non-breaking way, so that if you pass the properties dictionary (or its containing channel/group) to TdmsWriter.write_segment, it can make use of the type information. E.g. I'm imagining that, instead of properties being a plain dictionary, we could use a subclass of dict that adds a tdms_types property, which is a dictionary mapping property names to their types.
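
A rough sketch of what such a dict subclass could look like, purely as an illustration of the idea. `TypedProperties` and `type_for_writing` are hypothetical names, types are represented as plain strings for simplicity, and the fallback rule only covers the integer case from this issue; none of this is existing npTDMS API.

```python
class TypedProperties(dict):
    """Hypothetical dict subclass: plain values plus a side table of types."""

    def __init__(self, *args, tdms_types=None, **kwargs):
        super().__init__(*args, **kwargs)
        # Maps property name -> type name as read from the file.
        self.tdms_types = dict(tdms_types or {})


def type_for_writing(properties, name):
    """Hypothetical writer helper: prefer the recorded type, else deduce one."""
    recorded = getattr(properties, "tdms_types", {})
    if name in recorded:
        return recorded[name]
    if isinstance(properties[name], int):
        return "Int32"  # simplified fallback for plain dictionaries
    raise TypeError(f"cannot deduce a type for property {name!r}")


props = TypedProperties({"count": 10}, tdms_types={"count": "UInt64"})

# Existing code keeps working: values are still plain Python objects.
value = props["count"] + 1

# A writer that knows about tdms_types can round-trip the original type.
written_as = type_for_writing(props, "count")

# A plain dict has no tdms_types, so the writer falls back to deduction.
fallback = type_for_writing({"count": 10}, "count")
```

Because `TypedProperties` is still a `dict`, code that only reads values is unaffected, which is what makes this variant non-breaking.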

@drillabit
Author

Thanks for the quick response @adamreeve! I agree that there should be some way to pass the types from the reader to the writer, so that reading and writing does not alter the file. It does not have to be via typed values. That was just the easiest way to fix it, but not the most backwards-compatible one. Nevertheless, I think it is quite elegant. So I would also consider an additional option to the read method to switch to typed values if required.

@adamreeve
Owner

adamreeve commented Apr 18, 2022

Yes, that could be nice as it would make the typed properties more easily accessible to users, although it means users would need to opt in to that behavior if they want the property types to round-trip correctly when writing them out to another file. Another alternative would be for TDMS objects to have a separate typed_properties property that is a dictionary of the typed properties, and which the writer could use if present. Would that also work for you? It could just be a bit fiddly making sure the typed and untyped properties are kept in sync when they're modified.
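
One possible way to handle the sync problem, sketched under assumptions: keep the typed dictionary as the single source of truth and expose the untyped properties as a view over it. `SyncedProperties` is a hypothetical name, types are plain strings, and a type of `None` is assumed to mean "let the writer deduce it". Building on `MutableMapping` rather than `dict` means `update()`, `pop()` and friends all go through the same three methods.

```python
from collections.abc import MutableMapping


class SyncedProperties(MutableMapping):
    """Hypothetical untyped view that keeps a typed dictionary in step."""

    def __init__(self, typed_properties):
        # typed_properties maps name -> (type_name, value) and holds the real data.
        self._typed = typed_properties

    def __getitem__(self, name):
        return self._typed[name][1]

    def __setitem__(self, name, value):
        # Reuse the recorded type when the property already exists;
        # None means "let the writer deduce a type".
        type_name = self._typed[name][0] if name in self._typed else None
        self._typed[name] = (type_name, value)

    def __delitem__(self, name):
        del self._typed[name]

    def __iter__(self):
        return iter(self._typed)

    def __len__(self):
        return len(self._typed)


typed_properties = {"count": ("UInt64", 10)}
properties = SyncedProperties(typed_properties)

properties["count"] = 11         # recorded type kept, value updated
properties.update(comment="ok")  # update() is routed through __setitem__
snapshot = dict(typed_properties)

del properties["count"]          # removed from both views at once
```

This sketch does not check that a new value still fits the recorded type (for example a negative number assigned to an unsigned property); a real implementation would need to decide whether to reject that or drop the recorded type.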

@drillabit
Author

The typed_properties extra dictionary would work for me! Let's go for it!
