Extensible dataset #5

Open
pwm1234 opened this issue Oct 7, 2015 · 5 comments

Comments

@pwm1234
Collaborator

pwm1234 commented Oct 7, 2015

Much of my work involves processing time samples, so I need to be able to save data with an unlimited axis. Do you have thoughts or suggestions on adding this capability to eigen3-hdf5? If I get something reasonable would you want a pull request or do you want to keep your code free of this additional complexity?

@garrison
Owner

garrison commented Oct 7, 2015

What do you mean by unlimited axis? That the matrix (or vector) dimension will grow over time, and cannot be specified beforehand?

@pwm1234
Collaborator Author

pwm1234 commented Oct 7, 2015

Yes; its maximum dimension will be unlimited. I believe the correct HDF5 term is extendible dataset. The basic tutorial, "Creating an Extendible Dataset", says:

An unlimited dimension dataspace is specified with the H5Screate_simple / h5screate_simple_f call, by passing in H5S_UNLIMITED as an element of the maxdims array.

What I need is a dataset where each row represents the set of measurements for a slice in time, and I do not know in advance how many rows I will need. So the first dimension will be unlimited. For example, a typical IMU measurement consists of 6 doubles, so I would store each measurement as a row in a 7-column matrix [time, ax, ay, az, gx, gy, gz].
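For reference, a minimal sketch of that layout with the plain HDF5 C++ API (the file name, dataset name, and chunk size are illustrative assumptions, and this is not eigen3-hdf5 code):

```cpp
#include <H5Cpp.h>

int main() {
    H5::H5File file("imu.h5", H5F_ACC_TRUNC);

    // Current extent: 0 rows of 7 columns; the first dimension may grow without bound.
    hsize_t dims[2]    = {0, 7};
    hsize_t maxdims[2] = {H5S_UNLIMITED, 7};
    H5::DataSpace space(2, dims, maxdims);

    // An extendible dataset must be chunked; 1024 rows per chunk is an arbitrary choice.
    H5::DSetCreatPropList props;
    hsize_t chunk[2] = {1024, 7};
    props.setChunk(2, chunk);

    file.createDataSet("imu", H5::PredType::NATIVE_DOUBLE, space, props);
    return 0;
}
```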

@garrison
Owner

I think the most useful thing would be the ability to save an Eigen matrix to some subportion of an HDF5 array on disk. That way, you could keep only the recent samples in memory, adjust the size of the HDF5 array, and write the recent samples to disk.

@pwm1234
Collaborator Author

pwm1234 commented Oct 16, 2015

But that is the point of an extendible HDF5 dataset: the library does all of this for you. For example, if I have a Vector3d that I want to write out as time marches on, I have a dataset that is UNLIMITEDx3 and write one vector at a time; the library takes care of adjusting the actual size on disk. If you do this with fixed-size datasets, you have to overallocate and use a fill value, write each row, expand whenever you reach the current size, copy to a new dataset if you exceed the maximum size, and shrink to the actual size when you are finished.

I will play with it and give you a pull request when I have something that works for me.
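Roughly, the per-sample append with the plain HDF5 C++ API might look like this (a sketch only; the function name, row counter, and dataset handle are assumptions, not eigen3-hdf5 API):

```cpp
#include <H5Cpp.h>
#include <Eigen/Dense>

// Append one Eigen::Vector3d as the next row of an UNLIMITEDx3 dataset.
// nrows is the number of rows already written.
void append_row(H5::DataSet &dset, const Eigen::Vector3d &v, hsize_t nrows)
{
    // 1. Grow the dataset by one row.
    hsize_t newdims[2] = {nrows + 1, 3};
    dset.extend(newdims);

    // 2. Select the freshly added row in the file and write the vector into it.
    H5::DataSpace filespace = dset.getSpace();
    hsize_t offset[2] = {nrows, 0};
    hsize_t count[2]  = {1, 3};
    filespace.selectHyperslab(H5S_SELECT_SET, count, offset);

    H5::DataSpace memspace(2, count);
    dset.write(v.data(), H5::PredType::NATIVE_DOUBLE, memspace, filespace);
}
```

This assumes the dataset was created chunked with an unlimited first dimension, as in the creation sketch above.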

@garrison
Owner

The key missing functionality, though, is the ability to save an Eigen array to a sub-portion of a dataset. With this, it would take just two calls to do what you are describing--one to extend the dataset, and the other to write the data. Once these calls exist (and the second call likely already exists in the HDF5 library), it is easy to make a wrapper function that does both at once.
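A hypothetical shape for that pair of calls plus the wrapper, again just a sketch with the HDF5 C++ API (write_block and append_rows are made-up names, not anything in eigen3-hdf5):

```cpp
#include <H5Cpp.h>
#include <Eigen/Dense>

// Sub-portion write: copy an Eigen expression into the dataset starting at file row row0.
template <typename Derived>
void write_block(H5::DataSet &dset, hsize_t row0, const Eigen::MatrixBase<Derived> &m)
{
    using RowMajor = Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor>;
    RowMajor buf = m.template cast<double>();   // HDF5 expects contiguous row-major doubles here

    hsize_t count[2]  = {static_cast<hsize_t>(buf.rows()), static_cast<hsize_t>(buf.cols())};
    hsize_t offset[2] = {row0, 0};

    H5::DataSpace filespace = dset.getSpace();
    filespace.selectHyperslab(H5S_SELECT_SET, count, offset);
    H5::DataSpace memspace(2, count);
    dset.write(buf.data(), H5::PredType::NATIVE_DOUBLE, memspace, filespace);
}

// Wrapper doing both at once: extend by m.rows() rows, then write the new rows.
template <typename Derived>
void append_rows(H5::DataSet &dset, hsize_t nrows, const Eigen::MatrixBase<Derived> &m)
{
    hsize_t newdims[2] = {nrows + static_cast<hsize_t>(m.rows()),
                          static_cast<hsize_t>(m.cols())};
    dset.extend(newdims);
    write_block(dset, nrows, m);
}
```

With something along these lines, the per-sample loop in the previous comment collapses to one append_rows call per batch of samples.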
