Made writing of data output continuous #4

Open
wants to merge 1 commit into master

Conversation

alexfrostdesu
Contributor

Now, if --data-output-enabled is set, we write the csv file line by line.

This is actually a very crude version that assumes the first row found will be the fullest one. It will also produce gibberish on any incomplete row. The trouble is that knowing the full column set up front is impossible when going row by row, so we need a way to safeguard against incomplete data, missing joints, and missing animals.

It also repeatedly opens and closes the actual csv file, which is not ideal for performance. Again, when we are dealing with an endless loop, `with open(file)` is pretty much useless.

Maybe we need to rethink this idea somehow?
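
For reference, a minimal sketch of what a persistent writer could look like, with the loop inside a single `with` block (`frame_rows` is a stand-in for whatever yields one row per frame, not the project's actual API):

```python
# Sketch, not the actual implementation: keep one file handle and csv.writer
# alive for the whole analysis loop instead of reopening the file per frame.
import csv

def stream_output(path, header, frame_rows, delimiter=";"):
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, delimiter=delimiter)
        writer.writerow(header)
        for row in frame_rows:
            writer.writerow(row)
            f.flush()  # keep the on-disk file current even if the loop never ends
```

The `with` block then spans the whole loop, so the file is still closed cleanly on an exception without any per-frame open/close.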

@JensBlack
Member

What's the issue with missing data?

@alexfrostdesu
Contributor Author

> What's the issue with missing data?

Okay, so to write a csv file we need to know how many columns there will be from the very first row. We cannot add new columns as the analysis progresses; that would require restructuring the whole file. So the first row would have to come with the maximum number of skeletons and joints, which is obviously not always the case.

We don't have this problem when writing the csv after the analysis is done: the union of all rows gives us the full set of columns. The first row can be practically empty and the table can be filled in later.

Furthermore, even assuming our first row came in the fullest, we can still end up with missing data. If, for example, our full header looks like this:
Animal1_nose_x;Animal1_nose_y;Animal1_neck_x;Animal1_neck_y;Animal1_tailroot_x;Animal1_tailroot_y;
and one frame came later with only these joints:
Animal1_nose_x;Animal1_nose_y;Animal1_tailroot_x;Animal1_tailroot_y;
when writing the csv post-analysis this is no problem; it will be placed in the table accordingly, like this:
x1;y1;None;None;x3;y3
but when writing line by line we can't write it any other way than:
x1;y1;x3;y3;
which shifts the remaining values into the wrong columns and obviously loses us data.
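
If the full header were known up front, the alignment itself would be straightforward; a minimal sketch (hypothetical `align_row` helper, assuming each frame arrives as a dict of the joints that were found):

```python
# Sketch: align a possibly incomplete frame against a fixed header, so missing
# joints become explicit placeholders instead of shifting the later columns.
HEADER = ["Animal1_nose_x", "Animal1_nose_y",
          "Animal1_neck_x", "Animal1_neck_y",
          "Animal1_tailroot_x", "Animal1_tailroot_y"]

def align_row(frame):
    """frame: dict mapping column name -> coordinate for the joints found."""
    return [frame.get(col, "None") for col in HEADER]

# A frame missing the neck joint:
frame = {"Animal1_nose_x": "x1", "Animal1_nose_y": "y1",
         "Animal1_tailroot_x": "x3", "Animal1_tailroot_y": "y3"}
print(align_row(frame))  # ['x1', 'y1', 'None', 'None', 'x3', 'y3']
```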

@alexfrostdesu
Contributor Author

Okay, I can kind of get around it for DLC, since it has all the joint names in config.yaml, but not really for other networks.
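
For the DLC case, something like this would do (a sketch; assumes the standard bodyparts key of a single-animal DLC config.yaml):

```python
# Sketch: read the full joint list up front from a DLC-style config.yaml,
# assuming the standard "bodyparts" key of single-animal DLC projects.
import yaml

def bodyparts_from_config(config_path):
    with open(config_path) as f:
        cfg = yaml.safe_load(f)
    return cfg["bodyparts"]
```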

@JensBlack
Member

Okay, now I understand.

Sidenote:
With our current handling it is impossible to get:
x1, y1, None, None, x3, y3
we would always get:
None, None, None, None

Anyway:
Skeleton is a dictionary with keys for each bodypart (found in the config.yaml or from ALL_BODYPARTS), so we already know all possible columns from the start (loading the network). Even if not, "None" is not reported by DLC's pose estimation (it always returns the coordinates of the highest probability), we only get it from the thresholding and calculate_skeleton steps. We could take the initial number of bodyparts from the first get_pose call. The same goes for DLC-LIVE and DeepPoseKit (afaik). Sleap returns NaN if it can't find the body part, but always returns a full skeleton (all bodyparts as keys present)

Even if we have no previous bodypart names, the autonaming in calculate_skeleton returns named bodyparts (bp1, bp2, etc.). So we should always get a full set of column names.

If it's about empty entries ("x1;y1;;;x3;y3"), we should consider using a placeholder. What do you think?
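
A sketch of that approach (helper names are illustrative, not the existing code): build the header once from the skeleton keys, then pad each frame with the placeholder:

```python
# Sketch: build the header once from the skeleton dict's bodypart keys,
# then pad any frame that lacks a bodypart with a placeholder value.
PLACEHOLDER = "NaN"

def header_from_skeleton(skeleton):
    # skeleton: dict of bodypart -> (x, y); keys come from config.yaml,
    # ALL_BODYPARTS, or the autonaming (bp1, bp2, ...) in calculate_skeleton
    header = []
    for bp in skeleton:
        header += [f"{bp}_x", f"{bp}_y"]
    return header

def row_from_skeleton(skeleton, header):
    values = {}
    for bp, (x, y) in skeleton.items():
        values[f"{bp}_x"], values[f"{bp}_y"] = x, y
    return [values.get(col, PLACEHOLDER) for col in header]
```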

@JensBlack
Member

Suggestion: Save data in chunks (what size?) to reduce the open/close hell.
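
For example, a buffered writer along these lines (a sketch; the chunk size is a guess that would need profiling):

```python
# Sketch of chunked output: buffer rows in memory and write them in batches,
# so the file is touched once per chunk instead of once per frame.
import csv

CHUNK_SIZE = 100  # placeholder value; would need profiling

class ChunkedCsvWriter:
    def __init__(self, path, header, delimiter=";"):
        self._file = open(path, "w", newline="")
        self._writer = csv.writer(self._file, delimiter=delimiter)
        self._writer.writerow(header)
        self._buffer = []

    def write_row(self, row):
        self._buffer.append(row)
        if len(self._buffer) >= CHUNK_SIZE:
            self.flush()

    def flush(self):
        self._writer.writerows(self._buffer)
        self._buffer.clear()
        self._file.flush()

    def close(self):
        self.flush()
        self._file.close()
```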

Member

@JensBlack left a comment


Change location of CSV_DELIMITER to advanced settings (keep ";" as default)
