testing pretrained model - depth #15

I'm using the pretrained model in the "rgb2depth" folder and want to reproduce "loss = 0.35". Which data should I use?
I've tried the "depth_zbuffer" test data, but "l1_loss, loss_g, loss_d_real, loss_d_fake" come out around "0.6, 0.7, 0.9, 0.7".
I suppose I used the wrong data or the wrong loss... should I use the "depth_euclidean" data instead?
Thank you!

Comments
@alexsax could you please comment on which models we used for testing?
Also, did you use a mask to mask out depth values that are "bad"? Basically we extracted depth values from the mesh, and since there are holes in the mesh, some depth values in the ground truth are too high (we masked these values out during training).
@b0ku1 that's a good point. That might explain the relatively high losses. @emergencyd we reported the L1 loss, so that's the one you should pay attention to. One contributing factor might be that the models we released were trained on an internal set of images that were processed a bit differently than the released data. The internal set always has a FoV of 75 degrees, but the released data has a range of 45-75 degrees. The pretrained networks don't work as well on images with a narrow FoV, like those in the released set. You can verify this for yourself on the Taskonomy demo site. @emergencyd do you notice that the losses are significantly better for large-FoV images?
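One way to check is to bucket the per-image L1 losses by FoV and compare the group means. A minimal numpy sketch (the function name, the 65-degree cutoff, and the input arrays are hypothetical, not from the repo):

```python
import numpy as np

def compare_l1_by_fov(per_image_l1, per_image_fov_deg, cutoff_deg=65.0):
    """Split per-image L1 losses into narrow- and wide-FoV groups.

    `cutoff_deg` is an arbitrary split point for illustration, not a value
    from the paper or the released code.
    """
    l1 = np.asarray(per_image_l1, dtype=np.float32)
    fov = np.asarray(per_image_fov_deg, dtype=np.float32)
    wide = fov >= cutoff_deg
    return {
        "wide_fov_mean_l1": float(l1[wide].mean()),
        "narrow_fov_mean_l1": float(l1[~wide].mean()),
    }
```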
@emergencyd changing to rgb-large won't change the FoV (field of view) problem. That's a discrepancy between the internal and public datasets. Basically we trained and tested on images with a fixed FoV (internal), but for more general public use, the released dataset has varying FoV. Re: the mask: @alexsax does the released dataset come with a mask?
Seconding @b0ku1 above. And no need for an explicit mask: just check for pixels where the depth is equal to (or very close to) the max value, 2^16-1 :) Finally, depth Euclidean is the distance from each pixel to the optical center. Depth z-buffer is something else (see the sup mat for the full description!).
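To make the two points above concrete, here is a minimal numpy sketch (the `slack` margin, the omission of any unit scaling, and the square-pinhole assumption in the conversion are my own assumptions, not taken from the released code):

```python
import numpy as np
from PIL import Image

MAX_UINT16 = 2**16 - 1  # hole pixels are stored at (or very near) this value

def load_depth_and_mask(path, slack=64):
    """Load a 16-bit depth PNG and build a validity mask.

    Pixels within `slack` of 2^16-1 are treated as mesh holes and excluded;
    `slack` is an assumed margin, not a constant from the repo.
    """
    depth_raw = np.asarray(Image.open(path), dtype=np.float32)
    mask = depth_raw < (MAX_UINT16 - slack)
    return depth_raw, mask

def masked_l1(pred, target, mask):
    """Mean absolute error over valid pixels only."""
    return np.abs(pred - target)[mask].mean()

def zbuffer_to_euclidean(z, fov_rads):
    """Convert z-buffer depth (distance along the optical axis) to euclidean
    distance from the optical center, assuming a square pinhole camera with
    the given field of view."""
    h, w = z.shape
    f = (w / 2.0) / np.tan(fov_rads / 2.0)   # focal length in pixels
    u = np.arange(w) - (w - 1) / 2.0         # horizontal offset from center
    v = np.arange(h) - (h - 1) / 2.0         # vertical offset from center
    uu, vv = np.meshgrid(u, v)
    ray_norm = np.sqrt(1.0 + (uu / f) ** 2 + (vv / f) ** 2)
    return z * ray_norm
```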
Now I can see the "full_plus", "full", "medium", and "small" split information for the whole dataset, but I can't find the FoV information for each image. Where should I get it? Also, do I need to drop the pixels with extremely high values (if I understand it right)?
The pose files :)
Yes, exactly.
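For reference, a minimal sketch of reading the FoV from a per-view pose/point_info JSON; the key name `field_of_view_rads` is an assumption about the released files, so check the keys of one file first:

```python
import json
import math

def read_fov_degrees(point_info_path):
    """Read the camera field of view from a per-view point_info JSON.

    The 'field_of_view_rads' key name is an assumption; print list(info)
    on one file to confirm what the released pose files actually contain.
    """
    with open(point_info_path) as f:
        info = json.load(f)
    return math.degrees(info["field_of_view_rads"])
```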