
testing pretrained model - depth #15

Open
emergencyd opened this issue Jan 19, 2019 · 8 comments

emergencyd commented Jan 19, 2019

[attached screenshot]

I'm using the pretrained model in folder "rgb2depth". I want to reproduce "loss = 0.35". Which data should I use?

I've tried the "depth_zbuffer" test data, but "l1_loss, loss_g, loss_d_real, loss_d_fake" are around "0.6,0.7,0.9,0.7".

I suppose I used the wrong data or the wrong loss... should I use the "depth_euclidean" data instead?

Thank you!


b0ku1 commented Jan 20, 2019

@alexsax could you please comment on which models we used for testing?


b0ku1 commented Jan 20, 2019

Also, did you use a mask to mask out the "bad" depth values? Basically, we extracted the depth values from a mesh, and since there are holes in the mesh, some depth values in the ground truth are too high (we masked these values out during training).


alexsax commented Jan 20, 2019

@b0ku1 that's a good point; it might explain the relatively high losses. @emergencyd we reported the L1 loss, so that's the one you should pay attention to.

One contributing factor might be that the models we released were trained on an internal set of images that were processed a bit differently from the released data. The internal set always has a FoV of 75 degrees, but the released data has a range of 45-75 degrees. The pretrained networks don't work as well on images with a narrow FoV, like those in the released set. You can verify this for yourself on the Taskonomy demo site.

@emergencyd do you notice that the losses are significantly better for large-FoV images?


b0ku1 commented Jan 21, 2019

@emergencyd changing to rgb-large won't fix the FoV (field of view) problem; that's a discrepancy between the internal and the public dataset. Basically, we trained and tested on images with a fixed FoV (internal), but for more general public use, the released dataset has varying FoV.

Re the mask: @alexsax, does the released dataset come with masks?


alexsax commented Jan 21, 2019

Seconding @b0ku1 above.

And no need for an explicit mask—just check for pixels where the depth is equal to (or very close to) the max value, 2^16-1 :)
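
For instance, a minimal numpy sketch of that check plus the masked L1 loss (here `img_t` is the raw 16-bit depth_zbuffer image and `pred` is the prediction resized to the same shape; both names are placeholders, not code from this repo):

    import numpy as np

    # Pixels at the max 16-bit value sit in mesh holes; give them zero weight.
    MAX_DEPTH = 2**16 - 1
    valid = (img_t < MAX_DEPTH).astype(np.float32)   # 1 = keep, 0 = mask out

    # Masked L1 loss: mean absolute error over the valid pixels only.
    l1_loss = np.sum(np.abs(pred - img_t) * valid) / np.sum(valid)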

Finally, depth Euclidean is the distance from each pixel to the optical center. Depth z-buffer is something else (see the supplementary material for the full description!).

emergencyd (Author) commented

Now I can see the "full_plus", "full", "medium", and "small" split information for the whole dataset, but I can't find the FoV information for each image. Where should I get it?

Also, do I need to drop the pixels with extremely high values? (if I understand it right)


alexsax commented Jan 22, 2019

> Now I can see the "full_plus", "full", "medium", and "small" split information for the whole dataset, but I can't find the FoV information for each image. Where should I get it?

The pose files :)

> Also, do I need to drop the pixels with extremely high values? (if I understand it right)

Yes, exactly.
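
If it helps, here is a rough sketch of reading the FoV from a pose/point-info JSON and filtering on it; the file path below is only illustrative, the key that matters is `field_of_view_rads`:

    import json

    # Illustrative path; point this at wherever the pose / point_info JSONs live.
    pose_path = 'point_info/point_0_view_0_domain_fixatedpose.json'
    with open(pose_path) as f:
        pose = json.load(f)

    fov = pose['field_of_view_rads']
    # The internal training set used a ~75-degree FoV, i.e. roughly 1.3 rad.
    if fov > 1.3:
        print('wide-FoV image, FoV = %.2f rad' % fov)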


emergencyd commented Jan 25, 2019

  1. According to the supplementary material, I use depth_zbuffer rather than depth_euclidean as my target depth map.

  2. Then I use the "field_of_view_rads" information to pick the images with FoV larger than 1.3 rad.

  3. Then I use the code below to process the target image and calculate the L1 loss:

        #####################
        #### target data ####
        #####################
        # Load the raw 16-bit depth_zbuffer image and build a validity mask:
        # pixels at the max value (2**16 - 1) are mesh holes and get weight 0.
        img_t = load_raw_image_center_crop(target_name, color=False)
        mask_filt = np.where(img_t >= 2**16 - 1, 0, 1)

        # Replace the hole pixels with the maximum valid depth.
        img_t[img_t >= 2**16 - 1] = 0
        img_t[img_t == 0] = np.max(img_t)

        # Apply the same target preprocessing as in the training config
        # (alternatively: img_t = load_ops.resize_image(img_t, [256, 256, 1])).
        img_t = cfg['target_preprocessing_fn'](img_t, **cfg['target_preprocessing_fn_kwargs'])
        img_t = img_t[np.newaxis, :]

        # Resize the mask to the network resolution and add a batch dimension
        # (alternatively: weight_mask = np.ones(np.shape(img_t)) for no masking).
        mask_filt = load_ops.resize_image(mask_filt, [256, 256, 1])
        weight_mask = mask_filt[np.newaxis, :]

        #####################
        ###### predict ######
        #####################
        predicted, representation, losses = training_runners['sess'].run(
            [m.decoder_output, m.encoder_output, m.losses],
            feed_dict={m.input_images: img, m.target_images: img_t, m.masks: weight_mask})

I noticed that there is a function "depth_single_image", so I tried it and calculated the loss again:

        # Post-process the prediction, then compute the masked L1 loss by hand.
        predicted = depth_single_image(predicted)
        diff = np.abs(predicted - img_t)
        diff[weight_mask == 0] = 0                 # ignore mesh-hole pixels
        l1_loss = np.sum(diff) / np.sum(weight_mask)

But the loss still doesn't seem right (around 0.15). This time, though, the generated prediction looks the same as the result on the demo website:
[attached prediction image]

I guess there is something wrong with my processing of the target images, and I'm quite confused now.

@alexsax
