Summary of differences between paper and code #78

Open
t-taniai opened this issue Mar 14, 2020 · 3 comments
@t-taniai

Thank you very much for your interesting work and useful code!

When I was reading the code, I noticed several differences between the descriptions in the paper and the implementations in the code. In the following, I try to summarize those differences.

  • Architecture:
    • Paper: GANet-15 (15x 3D-Convs and 5x GA layers from GitHub readme)
    • Code: GANet-Deep (22x 3D-Convs and 9x GA layers from GitHub readme)
    • This is somewhat confusing to me. The network architecture in the arXiv paper shows that GANet-15 has 3 SGA and 2 LGA layers, so I guess "5 GA layers" above means 3 SGA + 2 LGA layers. This makes sense because the GANet-Deep code has 7 SGA and 2 LGA layers, which matches the 9 GA layers. However, the GANet-11 code uses 4 SGA and 2 LGA layers (6 GA layers in total), so GANet-11 somehow uses more SGA layers than GANet-15... (?)
  • Loss y = f(|d - d'|) (see the sketch after this list)
    • Paper: Smooth L1: y = 0.5*x**2 (x < 1), x - 0.5 (x >= 1)
    • Code: y = x**2/a (x < a), 2x - (x-a)**2/(2b) - a (a <= x < a+b), x + b/2 (x >= a+b)
      • a = 3 and b = 2
  • Batch size & Crop size
    • Paper: 16 & 240x576
    • Code: 16 & 240x528 (stage 1) and 8 & 240x1248 (stage 2)
  • Finetuning strategy
    • Paper: 640 epochs (lr=0.001 for the first 300 epochs, then lr=0.0001 for the rest)
    • Code: 800 (stage 1) + 8 (stage 2) epochs (lr=0.001 for the first 400 epochs, then lr=0.0001 for the rest)
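
For concreteness, here is a minimal PyTorch sketch of the code's piecewise loss as I read it from the formulas above (the function name, the mean reduction, and the default parameters are my own assumptions, not copied from the repository):

```python
import torch

def piecewise_loss(d_pred, d_gt, a=3.0, b=2.0):
    # x = |d - d'|
    x = torch.abs(d_pred - d_gt)
    # y = x**2 / a                      for x < a
    #     2x - (x - a)**2 / (2b) - a    for a <= x < a + b
    #     x + b/2                       for x >= a + b
    y = torch.where(
        x < a,
        x ** 2 / a,
        torch.where(
            x < a + b,
            2 * x - (x - a) ** 2 / (2 * b) - a,
            x + b / 2,
        ),
    )
    return y.mean()
```

The three pieces join with matching values and slopes at x = a and x = a + b, so the loss stays smooth across the boundaries.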

It would be very helpful if the authors could confirm my summary and provide more information if there are any additional differences.

Thank you very much.

@musuoliniao

Hello, I have the same question, and I find that the crop_size of the pretrained model is 240x624. As far as I can tell, the crop_size somehow affects the accuracy. My question is: since I want to use the pretrained model, is it better to keep its crop_size while finetuning?

@feihuzhang
Owner

Thanks for the summary.
Some minor corrections:

  • Architecture:

    • Paper: GANet-15 (15x 3D-Convs and 5x GA layers from GitHub readme)
    • Code: GANet-Deep (22x 3D-Convs and 9x GA layers from GitHub readme)
    • GANet-15 is similar to GANet-Deep but with fewer layers (removing 4 SGA layers at the low resolution and some 3D conv layers).
      GANet-11 does not use the hourglass architecture.
  • Loss y=f(|d-d'|)

    • Paper: Smooth L1: y = 0.5*x**2 (x < 1), x - 0.5 (x >= 1)
    • Code: y = x**2/a (x < a), 2x - (x-a)**2/(2b) - a (a <= x < a+b), x + b/2 (x >= a+b)
      • a = 3 and b = 2
  • The new loss (the code uses a mixed L2 and threshold loss) aims to get better threshold error rates in the benchmark evaluations.

  • Batch size & Crop size

    • Paper: 16 & 240x576
    • Code: 16 & 240x528 (stage 1) and 8 & 240x1248 (stage 2)
    • Released model: 8 & 240x624 and 4 & 240x1248, using four GPUs (22G)
  • Finetuning strategy

    • Paper: 640 epochs (lr=0.001 for the first 300 epochs, then lr=0.0001 for the rest)
    • Code: 800 (stage 1) + 8 (stage 2) epochs (lr=0.001 for the first 400 epochs, then lr=0.0001 for the rest)
    • Released model: 800 epochs.

@GoodStudyDayUpUp

Hi Feihu,
Another difference worth mentioning:
In 'GANet_deep.py' line 246 (class DispAgg), you use F.normalize() at the end instead of softmax().
This also differs from the paper, which uses 'soft argmin' for the disparity regression.
Could you please explain briefly?
I found that with your code (using F.normalize()), the predicted disparity map can sometimes contain negative disparity values, because the probability of certain disparity candidates becomes negative after F.normalize().
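
To illustrate what I mean, here is a minimal sketch of the two regression variants (illustrative only; the tensor shapes, the p=1 normalization, and the function names are my assumptions, not code copied from GANet_deep.py):

```python
import torch
import torch.nn.functional as F

def soft_argmin(cost, max_disp):
    # Soft argmin as described in the paper: softmax over the disparity
    # dimension gives non-negative weights that sum to 1, so the expected
    # disparity always lies in [0, max_disp - 1].
    prob = F.softmax(cost, dim=1)                       # cost: [B, D, H, W]
    disp = torch.arange(max_disp, dtype=cost.dtype, device=cost.device)
    return torch.sum(prob * disp.view(1, -1, 1, 1), dim=1)

def normalized_regression(cost, max_disp):
    # Variant with L1 normalization instead of softmax. F.normalize only
    # divides by the L1 norm along dim=1, so negative cost entries stay
    # negative and the weighted sum can produce negative disparities.
    weight = F.normalize(cost, p=1, dim=1)
    disp = torch.arange(max_disp, dtype=cost.dtype, device=cost.device)
    return torch.sum(weight * disp.view(1, -1, 1, 1), dim=1)
```
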
Thanks a lot!
