reproducibility and domain classifier loss #1
Hi, I assume you were running the pre-training stage (only GA is applied). First, we observe similar behavior: the domain classifier loss quickly drops to a small value. Note that the training can still improve performance, just not as much as GA+CA does. Second, there is still some randomness in the training procedure due to adversarial learning. Here we provide the training log for Sim10k --> Cityscapes using the VGG backbone (only GA). As far as I can tell, the performance at the final iteration is quite similar. Hope the provided information is helpful! Best,
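For reference, here is a minimal sketch of the global-alignment (GA) adversarial objective being discussed: a per-FPN-level domain classifier trained through a gradient-reversal layer. The module and parameter names below are illustrative assumptions, not the repository's actual code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Flip (and scale) the gradient so the backbone learns to fool the classifier.
        return -ctx.lambd * grad_output, None

class DomainDiscriminator(nn.Module):
    """Per-level domain classifier used for global alignment (illustrative only)."""
    def __init__(self, in_channels=256, grl_lambda=1.0):
        super().__init__()
        self.grl_lambda = grl_lambda
        self.head = nn.Sequential(
            nn.Conv2d(in_channels, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 1, kernel_size=3, padding=1),
        )

    def forward(self, feature, is_target: bool):
        feature = GradReverse.apply(feature, self.grl_lambda)
        logits = self.head(feature)
        # Domain labels: 0 = source, 1 = target.
        labels = torch.full_like(logits, 1.0 if is_target else 0.0)
        return F.binary_cross_entropy_with_logits(logits, labels)

# Usage sketch: one adversarial loss per pyramid level (p3-p7 in the thread).
disc = DomainDiscriminator()
loss_source = disc(torch.randn(2, 256, 32, 32), is_target=False)
```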
Thank you for your answer. The numbers reported above are after GA+CA training, but I initialise the second-stage training with the weights at 20k iterations. I will try to initialise the second stage with the weights at 5k iterations. Could you please comment on the trend of the domain loss in your second-stage training? Thanks
Hi, I noticed two issues that can lead to a small loss value being shown in the log file.
Accordingly, the loss value appears smaller than it should be in the training log once it is rounded to four decimal places. For now, I believe the issue was caused by the implementation of the logger rather than the convergence of the training. Best,
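To illustrate the rounding point: a small but nonzero adversarial loss can print as 0.0000 once the logger formats it to four decimal places. The variable name below is just an example, not the repository's actual logger.

```python
loss_dis_p7 = 3.2e-5                       # small but nonzero adversarial loss
print(f"loss_dis_p7: {loss_dis_p7:.4f}")   # prints "loss_dis_p7: 0.0000"
print(f"loss_dis_p7: {loss_dis_p7:.6f}")   # prints "loss_dis_p7: 0.000032"
```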
Thanks for this update. Yes, I see that for the second stage all the adversarial losses don't go to zero (p7-p5). Could you please tell me how to run the code without enabling DA? Best,
Thanks for the pointers. Another question I have is how you decide which iteration's weights to take to initialize the second stage's training. Do you use some validation set?
In most of our experiments, the pretraining iteration was set to 5k or 10k, except that we set 2k for the KITTI dataset since it requires fewer training iterations. Cheng-Chun
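A minimal sketch of initializing the second (GA+CA) stage from a first-stage checkpoint, assuming maskrcnn-benchmark-style checkpoints that wrap the weights under a "model" key; the checkpoint path is a placeholder.

```python
import torch

def load_stage_one_weights(model, checkpoint_path="output/ga_pretrain/model_0005000.pth"):
    """Initialize stage-two (GA+CA) training from a stage-one (GA-only) checkpoint."""
    ckpt = torch.load(checkpoint_path, map_location="cpu")
    state_dict = ckpt.get("model", ckpt)  # checkpoints often wrap weights under "model"
    missing, unexpected = model.load_state_dict(state_dict, strict=False)
    return missing, unexpected
```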
Excuse me for following up on this issue. @chengchunhsu
Hi @andgitchang, thanks for bringing it up. Performing single-GPU training without adjusting the learning rate was simply a mistake. As for your question, I am not sure about the answer since I do not have a direct comparison between different learning rates. Best,
@chengchunhsu Thanks for your suggestion. I tried to reproduce VGG-16 cityscapes->foggy-cityscapes and ended up reaching AP 38.7 at iteration ~6000 (AP 38.3 at iteration 8000). The only difference is that the learning rate is 4x larger than in your script. It seems that a 4x larger learning rate somehow didn't blow up the adversarial training and even increased performance by a large margin.
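For context on the learning-rate discussion: a common heuristic (an assumption here, not something the repository states) is to scale the learning rate linearly with the effective batch size when moving between multi-GPU and single-GPU training. The reference numbers below are placeholders.

```python
reference_gpus = 4      # hypothetical multi-GPU setup the released LR was tuned for
reference_lr = 0.005    # placeholder value, not the repository's actual setting
my_gpus = 1

# Linear-scaling heuristic: shrink the LR in proportion to the effective batch size.
scaled_lr = reference_lr * my_gpus / reference_gpus
print(scaled_lr)        # 0.00125
```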
@chengchunhsu For sim10k->cityscapes without DA, for how many iterations do you train your network? Simply setting ... Best,
@vidit09 I didn't try the source-only experiment, since the proposed GA+CA method was trained from the ImageNet-pretrained backbone, not from the source-only model. Your modified trainer looks okay. Please also make sure DATASETS.TRAIN/TEST are set to sim10k/cityscapes accordingly. Otherwise, cloning the original FCOS repo may be the simplest way to reproduce/verify the source-only model.
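A hedged sketch of the DATASETS.TRAIN/TEST override mentioned above, using maskrcnn-benchmark's yacs-based config; the config file name and dataset keys are placeholders that must match entries registered in paths_catalog.py.

```python
from maskrcnn_benchmark.config import cfg

cfg.merge_from_file("configs/source_only_vgg16.yaml")  # placeholder config file
cfg.merge_from_list([
    "DATASETS.TRAIN", '("sim10k_trainval",)',           # placeholder dataset keys
    "DATASETS.TEST", '("cityscapes_car_val",)',
])
```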
Thanks, @andgitchang, for your clarifications, but I still don't get how the original FCOS repo will give me the source-only result, as it was not trained on Sim10k.
@vidit09 It's straightforward since both FCOS and EveryPixelMatters use maskrcnn-benchmark. You just need to repeat these steps in the FCOS repo, including paths_catalog and the configs.
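As a rough sketch of what repeating those steps could look like, assuming Sim10k annotations converted to COCO format, one would add an entry to maskrcnn-benchmark's paths_catalog.py; the key and paths below are placeholders, not the repository's actual entries.

```python
class DatasetCatalog:
    DATA_DIR = "datasets"
    DATASETS = {
        # ... existing entries ...
        # Key contains "coco" so DatasetCatalog.get() dispatches to COCODataset
        # (check the get() method in your copy of paths_catalog.py).
        "sim10k_trainval_cocostyle": {
            "img_dir": "sim10k/JPEGImages",
            "ann_file": "sim10k/annotations/sim10k_trainval_coco.json",
        },
    }
```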
@andgitchang I also need to know what learning rate is used and for how many iterations the network is trained.
Hi @andgitchang, it seems our training is not sensitive to the learning rate when a small batch size is applied. Please let me know if you find anything else. Cheng-Chun
Hi @vidit09, I cannot see the exact problem from your description. As mentioned by @andgitchang, running the clean FCOS code would be an easier way since our implementation and FCOS share the same code base. I hope this can help you locate the problem. Best,
Hi @vidit09, Sorry for the late reply as I was a bit busy this week.
Thanks for letting me know!
The selection of the training iteration is a bit tricky in domain adaptation, since we are not supposed to evaluate the model on the target dataset. Moreover, the randomness that occurs during training can also result in a performance margin on the target domain. For the source-only detector, the training iteration is taken from the original FCOS setting on the Cityscapes dataset.
Thanks, Cheng-Chun, for the clarifications.
Hi @chengchunhsu,
Hi @andgitchang, the code is now consistent with the equation in the paper. Cheng-Chun
I set the pretraining iterations to 2k, 5k, 10k, and 20k respectively for the KITTI->Cityscapes task. In my opinion, we can select the pretraining iteration according to the model's performance on the source domain (KITTI val set), since we cannot evaluate the model on the target dataset. Why do you think my reproduction results are so strange? It looks like the best pretraining iteration appears randomly. I have no idea about it.
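A rough sketch of the selection heuristic described above: evaluate each pretraining checkpoint on a held-out source validation split (never the target set) and keep the best one. The evaluation helper and checkpoint directory are hypothetical.

```python
import glob

def evaluate_on_source_val(checkpoint_path):
    """Hypothetical helper: run inference on the KITTI (source) val split and return mAP."""
    raise NotImplementedError

best_ckpt, best_map = None, -1.0
for path in sorted(glob.glob("output/ga_pretrain/model_*.pth")):  # placeholder checkpoint dir
    source_map = evaluate_on_source_val(path)
    if source_map > best_map:
        best_ckpt, best_map = path, source_map
print("selected pretraining checkpoint:", best_ckpt)
```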
|
Hi, I also get much higher source-only results on both Sim10k and KITTI. Have you reproduced similar results on a pure FCOS baseline? Thank you. By the way, I got a similar source-only result (about 26 to 27) on Cityscapes by setting the lambda of the discriminator loss to 0, while I still get very high results on the other two datasets: about 44 for KITTI and 47 for Sim10k. All experiments use ResNet-101.
Hello,
Thanks for providing the code. I was able to run the code for Sim10k --> Cityscapes and got scores lower than reported: mAP 24.8, mAP@0.5 42.6, mAP@0.75 24.4, mAP@S 5.3, mAP@M 27.3, mAP@L 51.1. One thing I noticed is that the domain classifier loss goes down to zero for all the pyramid scales after a few iterations. Could you please let me know what kind of trend you see for the domain classifier losses, as this could possibly be the reason I am seeing the lower scores?
Thanks