Questions on using Inception_ResNet_v1 and test accuracy #8

Open · cptay opened this issue Sep 21, 2017 · 4 comments

@cptay commented Sep 21, 2017

Hi, I am new to deep learning and thus may not understand your paper fully; I hope that's all right with you. I tried to implement batch_hard using Inception_ResNet_v1, trained from scratch on the Market-1501 dataset. The rank-1 CMC is only about 70%. I did not implement re-ranking or the augmented test. Do you think this model is able to get a rank-1 CMC above 80%?
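For reference, the batch-hard loss I am trying to implement looks roughly like the sketch below (PyTorch-style; the 0.2 hard margin and the use of plain Euclidean distances are my assumptions and may differ from what you used):

```python
import torch

def batch_hard_triplet_loss(embeddings, labels, margin=0.2):
    """For every anchor, take the hardest positive and hardest negative in the batch."""
    # Pairwise Euclidean distances between all embeddings in the batch, shape (B, B).
    dists = torch.cdist(embeddings, embeddings, p=2)

    same_id = labels.unsqueeze(0) == labels.unsqueeze(1)                       # (B, B)
    not_self = ~torch.eye(len(labels), dtype=torch.bool, device=labels.device)

    # Hardest positive: the farthest sample sharing the anchor's identity.
    hardest_pos = (dists * (same_id & not_self).float()).max(dim=1).values
    # Hardest negative: the closest sample with a different identity.
    hardest_neg = dists.masked_fill(same_id, float("inf")).min(dim=1).values

    # Hinge on the margin; a soft-margin (softplus) variant would also be possible.
    return torch.relu(hardest_pos - hardest_neg + margin).mean()
```

I use the usual PK sampling (P identities × K images per batch), so every anchor has at least one positive and one negative in the batch.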

The second problem I faced is that the test results fluctuate a lot: the rank-1 value can range from 60% to 70%. Can you shed some light on the test strategy, or point me to papers online? I am using 100 identities to verify the trained model.

Thanks!

@lucasb-eyer (Member)

Hi and welcome to deep learning 😄

Neither @Pandoro nor I have ever used Inception-ResNet, so we can't really say, but I'd expect it to perform similarly to ResNet, so you should definitely be able to get it to around or above 80%, IMO.

Since you are new: contrary to what many papers would have you believe, the single most important thing to tune is the learning rate. It is possible that you'll need a different learning rate than us because you're using a different type of model.

The next question is why you are only using 100 identities to verify the model. The Market-1501 dataset comes with many, many more test identities and a standard split. Importantly, rank-1 values are not comparable across differently-sized test sets: rank-1 gets easier as the gallery gets smaller. I highly recommend you use the standard split and the standard evaluation code, so that you can actually compare your results to published ones. Once you reach performance similar to current papers, you can switch to training on nearly everything for actual deployment (if you don't want to report results in papers).
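To make the gallery-size point concrete, a bare-bones rank-1 computation on a fixed query/gallery split looks roughly like this (a NumPy sketch only; the real Market-1501 protocol additionally excludes same-camera matches of the same identity and reports mAP, so please use the standard evaluation code for reportable numbers):

```python
import numpy as np

def rank1(query_emb, query_pids, gallery_emb, gallery_pids):
    # Euclidean distance from every query to every gallery image, shape (Q, G).
    dists = np.linalg.norm(query_emb[:, None, :] - gallery_emb[None, :, :], axis=-1)
    # Identity of the nearest gallery image for each query.
    best_pid = gallery_pids[np.argmin(dists, axis=1)]
    # Fraction of queries whose top-1 match has the correct identity.
    return np.mean(best_pid == query_pids)
```

Shrinking the gallery to 100 identities means the argmin has far fewer distractors to choose from, which is exactly why the numbers come out both higher and noisier.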

Unless when you say "test" you actually mean "validate"? For validation, it is correct to split off a small part of the training set and use it only to find what works best, but not to compare against published results.

Finally, our validation results also fluctuated quite a bit (although not as much as you report); this is usually dealt with using a learning-rate decay schedule. Typically, once the learning rate starts to decay, the scores "settle" at the higher end of the fluctuations, and the more it decays, the more stable they become. We had barely any fluctuation (less than 1%) once the learning-rate decay kicked in.
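For illustration, the kind of schedule I mean looks roughly like the following; the base rate and the start/end steps below are placeholders, not necessarily the exact values we used:

```python
def learning_rate(step, base_lr=3e-4, decay_start=15000, decay_end=25000, final_factor=1e-3):
    """Keep the rate constant, then decay it exponentially to base_lr * final_factor."""
    if step < decay_start:
        return base_lr
    # Exponential decay between decay_start and decay_end, constant afterwards.
    progress = min((step - decay_start) / (decay_end - decay_start), 1.0)
    return base_lr * final_factor ** progress
```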

Hope these answers help you understand things better!

@cptay (Author) commented Sep 21, 2017 via email

@Pandoro (Member) commented Oct 5, 2017

Hi there!

Sorry for my late reply, I just got back from travelling abroad and didn't have time to check my email before.

Generally speaking, you always tune the complete network unless it is specifically mentioned that only a part of it is tuned. So, as usual when starting from a pretrained network, we also always tuned all of the network's parameters, not just the newly added last layers. I hope you already tried this at some point and didn't pull your hair out. ;)
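As a rough illustration only (this is not our actual code, and the module names are just stand-ins), the difference in PyTorch terms would be something like:

```python
import torch
import torch.nn as nn

# Stand-ins for a pretrained backbone and the newly added embedding layers.
backbone = nn.Sequential(nn.Conv2d(3, 64, 3), nn.ReLU(), nn.AdaptiveAvgPool2d(1), nn.Flatten())
head = nn.Linear(64, 128)
model = nn.Sequential(backbone, head)

# What we do: fine-tune everything, i.e. all parameters receive gradients.
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)

# Tuning only the new layers (not what we recommend) would instead freeze the backbone:
# for p in backbone.parameters():
#     p.requires_grad = False
# optimizer = torch.optim.Adam(head.parameters(), lr=3e-4)
```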

@cptay (Author) commented Oct 5, 2017 via email
