🐛 about learning-rate #59

Open
hanruisong00 opened this issue Oct 25, 2023 · 2 comments · May be fixed by #60

Comments

@hanruisong00

Hello, I studied your paper and found it very inspiring for my own work. I still have a question I would like to ask you.

My question is about the learning rate used for training. As I understand your code, the loss is the average of the losses of all experts. With four experts, each expert's loss is effectively divided by four, so the gradient each expert receives during backpropagation is also divided by four. Should the initial learning rate therefore be set four times larger than for a single model?
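The scaling argument above can be checked numerically with a small sketch. It is not taken from the repository: the scalar "experts", the squared-error loss, and every name below are hypothetical stand-ins chosen for illustration, and the equivalence of the two update rules is only shown for plain SGD without momentum or weight decay.

```python
# Toy setup (hypothetical, not the library's code): M independent experts,
# each a single scalar weight w predicting w * x, trained with squared error.

def expert_loss(w, x, y):
    """Loss of one expert taken on its own."""
    return (w * x - y) ** 2

def expert_grad(w, x, y):
    """Analytic gradient of one expert's own loss with respect to its weight."""
    return 2.0 * x * (w * x - y)

def mean_loss(ws, x, y):
    """Loss averaged over all experts, as described in the question."""
    return sum(expert_loss(w, x, y) for w in ws) / len(ws)

def mean_loss_grad(ws, x, y, eps=1e-6):
    """Central finite-difference gradient of the averaged loss, per weight."""
    grads = []
    for i in range(len(ws)):
        up, down = list(ws), list(ws)
        up[i] += eps
        down[i] -= eps
        grads.append((mean_loss(up, x, y) - mean_loss(down, x, y)) / (2 * eps))
    return grads

M = 4
weights = [0.2, -1.0, 2.0, 0.1]
x, y = 3.0, 1.5

standalone = [expert_grad(w, x, y) for w in weights]
averaged = mean_loss_grad(weights, x, y)

# Each expert's gradient under the averaged loss is 1/M of its standalone gradient.
ratios = [a / s for a, s in zip(averaged, standalone)]

# Plain SGD: lr on the standalone loss matches M * lr on the averaged loss.
lr = 0.01
single_step = [w - lr * g for w, g in zip(weights, standalone)]
packed_step = [w - (M * lr) * g for w, g in zip(weights, averaged)]
```

In this toy setting, every entry of `ratios` comes out at roughly `1 / M`, and `single_step` matches `packed_step` up to floating-point error. Whether the same factor is the right correction in practice depends on the optimizer and on how the loss is reduced in the real training loop.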

@o-laurent
Contributor

o-laurent commented Oct 26, 2023

Hi, that's why we love open source!
Thank you very much for this very relevant remark. After a discussion with @alafage, we are confident that you are right, and this may be a difference from Deep Ensembles that we had not anticipated. The first experiment I ran (R18-C10) shows improved performance for Packed-Ensembles, but this remains to be thoroughly verified. I'll start a branch for this, which raises some technical questions (there may be a more user-friendly way to do this than multiplying the lr). Would you like to be a co-author of the first commit with @alafage and me? To do so, I'd just need your email (potentially the "noreply" one from GitHub if you don't want to share your personal address).

@hanruisong00
Author

Thank you, it would be my pleasure. My email is [email protected]. If you have any questions, or if I can help, feel free to contact me.

o-laurent added a commit that referenced this issue Oct 27, 2023
Co-authored-by: hanruisong00 <[email protected]>
Co-authored-by: Adrien Lafage <[email protected]>
@o-laurent o-laurent linked a pull request Oct 27, 2023 that will close this issue
@o-laurent o-laurent changed the title about learning-rate 🐛 about learning-rate Oct 27, 2023