
is alpha set wrong? #51

Open
XiaoYunZhou27 opened this issue Feb 2, 2021 · 1 comment

Comments

@XiaoYunZhou27

In the paper, alpha is said to be 0.99 at the beginning (when global_step is small) and 0.999 at the end (when global_step is large). However, in the code:

alpha = min(1 - 1 / (global_step + 1), alpha)

Following this, alpha starts at 0 when global_step is small and only ramps up to the configured alpha (set to 0.99 in the parameters) once global_step reaches 99. This seems different from what the paper presents, which would correspond to code like:

alpha = max(1 - 1 / (global_step + 1), alpha)

Does anyone else see an issue here?
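
For reference, here is a minimal sketch (not from this repository) comparing how the two formulas behave as global_step grows; the configured alpha of 0.99 is taken from the parameters mentioned above:

# Sketch (not from this repo): compare the two alpha schedules.
# Assumes alpha is configured as 0.99, as stated in the issue.

def alpha_current(global_step, alpha=0.99):
    # Code as written: ramps from 0 up to alpha, then stays capped at alpha.
    return min(1 - 1 / (global_step + 1), alpha)

def alpha_proposed(global_step, alpha=0.99):
    # max variant: starts at alpha, then keeps growing toward 1.
    return max(1 - 1 / (global_step + 1), alpha)

for step in (0, 9, 99, 999, 9999):
    print(step, alpha_current(step), alpha_proposed(step))

With min, the decay is near 0 early (the teacher closely tracks the student) and settles at 0.99 from step 99 onward; with max, the decay starts at 0.99 and grows toward 1 (about 0.999 around step 999), which is closer to the 0.99-then-0.999 behaviour the paper describes.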

@zyoungDL

I have the same confusion. What's more, alpha is a function of global_step, so when the batch size changes, the number of steps per epoch changes too. But the paper says alpha is tied to the ramp-up epochs.
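
To illustrate this point (a sketch with made-up dataset and batch sizes, not values from this repo): the step at which min(1 - 1 / (global_step + 1), 0.99) hits its cap is always step 99, so the epoch at which the ramp ends shifts with the batch size:

# Sketch with assumed numbers: the alpha ramp ends at a fixed step (99),
# so the epoch where it ends depends on how many steps one epoch takes.
num_samples = 1000  # hypothetical dataset size

for batch_size in (10, 50, 100):
    steps_per_epoch = num_samples // batch_size
    ramp_end_epoch = 99 / steps_per_epoch  # step 99 is where min(...) reaches 0.99
    print(batch_size, steps_per_epoch, round(ramp_end_epoch, 2))

With a batch size of 10 the ramp ends around epoch 1, but with a batch size of 100 it ends around epoch 10, so a schedule defined in steps does not stay aligned with one defined in ramp-up epochs.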
