is alpha set wrong? #51

XiaoYunZhou27 · 2021-02-02T23:27:51Z

The paper says alpha should be 0.99 at the beginning of training (when global_step is small) and 0.999 at the end (when global_step is large). However, the code does this:

alpha = min(1 - 1 / (global_step + 1), alpha)

With this expression, alpha is 0 at global_step 0, stays below 0.99 while global_step is small, and equals the configured alpha (set to 0.99 in the parameters) once global_step >= 99. The code seems to differ from what the paper presents. The paper suggests code like:

alpha = max(1 - 1 / (global_step + 1), alpha)
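To make the comparison concrete, here is a minimal sketch that evaluates both expressions at a few steps. The function names `ema_alpha_min` and `ema_alpha_max` are hypothetical and are not taken from the repository; only the two one-line expressions come from the lines above, and the default of 0.99 is the parameter value mentioned in this post.

```python
def ema_alpha_min(global_step: int, alpha: float = 0.99) -> float:
    """Expression currently in the code: ramps up from 0, capped at alpha."""
    return min(1 - 1 / (global_step + 1), alpha)


def ema_alpha_max(global_step: int, alpha: float = 0.99) -> float:
    """Variant proposed in this issue: alpha acts as a floor, not a cap."""
    return max(1 - 1 / (global_step + 1), alpha)


if __name__ == "__main__":
    # Print step, min-variant value, max-variant value side by side.
    for step in (0, 1, 9, 99, 999, 9999):
        print(step, round(ema_alpha_min(step), 4), round(ema_alpha_max(step), 4))
```

With the `min` form the value never exceeds 0.99. With the `max` form it never drops below 0.99 and keeps approaching 1 as global_step grows, so it passes through 0.999 at step 999 rather than stopping there.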

Does anyone else see an issue here?

zyoungDL · 2021-05-29T13:44:43Z

I have the same confusion. What's more, alpha is a function of global_step, so when the batch size changes, the number of steps per epoch changes too. But the paper says alpha is tied to the ramp-up epochs.
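A rough sketch of the batch-size point, under stated assumptions: the dataset size of 50000 and the batch sizes are made-up illustration values, and the helper names are hypothetical. It only uses the `min(1 - 1 / (global_step + 1), alpha)` expression quoted in the issue, and shows how far into the first epoch the ramp ends for different batch sizes.

```python
import math


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Number of optimizer steps in one pass over the data."""
    return math.ceil(num_examples / batch_size)


def saturation_step(alpha: float = 0.99) -> int:
    """Smallest global_step at which 1 - 1/(global_step + 1) reaches alpha."""
    step = 0
    # Small tolerance so floating-point rounding does not shift the result.
    while 1 - 1 / (step + 1) < alpha - 1e-12:
        step += 1
    return step


def saturation_epochs(num_examples: int, batch_size: int, alpha: float = 0.99) -> float:
    """Fraction of an epoch that has elapsed when the ramp hits alpha."""
    return saturation_step(alpha) / steps_per_epoch(num_examples, batch_size)


if __name__ == "__main__":
    num_examples = 50000  # assumed dataset size, for illustration only
    for batch_size in (64, 128, 256):
        print(batch_size, round(saturation_epochs(num_examples, batch_size), 3))
```

The ramp always ends at the same step count, so measured in epochs it ends later as the batch size grows. A schedule defined in epochs would not shift this way.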
