Software environment:
- paddlepaddle: develop
- paddlepaddle-gpu: develop
- paddlenlp: develop
Pretrain loss drops sharply in the last data "epoch"
In run_pretrain.py, the train_sampler has shuffle set to False, so dataset shuffling is handled entirely inside causal_dataset.
When training requires more samples than the dataset can provide, the dataset is reused; each reuse is called a data epoch. However, the number of samples required for training is not always an integer multiple of the dataset size, so the last data epoch may receive special handling:
PaddleNLP/paddlenlp/data/causal_dataset.py
Lines 502 to 506 in 9f237b4
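To make the mechanism concrete, here is a minimal sketch of a Megatron-style shuffle-index build with a separately shuffled last epoch. The function `build_shuffle_idx` and its parameters are hypothetical simplifications for illustration, not PaddleNLP's actual causal_dataset code:

```python
import numpy as np

def build_shuffle_idx(num_required, num_dataset, seed=1234,
                      separate_last_epoch=True):
    """Hypothetical sketch (not the real causal_dataset implementation).

    num_required: total samples the training run will consume
    num_dataset:  samples one pass over the dataset provides
    """
    rng = np.random.RandomState(seed)
    num_epochs = -(-num_required // num_dataset)  # ceil division

    if not separate_last_epoch:
        # Globally shuffle all data epochs together.
        idx = np.arange(num_epochs * num_dataset)
        rng.shuffle(idx)
        return idx[:num_required] % num_dataset

    # Megatron-style: shuffle the first num_epochs - 1 full epochs together,
    # then shuffle the (possibly partial) last epoch on its own. Samples from
    # the last epoch are therefore only drawn at the very end of training,
    # which is where a sudden change in loss can appear.
    full = np.arange((num_epochs - 1) * num_dataset)
    rng.shuffle(full)
    last = np.arange(num_dataset)
    rng.shuffle(last)
    idx = np.concatenate([full, last])
    return idx[:num_required] % num_dataset
```

With `separate_last_epoch=True`, every index produced after position `(num_epochs - 1) * num_dataset` comes exclusively from the final data epoch; with `False`, the leftover epoch is mixed uniformly into the whole schedule.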
Reference PR:
Hi, we'd suggest modifying this part of the shuffle code so that all epochs are shuffled together; that should avoid the sudden change.
You could ask on the Megatron side; I'm not sure why Megatron requires that the last epoch not be globally shuffled:
"Last epoch should not be globally shuffled"
DesmonDay