
[Badcase]: Fine-tuning 2.0 with reasoning data succeeds, but 2.5 fails #1085

Open
4 tasks done
UESTCthb opened this issue Nov 18, 2024 · 1 comment
Comments

@UESTCthb

Model Series

Qwen2.5

What are the models used?

Qwen2.5-72B-Instruct

What is the scenario where the problem happened?

Fine-tuning Qwen2.5-72B-Instruct with reasoning data; the model fails to converge.

Is this badcase known and can it be solved using available techniques?

  • I have followed the GitHub README.
  • I have checked the Qwen documentation and cannot find a solution there.
  • I have checked the documentation of the related framework and cannot find useful information.
  • I have searched the issues and there is not a similar one.

Information about environment

GPU: H100
Dataset: KingNish/reasoning-base-20k
max_length: int = 3000
batch_size: int = 1
gradient_accumulation_steps: int = 16
log_iter: int = 10
max_lr: float = 2e-3
min_lr: float = 2e-4
warmup_steps: int = 1000
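For context, a minimal sketch of the learning-rate schedule these hyperparameters suggest (linear warmup to max_lr, then cosine decay toward min_lr). The actual scheduler code is not included in the issue, so the function below is an assumption rather than the reporter's implementation:

```python
import math

max_lr: float = 2e-3
min_lr: float = 2e-4
warmup_steps: int = 1000

def lr_at(step: int, total_steps: int) -> float:
    """Hypothetical schedule: linear warmup, then cosine decay to min_lr."""
    if step < warmup_steps:
        # Linear warmup from 0 up to max_lr.
        return max_lr * step / warmup_steps
    # Cosine decay from max_lr down to min_lr over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```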

Description

Steps to reproduce

This happens to Qwen2.5-72B-Instruct.
The badcase can be reproduced with the following steps:
max_length:int = 3000
batch_size:int = 1
gradient_accumulation_steps:int = 16
log_iter:int = 10
max_lr:float = 2e-3
min_lr:float = 2e-4
warmup_steps:int = 1000
Fine-tune a reasoning model with the parameters above. Dataset: KingNish/reasoning-base-20k. The same code and parameters work on Qwen2.0-72B-Instruct, but Qwen2.5-72B-Instruct fails to converge, and an error is raised in the backward pass.
In addition, Qwen2.0-72B-Instruct also fails to converge if the learning rate is lowered by an order of magnitude.
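The training script itself is not included in the issue. If the setup were approximated with Hugging Face transformers, the reported hyperparameters might map to something like the sketch below; the framework, argument names, scheduler choice, and precision are assumptions for illustration only:

```python
from transformers import TrainingArguments

# Hypothetical mapping of the reported hyperparameters; the original
# training code is not shown in the issue.
args = TrainingArguments(
    output_dir="qwen2.5-72b-reasoning-sft",
    per_device_train_batch_size=1,    # batch_size = 1
    gradient_accumulation_steps=16,   # gradient_accumulation_steps = 16
    learning_rate=2e-3,               # max_lr = 2e-3
    warmup_steps=1000,                # warmup_steps = 1000
    lr_scheduler_type="cosine",       # assumed decay toward min_lr = 2e-4
    logging_steps=10,                 # log_iter = 10
    bf16=True,                        # assumed precision on H100
)
```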

@jklj077
Collaborator

jklj077 commented Nov 19, 2024

What's the error in the backward pass?
